Causal Inference - World Bank Group

CAUSAL INFERENCE
Shwetlena Sabarwal
Africa Program for Education Impact Evaluation
Accra, Ghana, May 2010
Motivation


The goal of any evaluation is to estimate the causal
effect of intervention X on outcome Y.
Example: does an education intervention improve
test scores (learning)?
  - Reducing class size
  - Teacher training
  - In-school nutrition
Causation is not correlation!

Any two variables (X and Y) can move together:
1. Male teachers & academic performance of students.
2. Health and income.

But they may have nothing to do with each other.

Other explanations?
Evaluation problem: Potential Outcomes Approach

Ideal way to evaluate the impact of an intervention:
  - observe agent in and out of program, at a point in time.

But, think about the only way in which we can evaluate
the impacts of an intervention:
  - observe agent in or out of program, at any point in time.
How to assess causality?


Let Y = outcome of interest (test score)
    P = participation in program
      = 1 if in
      = 0 if out
Formally, program impact is:
    α = (Y | P=1) - (Y | P=0)
where (Y | P=1) is the outcome with the program and (Y | P=0) is the
outcome without the program.
Program impact: difference in outcomes for individuals in and
out of program.
Another Way to Think of Evaluation Problem

The problem we face is that:
  - (Y | P=0) is not observed for program participants.
  - (Y | P=1) is not observed for non-participants.
Missing Data Problem:
  - Counterfactual not observed.
  - What would have happened to agent without the intervention?
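The missing data problem can be made concrete with a small simulation. The sketch below is illustrative only: the agents, the score distribution, and the +5 program effect are all assumptions made up for the example, not figures from any evaluation. Each simulated agent has both potential outcomes, but the `observe` function (a hypothetical helper) reveals only the one that matches the agent's participation status.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical simulated agents: each has BOTH potential outcomes,
# which is never the case in real data.
agents = []
for i in range(5):
    y0 = random.gauss(50, 10)  # assumed test score without the program
    y1 = y0 + 5                # assumed test score with the program (+5 effect)
    p = random.randint(0, 1)   # participation indicator P
    agents.append({"id": i, "y0": y0, "y1": y1, "P": p})


def observe(agent):
    """Return what an evaluator actually sees: one outcome, the other missing."""
    if agent["P"] == 1:
        return {"id": agent["id"], "P": 1, "Y|P=1": agent["y1"], "Y|P=0": None}
    return {"id": agent["id"], "P": 0, "Y|P=1": None, "Y|P=0": agent["y0"]}


observed = [observe(a) for a in agents]
for row in observed:
    print(row)
```

Every printed row has exactly one `None`: that missing entry is the counterfactual for that agent.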
Solving the evaluation problem

Generate the counterfactual
  - find a control or comparison observation for agent facing the intervention.

Criteria for selecting comparison observation:
1. Observationally similar, at baseline (and after
intervention).
2. Face same contemporaneous “shocks” as the
treatment group.
“Counterfeit” Counterfactuals
1. Before and after:
  - Same individual before the treatment
2. Non-Participants:
  - Those who choose not to enroll in program
  - Those who were not offered the program
“Counterfeit” Counterfactual Number 1: Before and After

Consider how you might evaluate an agricultural
assistance program.
  - Suppose the program offers free/subsidized fertilizer.
  - Compare rice yields before and after.

Q: If you find no change in rice yield, can you
conclude the program failed?
  - What else changed? Drought? Lots of rainfall?
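A small numerical sketch shows how a before-and-after comparison can report "no change" even when the program works. All numbers here are hypothetical (yields in tons per hectare, a +0.5 program effect, a -0.5 drought effect), and `before_after_estimate` is an illustrative helper, not part of any library.

```python
def before_after_estimate(yield_before, yield_after):
    """Before-and-after 'impact': the change in the outcome over time."""
    return yield_after - yield_before


# Hypothetical values, assumed for illustration only.
yield_before = 4.0          # rice yield before the program
true_program_effect = 0.5   # assumed gain from subsidized fertilizer
drought_effect = -0.5       # assumed loss from a drought in the same season

# What we observe after the program: both forces act at once.
yield_after = yield_before + true_program_effect + drought_effect

# What would have happened without the program (never observed in practice).
counterfactual_after = yield_before + drought_effect

naive = before_after_estimate(yield_before, yield_after)
true_impact = yield_after - counterfactual_after

print("before-and-after estimate:", naive)
print("impact against the true counterfactual:", true_impact)
```

Under these assumptions the before-and-after estimate is zero while the impact against the true counterfactual is positive, because the drought is attributed to the program.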
Scholarship Program and School Enrollment, Before and After

Ultimate goal is to estimate
    α = (Y_{i,t} | P=1) - (Y_{i,t} | P=0)
First, estimate the change for treated individuals:
    "A-O" = (Y_{i,t} | P=1) - (Y_{i,t-1} | P=1)
Second, estimate the counterfactual change:
    "B-O" = (Y_{i,t} | P=0) - (Y_{i,t-1} | P=0)
“Impact” = A-B
[Figure: outcome Y over time, before (t-1) and after (t), with labeled points O, A and B and the estimated impact α’.]
Scholarship Program and School Enrollment, Before and After

But B may misrepresent the true counterfactual, so "A-B" may
misstate the impact.
  - Suppose C is the correct counterfactual.
  - Here, the impact of the intervention is "A-C".
[Figure: outcome Y over time, before (t-1) and after (t), with labeled points O, A, B and C and the impact α’’.]
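The gap between the two estimates can be written out as a one-line identity. This is a sketch under one assumption about notation: that α’ in the first figure denotes the before-and-after estimate A-B and α’’ in the second denotes the impact A-C against the correct counterfactual (the figure labels suggest this, but the slides do not define the symbols).

```latex
\[
\underbrace{A - B}_{\alpha'} \;=\; \underbrace{A - C}_{\alpha''} \;+\; (C - B)
\]
```

Read this way, the before-and-after estimate equals the true impact only when C = B, that is, when the assumed counterfactual is the correct one. Any change that would have happened anyway shows up in the term C - B.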
“Counterfeit” Counterfactual Number 2: Non-Participants

Compare non-participants to participants

Counterfactual: non-participant outcomes



Impact estimate:
    α_i = (Y_{i,t} | P=1) - (Y_{j,t} | P=0)
Assumption:
    (Y_{j,t} | P=0) = (Y_{i,t} | P=0)
Issue: why did the j’s not participate?
Non-participants Example 1: Job Training and Employment

Compare employment and earnings of individuals who
sign up for training to those who do not.
Who signs up?
  - Those who are most likely to benefit, i.e. those with more ability
  - They would have had higher earnings than non-participants even without job training

Poor estimate of counterfactual
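The job training example can be sketched as a simulation. Everything in it is an assumption made for illustration: the earnings equation, the sign-up rule (higher ability makes sign-up more likely), and the true training effect of 2.0. None of these values come from a real program.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the sketch is reproducible

TRUE_EFFECT = 2.0  # assumed earnings gain caused by training

people = []
for _ in range(10_000):
    ability = random.gauss(0, 1)
    # Assumption: higher-ability people are more likely to sign up.
    signs_up = ability + random.gauss(0, 1) > 0
    # Earnings the person would have WITHOUT training (depends on ability).
    earnings_without = 10 + 3 * ability + random.gauss(0, 1)
    earnings = earnings_without + (TRUE_EFFECT if signs_up else 0.0)
    people.append((signs_up, earnings, earnings_without))

participants = [p for p in people if p[0]]
non_participants = [p for p in people if not p[0]]

# What an evaluator can compute: participants vs. non-participants.
naive_estimate = (mean(e for _, e, _ in participants)
                  - mean(e for _, e, _ in non_participants))

# Only computable in a simulation: how the two groups would differ
# even if nobody had been trained.
selection_bias = (mean(w for _, _, w in participants)
                  - mean(w for _, _, w in non_participants))

print("naive estimate:", round(naive_estimate, 2))
print("true effect:   ", TRUE_EFFECT)
print("selection bias:", round(selection_bias, 2))
```

In this setup the naive comparison overstates the effect, because participants would have earned more than non-participants even without training.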
Non-participants Example 2: Health Insurance and Demand for Medical Care

Compare health care utilization (# doctor visits) of
those who got insurance to those who did not.
But, who buys insurance?
  - Those who expect large medical expenditures (unhealthy)
Those who do not buy insurance have less need for
medical care.
  - Poor estimate of counterfactual
The problem is selection bias.

Selection bias: People choose to participate in a
program for specific reasons.
A problem occurs when the reasons for participation are
related to the outcome of interest:
  - Job Training: ability and earnings
  - Health Insurance: health status and medical-care utilization.

Cannot separately identify impact of the program
from these other factors/reasons
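The identification problem can be written as a decomposition. The notation below is an assumption introduced for this sketch and does not appear in the slides: Y(1) and Y(0) denote an individual's outcome with and without the program, and E[· | P] is an average over the group with participation status P.

```latex
\[
E[Y \mid P=1] - E[Y \mid P=0]
  = \underbrace{E[\,Y(1) - Y(0) \mid P=1\,]}_{\text{impact on participants}}
  + \underbrace{E[\,Y(0) \mid P=1\,] - E[\,Y(0) \mid P=0\,]}_{\text{selection bias}}
\]
```

The identity follows from adding and subtracting E[Y(0) | P=1]. The selection-bias term is zero only if participants and non-participants would have had the same average outcome without the program.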
Need to know…

Know all the reasons why someone gets the program
and others do not:
  - reasons why individuals are in the treatment versus control group
If the reasons are correlated with the outcome:
  - cannot identify/separate program impact from other explanations of differences in outcomes
Possible Solutions…

We need to understand the data generation process
  - How beneficiaries are selected and how benefits are assigned

Guarantee comparability of treatment and control
groups, so that the ONLY difference not accounted for is the
intervention.
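As one illustration of what comparability buys, the sketch below assigns a hypothetical training program by coin flip, so that assignment is unrelated to ability. Random assignment is used here only as an example of an assignment rule that makes the groups comparable; it is not named in this section. The earnings equation and the true effect of 2.0 are assumptions.

```python
import random
from statistics import mean

random.seed(2)  # fixed seed so the sketch is reproducible

TRUE_EFFECT = 2.0  # assumed earnings gain caused by the program

treated, control = [], []
for _ in range(10_000):
    ability = random.gauss(0, 1)
    # Assumption: assignment by coin flip, unrelated to ability.
    assigned = random.random() < 0.5
    earnings = (10 + 3 * ability + random.gauss(0, 1)
                + (TRUE_EFFECT if assigned else 0.0))
    (treated if assigned else control).append(earnings)

# With comparable groups, the simple difference in means is a
# reasonable estimate of the program impact.
estimate = mean(treated) - mean(control)

print("estimated impact:", round(estimate, 2))
print("true effect:     ", TRUE_EFFECT)
```

Because ability is balanced across the two groups on average, the difference in mean earnings lands close to the assumed true effect, up to sampling noise.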