I’m Laura Peck, recovering professor and now full-time evaluator with Abt Associates. For many years I taught graduate Research Methods and Program Evaluation courses. One part I enjoyed most was introducing students to the concepts of causality, internal validity and the counterfactual – summarized here as hot tips.
#1: What is causality?
Correlation is not causation. For an intervention to cause a change in outcomes, the two must be associated, and the intervention must temporally precede the change in outcomes. These two criteria are necessary but not sufficient. The third criterion, which together with the first two is sufficient, is that no plausible rival explanation can take credit for the change in outcomes.
#2: What is internal validity? And why is it threatened?
In evaluation parlance, these “plausible rival explanations” are known as “threats to internal validity.” Internal validity refers to an evaluation design’s ability to establish that causal connection between intervention and impact. As such, the threats to internal validity are those factors in the world that might, independently of your program, explain a change in outcomes that you think your program achieved. For example, children mature and learn simply by exposure to the world, so how much of an improvement in their reading is due to your tutoring program as opposed to their other experiences and maturation processes? Another example is job training that assists unemployed people: one cannot be any less employed than unemployed, and so “regression to the mean” implies that some people will improve (get jobs) regardless of the training. These two “plausible rival explanations” are known as the “threats to internal validity” of maturation and regression artifact. Along with selection bias and historical explanations (recession, election, national mood swings), these can claim credit for changes in outcomes observed in the world, regardless of what interventions try to do to improve conditions.
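A small simulation can make these two threats concrete. The sketch below is only an illustration under made-up assumptions: the test-score scale, the 5-point maturation gain, the enrollment cutoff of 40, and the function name `pre_post_gain_without_program` are all hypothetical. It enrolls low scorers in a "program" that does nothing at all and then measures the naive before-and-after gain.

```python
import random
import statistics


def pre_post_gain_without_program(n=100_000, maturation=5.0,
                                  cutoff=40.0, seed=0):
    """Average before-and-after gain when the program has NO effect.

    All numbers are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    gains = []
    for _ in range(n):
        ability = rng.gauss(50, 10)       # stable underlying skill
        pre = ability + rng.gauss(0, 5)   # noisy baseline measurement
        if pre >= cutoff:
            continue                      # only low scorers are enrolled
        # Follow-up score: maturation plus fresh noise, zero program effect.
        post = ability + maturation + rng.gauss(0, 5)
        gains.append(post - pre)
    return statistics.mean(gains)


# Everyone enrolled: the gain reflects maturation alone.
gain_everyone = pre_post_gain_without_program(cutoff=float("inf"))
# Only low scorers enrolled: regression to the mean adds to the gain.
gain_low_scorers = pre_post_gain_without_program()

print(f"Gain, everyone enrolled:    {gain_everyone:.2f}")
print(f"Gain, low scorers enrolled: {gain_low_scorers:.2f}")
```

In this sketch both gains are positive even though the program contributes nothing, and the gain is larger when enrollment targets low scorers. A before-and-after comparison alone would hand all of that credit to the program.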
#3: Why I stopped worrying and learned to love the counterfactual.
I want interventions to be able to take credit for improving outcomes, when in fact they do. That is why I like randomization. Randomizing individuals or classes or schools or cities to gain access to an intervention, and randomizing some not to gain access, provides a reliable “counterfactual.” In evaluation parlance, the “counterfactual” is what would have happened in the absence of the intervention. Having a group that is randomized out (e.g., to experience business as usual) means that it experiences all the same historical, selection, regression-to-the-mean, and maturation forces as those who are randomized in. As such, the difference between the two groups’ outcomes represents the program’s impact.
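To see why the randomized-out group works as a counterfactual, here is a minimal simulation sketch. Everything in it is an assumption made for illustration: the score scale, the 5-point maturation gain, the true program effect of 2 points, the enrollment cutoff, and the function name `simulate_randomized_trial` are hypothetical. Low scorers are enrolled, a coin flip decides who gets the program, and the naive before-and-after gain is compared with the treatment-versus-control difference.

```python
import random
import statistics


def simulate_randomized_trial(n=100_000, maturation=5.0, program_effect=2.0,
                              cutoff=40.0, seed=1):
    """Compare a naive pre-post gain with a randomized impact estimate.

    All numbers are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    treated_pre, treated_post, control_post = [], [], []
    for _ in range(n):
        ability = rng.gauss(50, 10)       # stable underlying skill
        pre = ability + rng.gauss(0, 5)   # noisy baseline measurement
        if pre >= cutoff:
            continue                      # only low scorers are eligible
        # Both groups mature and regress toward the mean in the same way.
        post = ability + maturation + rng.gauss(0, 5)
        if rng.random() < 0.5:            # coin flip: randomized in
            treated_pre.append(pre)
            treated_post.append(post + program_effect)
        else:                             # randomized out: business as usual
            control_post.append(post)
    naive_gain = statistics.mean(treated_post) - statistics.mean(treated_pre)
    impact = statistics.mean(treated_post) - statistics.mean(control_post)
    return naive_gain, impact


naive_gain, impact = simulate_randomized_trial()
print(f"Naive pre-post gain among the treated: {naive_gain:.2f}")
print(f"Treatment minus control difference:    {impact:.2f}")
```

Because the control group is exposed to the same maturation and regression-to-the-mean forces, those forces cancel out of the treatment-minus-control difference, which should land close to the assumed program effect. The naive gain, by contrast, bundles the program effect together with everything else that happened.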
As a professor, I would challenge my students to use the word “counterfactual” at social gatherings. Try it! You’ll be the life of the party.
For additional elaboration on these points, please read my Why Randomize? Primer.
The American Evaluation Association is celebrating the Design & Analysis of Experiments TIG Week. The contributions all week come from Experiments TIG members. aea365 is sponsored by the American Evaluation Association and provides a Tip-a-Day by and for evaluators.