Experiments TIG Week: What Can Experimental Evaluations Tell Us? And Why We Should Not Be So Doubtful About What They Won’t Tell Us by Laura Peck
Hello AEA365 readers! I am Laura Peck, founder and co-chair of the AEA’s recently-established (and growing) Design & Analysis of Experiments TIG. I work at Abt Associates as an evaluator in the Social & Economic Policy Division and director of Abt’s Research & Evaluation Expertise Center. Today’s AEA365 blogpost recaps what experimental evaluations typically tell us and highlights recent research that helps tell us more.
As noted yesterday, dividing eligible program participants randomly into groups (a “treatment group” that gets the intervention and a “control group” that does not) means the difference in the groups’ average outcomes is the intervention’s “impact.” This is the “average treatment effect” of the “intent to treat” (ITT). The ITT is the effect of the offer of treatment, regardless of whether those offered “take up” the offer. There can also be interest in (a) the effect of taking up the offer; and (b) the impact of other, post-randomization milestone events within the overall treatment. These are two areas where pushing experimental evaluation data can tell us more.
The ITT effect is commonly considered to be the most policy relevant: in a world where program sponsors don’t mandate participation but instead make services available, the ITT captures the average effect of making the offer.
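To make the ITT concrete, here is a minimal sketch of the calculation: the mean outcome of everyone assigned to the treatment group minus the mean outcome of everyone assigned to the control group. The function name and the earnings figures are hypothetical, made up for illustration, and a real evaluation would also report standard errors and typically adjust for baseline covariates.

```python
from statistics import mean


def itt_estimate(treatment_outcomes, control_outcomes):
    """Difference in mean outcomes by random assignment status.

    Everyone assigned to treatment counts toward the treatment mean,
    whether or not they took up the offer.
    """
    return mean(treatment_outcomes) - mean(control_outcomes)


# Hypothetical annual earnings (in dollars); not data from any real study.
treatment = [21000, 18500, 25000, 19500, 23000]
control = [20000, 18000, 23500, 19000, 21500]

itt = itt_estimate(treatment, control)
print(f"ITT estimate: ${itt:,.0f}")
```

Because the comparison is by assignment rather than by participation, no-shows stay in the treatment group's average, which is what makes this the effect of the offer.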
When the interest is instead in the effect of taking up the offer, a widely accepted approach exists for converting the ITT into the effect of the treatment-on-the-treated (TOT). The ITT can be rescaled by the participation rate, under the assumption that members of the treatment group who do not participate (“no-shows”) experience none of the program’s impact. For example, if the ITT estimate shows an improvement of $1,000 in earnings in a study where 80% of the treatment group took up the training, then the TOT effect would be $1,250 for the average participant ($1,000 divided by 0.80).
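The rescaling can be written as a short sketch. The function name is hypothetical, and the calculation relies on the no-show assumption stated above; this sketch also assumes that no control group members receive the treatment.

```python
def tot_from_itt(itt, take_up_rate):
    """Rescale an ITT estimate by the treatment group's participation rate.

    Assumes no-shows experience none of the program's impact, so the
    whole ITT effect is attributed to those who took up the offer.
    """
    if not 0 < take_up_rate <= 1:
        raise ValueError("take_up_rate must be in the interval (0, 1]")
    return itt / take_up_rate


# The example from the text: a $1,000 ITT with 80% take-up.
tot = tot_from_itt(1000, 0.80)
print(f"TOT estimate: ${tot:,.0f}")
```

The lower the take-up rate, the larger the gap between the ITT and the TOT, since the same average effect is spread over fewer participants.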
In addition, an active body of research advances methods for understanding mediators—those things that happen after the point of randomization that subsequently influence program impact. For example, although improving earnings may be a job training program’s ultimate goal, we might want to know whether earning a credential generates additional earnings gains. Techniques that leverage the experimental design to produce strong estimates of the effect of some mediator include: capitalizing on cross-site and cross-participant variation, instrumental variables (including principal stratification), propensity score matching, and analysis of symmetrically-predicted endogenous subgroups (ASPES). These use existing experimental data and increasingly are being planned into evaluations.
From this examination of the challenge of the day, we conclude that social experiments can provide useful information on the effects of participation and the effects of post-randomization events in addition to the standard (ITT) average treatment effect.
Up for discussion tomorrow: are the counterfactual conditions that experiments create the right ones for policy comparisons?
The American Evaluation Association is celebrating the Design & Analysis of Experiments TIG Week. The contributions all week come from Experiments TIG members. aea365 is sponsored by the American Evaluation Association and provides a Tip-a-Day by and for evaluators.