Experiments TIG Week: Laura Peck on The Origins and Meaning of the “Black Box” Label and How Innovative Experimental Research is Working to Shake It

Hi again, it’s Laura Peck here, that evaluator from Abt Associates. To close out the Design & Analysis of Experiments TIG’s first week of contributions to the AEA365 blog, I focus on one of the main critiques of experimental evaluations, that known as the “black box” criticism.

Experimentally designed evaluations can isolate the impact of an intervention: when an experiment is appropriately implemented, the difference between the treatment and control group outcomes (the impact) cannot be attributed to other forces (see Tuesday’s blogpost). But in experiments, what the program itself consists of is often referred to as a “black box,” a total unknown.

Good evaluations (all those that I am involved in) couple implementation analysis with impact analysis as a way to respond to the criticism that impact analysis alone cannot tell us what happens inside an intervention. That implementation analysis exposes what is inside the “black box” so that the impact analysis can use that information in interpreting its results: for large impacts, what happened in the intervention that explains why they arose; for null impacts, what did not happen in the intervention to explain why they did not.

Although implementation evaluation remains valuable in and of itself, experimental impact evaluations increasingly build on design and analytic innovations to answer “black box” questions on their own.

On the design front, for example, evaluations are adding treatment arms to isolate the effect of program variants. The Family Options Study (funded by the U.S. Department of Housing and Urban Development) randomized families to four arms to test the relative effectiveness of various treatment models for homeless families. Although the income tax experiments of the 1970s used a fractional factorial design, few evaluations have followed that lead. Now funders have a renewed interest in factorial designs to help answer some of those “black box” questions (Solmeyer & Constance, 2015).
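To make the factorial idea concrete, here is a minimal sketch of how a full 2x2 factorial experiment could separate the contributions of two program components. It is an illustration only: the components "A" and "B", the function name `factorial_effects`, and the outcome values are all hypothetical, and a fractional factorial design like the one used in the income tax experiments would assign only a subset of the possible combinations.

```python
from statistics import mean


def factorial_effects(records):
    """Estimate main effects and the interaction from a 2x2 factorial experiment.

    Each record is (has_a, has_b, outcome), where has_a and has_b are booleans
    marking which hypothetical program components the randomly assigned arm
    received.
    """
    # Average outcome within each of the four randomly assigned arms.
    cell = {
        (a, b): mean(y for ra, rb, y in records if ra == a and rb == b)
        for a in (False, True)
        for b in (False, True)
    }
    # Main effect of A: the effect of adding A, averaged over both levels of B.
    effect_a = ((cell[True, False] - cell[False, False])
                + (cell[True, True] - cell[False, True])) / 2
    # Main effect of B: the effect of adding B, averaged over both levels of A.
    effect_b = ((cell[False, True] - cell[False, False])
                + (cell[True, True] - cell[True, False])) / 2
    # Interaction: how much the effect of A changes when B is also present.
    interaction = ((cell[True, True] - cell[False, True])
                   - (cell[True, False] - cell[False, False]))
    return {"A": effect_a, "B": effect_b, "AxB": interaction}


# Made-up outcomes for illustration; these are not data from any real study.
records = [
    (False, False, 10.0), (False, False, 12.0),  # neither component
    (True, False, 14.0), (True, False, 16.0),    # component A only
    (False, True, 12.0), (False, True, 14.0),    # component B only
    (True, True, 19.0), (True, True, 21.0),      # both components
]
result = factorial_effects(records)
print(result)
```

Because every arm is randomly assigned, each contrast between cells is itself an experimental estimate, which is what lets a factorial design say something about which component drives the overall impact.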

On the analysis front, substantial methodological advancements capitalize on experimental data to help inform what’s inside that black box: what is it about programs and their participants that drives program impacts? Advances in propensity score methods, instrumental variable estimation, and principal stratification-based analyses are expanding what evaluators have in their toolkit (find more detail here and here).

Rad Resources:

Perhaps the first published reference to “black box” appeared in a 1992 Institute for Research on Poverty discussion paper, “Prying the Lid from the Black Box” by David Greenberg, Rob Meyer and Mike Wiseman (although one of the authors credits Larry Orr for using the “black box” term before then).
The American Journal of Evaluation’s 2015 volume 36 is fully dedicated to opening the “black box,” and New Directions for Evaluation’s 2016 issue 152 considers design and analytic innovations for Social Experiments in Practice.

The American Evaluation Association is celebrating the Design & Analysis of Experiments TIG Week. The contributions all week come from Experiments TIG members. Do you have questions, concerns, kudos, or content to extend this aea365 contribution? Please add them in the comments section for this post on the aea365 webpage so that we may enrich our community of practice. Would you like to submit an aea365 Tip? Please send a note of interest to aea365@eval.org . aea365 is sponsored by the American Evaluation Association and provides a Tip-a-Day by and for evaluators.
