AEA365 | A Tip-a-Day by and for Evaluators


Keith Zvoch here. I am an Associate Professor at the University of Oregon. In this post, I would like to discuss regression discontinuity (RD) and interrupted time series (ITS) designs, two strong and practical alternatives to the randomized controlled trial (RCT).

Cool Trick: Take Advantage of Naturally Occurring Design Contexts

Evaluators are often charged with investigating programs or policies in situations where need-based assignment to conditions is required. In these contexts, separating program effects from participants' background, maturational, and motivational characteristics is challenging. However, if performance on a preprogram measure allocates program services (RD), or if repeated measurements of an outcome exist prior to and contiguous with the intervention (ITS), then evaluators can draw on the associated design frameworks to strengthen inference regarding program impact.

Lessons Learned: Design Practicality and Strength vs. Analytic Rigor

RD designs derive strength from knowledge of the selection mechanism used to assign individuals to treatment, whereas ITS designs leverage the timing of a change in policy or practice to facilitate rigorous comparison of adjacent developmental trends. Although procedurally distinct, the designs are conceptually similar in that a specific point along a continuum serves as the basis for the counterfactual. In RD designs, effects are revealed when a discontinuity in the relationship between the assignment score and the outcome appears at the cutpoint used to allocate program services. In ITS designs, effects are identified when the level or slope of the intervention-phase time series deviates from the pre-intervention trend.
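To make the identification logic concrete, here is a minimal sketch of how a sharp RD discontinuity and an ITS level/slope change are commonly estimated with ordinary least squares. The simulated data, cutoff, and variable names are illustrative assumptions, not taken from the post; in practice both models require careful attention to functional form and, for ITS, to autocorrelation.

```python
# Illustrative sketch only: simulated data with assumed effect sizes.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)

# --- Sharp RD: services assigned when a pretest falls below a cutoff ---
n, cutoff = 500, 50.0
pretest = rng.uniform(0, 100, n)
treated = (pretest < cutoff).astype(float)            # need-based assignment rule
outcome = 20 + 0.5 * pretest + 5.0 * treated + rng.normal(0, 3, n)  # true jump = 5

centered = pretest - cutoff
X_rd = sm.add_constant(np.column_stack([treated, centered, treated * centered]))
rd_fit = sm.OLS(outcome, X_rd).fit()
print("RD estimate of the discontinuity at the cutpoint:", rd_fit.params[1])

# --- ITS: level and slope change after an intervention point ---
t = np.arange(48)                                      # e.g., 48 monthly observations
post = (t >= 24).astype(float)                         # intervention begins at month 24
y = 10 + 0.2 * t + 3.0 * post + 0.4 * post * (t - 24) + rng.normal(0, 1, 48)

X_its = sm.add_constant(np.column_stack([t, post, post * (t - 24)]))
its_fit = sm.OLS(y, X_its).fit()
print("ITS level change:", its_fit.params[2], " ITS slope change:", its_fit.params[3])
```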

RD and ITS designs are particularly appropriate for program evaluators as they are minimally intrusive and consistent with the need-based provisioning of limited resources often found in applied service contexts. Nonetheless, strength of inference in both designs depends on treatment compliance, the absence of a spurious relationship coincident with the cutpoint, and statistical conclusion validity in modeling the functional form of pre-post relationships. In many cases, inferential strength can be further enhanced by incorporating other naturally occurring design elements (e.g., multiple cutpoints, multiple treatment replications) or by drawing on administrative datasets to construct additional comparison or control groups.

The need for more extensive data collection and sensitivity analyses may of course present a nontrivial challenge in some evaluation contexts, but when considered relative to the practical and ethical difficulties that often surround implementation of a field-based RCT, an increase in analytic rigor will often prove an acceptable trade-off.

Hot Tip: Follow Within-Study Design (WSD) research! WSD involves head-to-head comparisons of experimental and quasi-experimental designs, including the conditions under which selected quasi-experimental evaluation designs can replicate experimental results.

Rad Resource: This article explores how to use additional design elements to improve inference from RD designs.

The American Evaluation Association is celebrating the Design & Analysis of Experiments TIG Week. The contributions all week come from Experiments TIG members. Do you have questions, concerns, kudos, or content to extend this aea365 contribution? Please add them in the comments section for this post on the aea365 webpage so that we may enrich our community of practice. Would you like to submit an aea365 Tip? Please send a note of interest to aea365@eval.org . aea365 is sponsored by the American Evaluation Association and provides a Tip-a-Day by and for evaluators.

· · · ·

I am Melinda Davis, a Research Assistant Professor in Psychology at the University of Arizona. I coordinate the Program Evaluation and Research Methods minor and serve as Editor-in-Chief of the Journal of Methods and Measurement in the Social Sciences. In an ideal world, evaluation studies compare two groups that differ only in treatment assignment. Unfortunately, there are many ways that a comparison group can differ from the intervention group.

Lesson Learned: As evaluators, we conduct experiments in order to examine the effects of potentially beneficial treatments.  We need control groups in order to evaluate the effects of treatments. Participants assigned to a control group usually receive a placebo intervention or the status quo intervention (business-as-usual). Individuals who have been assigned to a treatment-as-usual control group may refuse randomization, drop out during the course of the study, or obtain the treatment on their own.  It can be quite challenging to create a plausible placebo condition, or what evaluators call the “counterfactual” condition, particularly for a social services intervention.  Participants in a placebo condition may receive a “mock” intervention that differs in the amount of time, attention, or desirability, all of which can result in differential attrition or attitudes about the effectiveness of the treatment.  At the end of a study, evaluators may not know if an observed effect is due to time spent, attention received, participant satisfaction, group differences resulting from differential dropout rates, or the active component of treatment.  Many threats to validity can appear as problems with the control group, such as maturation, selection, differential loss of respondents across groups, and selection-maturation interactions (see Shadish, Cook and Campbell, 2002).

Cool Trick: Shadish, Clark, and Steiner (2008) demonstrate an elegant approach to the control group problem. Although the focus of their study was not control group issues, their doubly randomized preference trial (DRPT) included a well-designed control group. Participants were first randomly assigned to one of two arms: in one arm, they were randomly assigned to math or vocabulary training; in the other, they chose which type of instruction to receive.

The evaluators collected math and vocabulary outcomes for all participants throughout the study. The effects of the vocabulary intervention on the vocabulary outcome, the effects of the mathematics intervention on the mathematics outcome, and changes across the treated versus untreated conditions could then be compared, taking covariates into account. This design allowed the evaluators to separate the effects of participant self-selection from the effects of treatment on the outcomes.
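To illustrate the logic of the DRPT (with simulated data and assumed effect sizes, not the study's actual analysis or results), the sketch below contrasts the experimental benchmark from the randomized arm with unadjusted and covariate-adjusted estimates from the self-selection arm.

```python
# Illustrative sketch of the doubly randomized preference trial (DRPT) logic.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 1000

# Stage 1: randomly assign participants to the randomized arm or the choice arm.
arm_random = rng.random(n) < 0.5

# A covariate (e.g., preference for math) that drives both choice and outcomes.
pref = rng.normal(0, 1, n)

# Stage 2: treatment = math training (1) vs vocabulary training (0).
chose_math = np.where(arm_random,
                      rng.random(n) < 0.5,                      # random assignment
                      rng.random(n) < 1 / (1 + np.exp(-pref)))  # self-selection

# Math outcome depends on treatment and on the covariate.
math_score = 50 + 6.0 * chose_math + 4.0 * pref + rng.normal(0, 5, n)

def ols_effect(mask, adjust):
    """Treatment coefficient from OLS within the selected arm."""
    cols = [chose_math[mask].astype(float)]
    if adjust:
        cols.append(pref[mask])
    X = sm.add_constant(np.column_stack(cols))
    return sm.OLS(math_score[mask], X).fit().params[1]

print("Experimental benchmark (randomized arm):", ols_effect(arm_random, adjust=False))
print("Self-selected arm, unadjusted:          ", ols_effect(~arm_random, adjust=False))
print("Self-selected arm, covariate-adjusted:  ", ols_effect(~arm_random, adjust=True))
```

Comparing the adjusted quasi-experimental estimate with the experimental benchmark is the core of the within-study comparison idea discussed above.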

As evaluators, it is helpful to be aware of potential threats to validity and novel study designs that we can use to reduce such threats.


· · ·

This is part of a series remembering and honoring evaluation pioneers leading up to Memorial Day in the USA on May 30.

I am Mel Mark, a former AEA President and former editor of the American Journal of Evaluation. Don Campbell used pithy phrases to communicate complex philosophical or methodological issues. My favorite was: “Cousin to the amoeba, how can we know for certain?” It encapsulates his philosophy of science, which informed his contributions to evaluation.

Pioneering and enduring contributions:

Campbell’s pioneering contributions included work on bias in social perception, intergroup stereotyping, visual illusions, measurement, research design and validity, and evaluation, which was at the center of his vision of “an experimenting society.” He believed in the evolution of knowledge through learning: “In science we are like sailors who must repair a rotting ship while it is afloat at sea. We depend on the relative soundness of all other planks while we replace a particularly weak one. Each of the planks we now depend on we will in turn have to replace. No one of them is a foundation, nor point of certainty, no one of them is incorrigible.”

Donald T. Campbell

Campbell’s work reminds us that every approach to evaluation is founded on epistemological assumptions and that being explicit about those assumptions, and their implications, is part of our responsibility as evaluators. Campbell wanted science, and evaluation, to keep truth as the goal: testing and inferring what is real in the world. But he acknowledged this goal as unattainable, so “we accept a . . . surrogate goal of increasing coherence even if we regard this as merely our best available approximation of the truth.”

Campbell was an intellectual giant but disarmingly modest. He was gracious and helpful to students and colleagues, and equally gracious to his critics. His openness to criticism and self-criticism modeled his vision of a “mutually monitoring, disputatious community of scholars.” Those who knew Don Campbell know with all the certainty allowed to humans just how special he was.

Reference for quotations:

Mark, M. M. (1998). The philosophy of science (and of life) of Donald T. Campbell. American Journal of Evaluation, 19(3), 399-402.

Resources:

Bickman, L., Cook, T. D., Mark, M. M., Reichardt, C. S., Sechrest, L., Shadish, W. R., & Trochim, W. M. K. (1998). Tributes to Donald T. Campbell. American Journal of Evaluation, 19(3), 397-426.

Brewer, M. B., & Collins, B. E. (Eds.). (1981). Scientific inquiry and the social sciences: A volume in honor of Donald T. Campbell. Jossey-Bass.

Campbell, D. T. (1994). Retrospective and prospective on program impact assessment. American Journal of Evaluation, 15(3), 291-298.

Campbell, D.T. & Russo, J. (2001). Social measurement. Sage.

The American Evaluation Association is celebrating Memorial Week in Evaluation: Remembering and Honoring Evaluation’s Pioneers. The contributions this week are remembrances of evaluation pioneers who made enduring contributions to our field.

· ·

Hello, I’m Shirah Hecht, Ph.D., Program Evaluator with Academic Technology Services within the Harvard University Info Tech area. Here is a simple “trick” for beginning to develop a research design.

I call this “system mapping.” You may connect it to stakeholder analysis or concept mapping, since it blends the two in a way – but it goes a bit further than either, for research purposes. It comes from a simple suggestion given to me by my graduate school mentor, Howard S. Becker, who taught qualitative field methods at Northwestern University. He credited the sociologist Everett C. Hughes with this method.

Essentially, the technique is to identify a central event or person, then radiate out from there to consider all the constituencies or positions or groups that connect to that central event or person.  This is a way of jump-starting your thinking about what the relevant data sources might be and to identify questions about your central topic.

For example, in education, the central event might be the classroom; the radiating circles might identify students, teachers, parents, and administrators, among others.  Alternatively, the central circle might hold the student as a central person; the radiating circles then might include the parents, teachers, other students, guidance counselors, testing agencies, etc.

After identifying these outer circles, you can pose relevant questions such as:

  • What is the perspective of each constituency on the central event or person?  What matters to them?  What is their investment in this process or person?
  • At what points do they interact with the central event, for the purposes of my research questions?
  • What “data” might they hold, whether in terms of process or perspective, to define or address my research questions?
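If it helps to keep the map and these questions organized, here is a small, purely illustrative sketch that turns the radiating circles into a research-planning checklist. The center, constituencies, and question wording are hypothetical stand-ins, not taken from Hecht’s actual evaluation plan.

```python
# Illustrative sketch: a system map as a simple planning checklist.
center = "classroom"
constituencies = ["students", "teachers", "parents", "administrators"]

prompts = [
    "What is their perspective on the {center}? What matters to them?",
    "At what points do they interact with the {center}?",
    "What data might they hold to define or address the research questions?",
]

for group in constituencies:
    print(f"\n{group.title()} (radiating out from the {center}):")
    for prompt in prompts:
        print("  -", prompt.format(center=center))
```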

This process also fits in nicely with developing a logic model with the program provider, to develop an evaluation project.  Even if you are not logic model-bound, it can frame a good conversation and understanding of the research planning process and the final decisions about data collection.

Here is a simple version of this map, generalized from a program for which I developed an evaluation plan.  The green highlights indicate the data collection sources: a focus group with program volunteers and a survey of clients.


Lessons Learned: In research planning, move from the perspective of the constituency to specific research questions for the project.

Hot Tip: Combine this mapping process with the Tearless Logic Model process, to jumpstart the conversation about research plans with program staff. 

Rad Resources: Everett C. Hughes offers the sociological eye on any and all processes we might want to research: The Sociological Eye: Selected Papers (Social Science Classics).

The Tearless Logic Model


· ·
