AEA365 | A Tip-a-Day by and for Evaluators


Keith Zvoch here. I am an Associate Professor at the University of Oregon. In this post, I would like to discuss regression discontinuity (RD) and interrupted time series (ITS) designs, two strong and practical alternatives to the randomized controlled trial (RCT).

Cool Trick: Take Advantage of Naturally Occurring Design Contexts

Evaluators are often charged with investigating programs or policies in situations where need-based assignment to conditions is required. In these contexts, separating program effects from the background, maturational, and motivational characteristics of program participants is challenging. However, if performance on a preprogram measure allocates program services (RD), or if repeated measurements of an outcome exist prior to and contiguous with the intervention (ITS), then evaluators can draw on the associated design frameworks to strengthen inference regarding program impact.

Lessons Learned: Design Practicality and Strength vs. Analytic Rigor

RD designs derive strength from knowledge of the selection mechanism used to assign individuals to treatment, whereas ITS designs leverage the timing of a change in policy or practice to facilitate rigorous comparison of adjacent developmental trends. Although procedurally distinct, the designs are conceptually similar in that a specific point along a continuum serves as the basis for the counterfactual. In RD designs, effects are revealed when a discontinuity in the relationship between the assignment score and the outcome appears at the cutpoint used to allocate program services. In ITS designs, effects are identified when the level or slope of the post-intervention time series deviates from the pre-intervention trend.
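To make the two estimation strategies concrete, here is a minimal sketch in Python (statsmodels) using simulated data; the cutoff, sample size, and effect sizes below are illustrative assumptions, not values from the post.

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    # --- Sharp RD sketch: a pretest score below a known cutoff allocates services ---
    rng = np.random.default_rng(42)
    n = 500
    pretest = rng.uniform(0, 100, n)
    cutoff = 50
    treated = (pretest < cutoff).astype(int)          # need-based assignment rule
    outcome = 20 + 0.4 * pretest + 5 * treated + rng.normal(0, 4, n)

    df = pd.DataFrame({
        "outcome": outcome,
        "treated": treated,
        "centered": pretest - cutoff,                 # center the running variable at the cutpoint
    })

    # The coefficient on 'treated' estimates the discontinuity (program effect) at the cutpoint.
    rd_model = smf.ols("outcome ~ treated + centered + treated:centered", data=df).fit()
    print(rd_model.params["treated"])

    # --- ITS sketch: segmented regression on a monthly series interrupted at time t0 ---
    t = np.arange(48)
    t0 = 24
    post = (t >= t0).astype(int)
    y = 10 + 0.2 * t + 3.0 * post + 0.5 * post * (t - t0) + rng.normal(0, 1, t.size)
    its = pd.DataFrame({"y": y, "t": t, "post": post, "t_since": post * (t - t0)})

    # 'post' captures a level shift and 't_since' a slope change after the interruption.
    its_model = smf.ols("y ~ t + post + t_since", data=its).fit()
    print(its_model.params[["post", "t_since"]])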

RD and ITS designs are particularly appropriate for program evaluators as they are minimally intrusive and are consistent with the need-based provisioning of limited resources often found in applied service contexts. Nonetheless, it should be noted that strength of inference in both designs depends on treatment compliance, the absence of a spurious relationship coinciding with the cutpoint, and statistical conclusion validity in modeling the functional form of pre-post relationships. In many cases, inferential strength can be further enhanced by incorporating other naturally occurring design elements (e.g., multiple cutpoints, multiple treatment replications) or by drawing on administrative datasets to construct additional comparison or control groups.
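As one illustrative sensitivity check (continuing the hypothetical RD sketch above; the placebo cutoffs here are arbitrary assumptions), the discontinuity can be re-estimated at fake cutpoints where no assignment actually occurred. A sizable "jump" at a placebo cutpoint would suggest a misspecified functional form or a spurious break near the real cutpoint.

    # Placebo-cutpoint check: estimate a 'jump' at cutoffs where services were not allocated.
    for placebo in (35, 65):
        side = df[df["treated"] == int(placebo < cutoff)].copy()   # stay on one side of the real cutoff
        side["p_treated"] = ((side["centered"] + cutoff) < placebo).astype(int)
        side["p_centered"] = side["centered"] + cutoff - placebo
        placebo_model = smf.ols("outcome ~ p_treated + p_centered", data=side).fit()
        print(placebo, round(placebo_model.params["p_treated"], 2))   # should be near zero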

The need for more extensive data collection and sensitivity analyses may of course present a nontrivial challenge in some evaluation contexts, but when considered relative to the practical and ethical difficulties that often surround implementation of a field-based RCT, an increase in analytic rigor will often prove an acceptable trade-off.

Hot Tip: Follow Within-Study Design (WSD) research! WSD involves head-to-head comparisons of experimental and quasi-experimental designs, including the conditions under which selected quasi-experimental evaluation designs can replicate experimental results.

Rad Resource: This article explores how to use additional design elements to improve inference from RD designs.

The American Evaluation Association is celebrating the Design & Analysis of Experiments TIG Week. The contributions all week come from Experiments TIG members. Do you have questions, concerns, kudos, or content to extend this aea365 contribution? Please add them in the comments section for this post on the aea365 webpage so that we may enrich our community of practice. Would you like to submit an aea365 Tip? Please send a note of interest to aea365@eval.org . aea365 is sponsored by the American Evaluation Association and provides a Tip-a-Day by and for evaluators.

· · · ·

I’m Allan Porowski, a Principal Associate at Abt Associates and a fan of experiments (when they’re conducted under the right circumstances). Experiments, commonly referred to as RCTs (randomized controlled trials), go through three stages: (1) crazy start-up period, (2) normal data collection period, and (3) crazy analysis period.

Hot Tips:  Here are some tips to make that start-up period less crazy:

  • Don’t Fall in Love with the Method: Too often, evaluators try to force a given method to fit reality instead of using it to measure reality. Even though we may really want to conduct an RCT, it may not be appropriate. Experiments are not appropriate for new initiatives because those initiatives may not yet have excess demand for services, the necessary data collection infrastructure, a randomization-accommodating intake process, or staff buy-in. If these criteria are not met, then the program is not ready to be tested experimentally.
  • Be Forward-Looking by Working Backwards: There’s no substitute for in-person planning sessions to hammer out evaluation details. A half day (or better yet, a full day) on-site is needed, and you’ll need a big whiteboard. It helps to start with a discussion of what the site hopes to learn and to design the study to meet those goals. Starting with the big picture and moving into the details also gets the conversation off to a more productive start than diving straight into the nuances of randomization.
  • Know Your Audience, and Let Them Know You: Don’t forget that when conducting an RCT, you are asking staff to replace professional judgment with a completely random process. That’s not an easy proposition to make. It’s really important to convey your understanding that RCTs can be disruptive, and to explain what can be done to minimize that disruption. Likewise, teach program staff to think like an evaluator: get them involved in formulating research questions, identifying mediators, and developing hypotheses about the relationship between program services and outcomes. Keep in mind that nodding does not equal understanding. RCTs are not intuitive to most people, including many researchers, so take the time to explain study procedures in multiple ways.
  • Pressure-Test Your Sampling Frame: RCTs are often knocked for lacking generalizability, and unfortunately, that criticism is often warranted. Did you just recruit a bunch of sites that only serve left-handed kids in Boston? Recruitment is tough, but it’s even tougher to make a case that results are generalizable when your sampling frame doesn’t represent the program participants you’re studying.

Rad Resource:  Key Items To Get Right When Conducting a Randomized Controlled Trial in Education. Though over 10 years old, the advice is timeless.


· ·


Joseph E. Bauer on Observational Studies

Hi, I’m Joseph E. Bauer, Director of Survey Research & Evaluation in the Statistics & Evaluation Center (SEC) for the American Cancer Society (ACS) in Atlanta, Georgia. I am in my eleventh year as an internal evaluator. I am the former Chair of the Organizational Learning and Evaluation Capacity Building (OL-ECB) TIG, and am currently on our Leadership Team.

Lesson Learned: Observational studies are a broad class of research designs found across many fields, from medical research to health care and health policy, health promotion, and social and behavioral research. Quite often this type of research is framed as ‘inferior’ or ‘weak’ compared to randomized controlled trials (RCTs). That’s because the data in these studies (often convenience samples) threaten the validity of causal inferences: nonrandom treatment assignment introduces selection bias, among other kinds of bias. On balance, observational studies yield relatively ‘low-grade evidence’ and limited ability to make valid causal statements or to generalize findings to a wider population. However, this does not mean observational studies are worthless. They can and do provide understanding and insight into the human condition, are generally easier to implement, generate results more quickly, are less expensive, and are useful for generating hypotheses to be followed up with more rigorous study designs.

Interestingly, the same kinds of biases and weaknesses can occur in RCTs as well, especially those designed as efficacy studies. RCTs designed as effectiveness studies (also called pragmatic or practical studies) can control for some of those biases and weaknesses. So, be careful and thorough in thinking through study designs for your own research and/or evaluations. Every design has strengths and weaknesses; it is not an ‘either/or’ problem. We are often trying to balance efficacy (does the treatment work?) with effectiveness (does the treatment work in the ‘real world’ for different kinds of people?). Where one chooses to calibrate that balance depends on a number of factors, including the philosophy of your design. Both observational studies and RCTs can be valuable. The key word is transparency.

Hot Tips: While this short piece will not answer all the questions you may have about observational studies or randomized controlled trials, it will hopefully lead you to address the larger issue: the need to be transparent in reporting the methods used in the conduct of research or evaluation. Refer to the STROBE Statement (Strengthening The Reporting of OBservational studies in Epidemiology) and CONSORT (CONsolidated Standards Of Reporting Trials).

Rad Resources: The American Evaluation Association’s Guiding Principles for Evaluators, which are intended to guide the professional practice of evaluators.

Do you have questions, concerns, kudos, or content to extend this aea365 contribution? Please add them in the comments section for this post on the aea365 webpage so that we may enrich our community of practice. Would you like to submit an aea365 Tip? Please send a note of interest to aea365@eval.org . aea365 is sponsored by the American Evaluation Association and provides a Tip-a-Day by and for evaluators.
