Gov’t Eval TIG Week: Lauren Supplee on Replication and Trust in Government Evaluation

My name is Lauren Supplee and I work in the Office of Planning, Research and Evaluation at the Administration for Children and Families. Recent media and academic attention to transparency, replication and trust in science, and to the lack of replication of findings in medical research and psychology, raises issues for evaluation, as seen in articles in Nature Medicine and The Guardian. While evaluators can debate the concept of replication, one of its core issues is trust in the evidence that evaluation generates, since that trust conditions whether the evidence is used in policy or practice. As an evaluator I know that the perceived utility of my work to policy and practice is only as strong as the user’s trust in my findings.

While the evaluation field can’t address every aspect of the public’s trust in research and evaluation, we can proactively build confidence and trust in the design, analysis and interpretation of findings.

Hot Tips: Registering studies: A colleague and I recently wrote a commentary on the Society for Prevention Research’s revised evidence standards for prevention science. In the commentary we noted our disappointment that the new standards did not take on transparency and trust directly. We stated that the field needs to seriously consider practices such as pre-registering studies, pre-specifying analytic plans and sharing data with other evaluators to allow independent analysts to replicate findings. There are multiple registries, including the Open Science Framework, which allows for publicly sharing multiple aspects of project design and analysis; for clinical trials, registries are available from the American Economic Association, the Registry of Clinical Trials on the What Works Clearinghouse, and clinicaltrials.gov.

Issues related to analysis: While pre-registering analysis plans may not be appropriate for every study, failing to adjust for multiple comparisons or to pre-specify primary versus secondary outcome variables does nothing to increase the public’s and policy-makers’ trust in our findings. Another factor in the lack of replication is under-powered studies. A recent article in American Psychologist discusses this aspect and proposes that the field consider statistical techniques such as Bayesian methods.
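To make the multiple-comparisons point concrete, here is a minimal sketch of one common adjustment, the Holm step-down procedure. This is only an illustration: the post does not recommend a specific method, and the function name `holm_adjust` and the p-values are hypothetical, chosen for the example.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Adjusted values must not decrease as the raw p-values increase.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical p-values for four outcomes tested in the same study.
p_values = [0.01, 0.04, 0.03, 0.20]
adjusted = holm_adjust(p_values)

# Outcomes that remain significant at the 0.05 level after adjustment.
significant = [p <= 0.05 for p in adjusted]
print(adjusted, significant)
```

In this made-up example, three of the four outcomes fall below 0.05 before adjustment but only one does afterward, which is the kind of difference a pre-specified analysis plan makes visible to readers.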

Interpretation of findings: My colleague who works in tribal communities emphasizes the importance of having the community’s input in the interpretation of findings. In community-based participatory work, the partnership is embedded from the start and can naturally include this step. In some “high-stakes” policy evaluations, a firewall has been built between the evaluator and the evaluated to protect the independence of the findings.

Get Involved: How can we broaden the conversation to the larger community? In what other ways can we build trust in evaluation findings, and how can we ensure clear guidance on benefiting from participant interpretation while still maintaining trust in the findings?

The American Evaluation Association is celebrating Gov’t Eval TIG Week with our colleagues in the Government Evaluation Topical Interest Group. The contributions all this week to aea365 come from our Gov’t Eval TIG members. Do you have questions, concerns, kudos, or content to extend this aea365 contribution? Please add them in the comments section for this post on the aea365 webpage so that we may enrich our community of practice. Would you like to submit an aea365 Tip? Please send a note of interest to aea365@eval.org. aea365 is sponsored by the American Evaluation Association and provides a Tip-a-Day by and for evaluators.
