Tech TIG Week: The Evaluation Time Machine: Charting the Course of Evaluation Intelligence by Zach Tilton

Greetings, colleagues. I’m Zach Tilton, Evaluation Specialist at The MERL Tech Initiative and Sandbox Working Group Co-lead for the MERL Tech Initiative Natural Language Processing Community of Practice (NLP-CoP). Today, I’d like to take you on a journey through the evolution of our field and introduce a concept I call “Evaluation Intelligence.”

Imagine, if you will, a time machine. Not the shiny, chrome variety from sci-fi films, but one built from logic models, survey instruments, and a dash of evaluative thinking. As we embark on this journey, our first stop is 1950, where we encounter a young Michael Scriven writing about artificial intelligence:

“Perhaps the day will yet come when we, having promoted ourselves to the leading role by discovering there is no one above us, will find ourselves in the role of the magician, the possessor of mysterious powers, and snapping at our heels will be the machines.”

Fast forward to today. Scriven’s “machines” are now sophisticated AI systems working alongside us. This brings us to Evaluation Intelligence, a deliberate approach to integrating AI capabilities with evaluative expertise, reasoning, and thinking. It encompasses:

Thoughtful application of AI in evaluation processes
New competencies for evaluators in the AI era
Frameworks for assessing AI-assisted evaluations
Exploration of AI’s potential to enhance evaluation methodologies
Critical examination of AI’s impact on evaluation practice and society

Importantly, Evaluation Intelligence aligns closely with the goals of the NLP-CoP, which aims to democratize understanding of NLP and Generative AI, develop ethical tools and guidance, and influence responsible practices in the MERL Tech sector.

Hot Tips

Cultivate Critical AI Literacy. To effectively leverage Evaluation Intelligence, invest time in understanding AI basics. This knowledge will help you identify both opportunities for AI integration and potential pitfalls. Remember to interrogate the technology as you integrate it into your evaluation practice. The MERL Tech Initiative NLP-CoP offers a wealth of working groups, resources, and webinars for evaluators looking to incorporate AI into their practice.
Crowdsource and Learn in Public. Contribute to digital public goods and learn from colleagues about AI tools for evaluation. The community is building shared resources, such as a crowdsourced spreadsheet of AI tools. By openly sharing experiences, we learn faster together. NLP-CoP working groups explore such approaches in various contexts.

Rad Resource

United Nations Population Fund (UNFPA) GenAI-Powered Evaluation Strategy. The UNFPA Independent Evaluation Office has developed a comprehensive strategy for the responsible and ethical use of Generative AI (GenAI) in its evaluation function. This pioneering framework covers key aspects from goals to implementation roadmaps.

Lesson Learned

Balance is Key. While AI can enhance capabilities, evaluation fundamentally relies on human judgment. Strive for a balance where AI augments, rather than replaces, evaluator expertise. This echoes the NLP-CoP’s commitment to responsible AI applications in the MERL Tech sector.

As we conclude our time-traveling expedition, we find ourselves not at an endpoint, but at a new beginning. Evaluation Intelligence represents an opportunity to elevate our practice, but it requires intentional development and collaboration between evaluators and AI specialists. It calls for us to:

Embrace value-conscious technology development, recognizing the non-neutrality of AI and evaluation systems.
Foster transdisciplinary collaboration between evaluators, AI developers, and policymakers.
Develop adaptive evaluation frameworks for rapidly evolving AI technologies.
Ensure ethical alignment between AI systems and human values in evaluation contexts.

The future of evaluation is unfolding before us. By embracing Evaluation Intelligence and engaging with communities like the NLP-CoP, we can ensure that our field remains relevant, rigorous, and responsive to the complexities of an increasingly digital world. Together, we can shape a future where AI enhances our ability to understand, assess, and improve the programs and policies that impact people’s lives.

The American Evaluation Association is hosting Integrating Technology into Evaluation TIG Week with our colleagues in the Integrating Technology into Evaluation Topical Interest Group. The contributions all this week to AEA365 come from ITE TIG members. Do you have questions, concerns, kudos, or content to extend this AEA365 contribution? Please add them in the comments section for this post on the AEA365 webpage so that we may enrich our community of practice. Would you like to submit an AEA365 Tip? Please send a note of interest to AEA365@eval.org. AEA365 is sponsored by the American Evaluation Association and provides a Tip-a-Day by and for evaluators. The views and opinions expressed on the AEA365 blog are solely those of the original authors and other contributors. These views and opinions do not necessarily represent those of the American Evaluation Association, and/or any/all contributors to this site.
