RTD TIG Week: Augmenting Expert Opinion with Data-driven Approaches by Ian Hutchins

I’m Ian Hutchins, and I am an Assistant Professor of Data and Information Science at the University of Wisconsin-Madison in the Information School. I use quantitative analysis of information networks to find ways to improve the research enterprise and accelerate biomedical research advances.

Research evaluations, like those conducted for science funding agencies in policymaking, are entering a golden age. Scientific portfolio analysis, the application of scientific methods to inform portfolio management and policy decisions, has leaped forward with the use of modern data science approaches. Public data infrastructure has likewise been built out to support scientific portfolio analysis. This powerful combination of data science, portfolio analysis, and comprehensive public data enables transparent, data-driven decision-making on a scale that was not possible ten years ago.

Comprehensive citation graphs now let us see large-scale knowledge transfer across the entire scientific literature. Combined, public-domain citation data from sources like the Initiative for Open Citations, the National Institutes of Health (NIH) Open Citation Collection, and the Internet Archive’s Refcat are now more comprehensive than commercial data sources like Web of Science or Scopus. Large open datasets enable a variety of measures that can be used to answer policy-relevant questions with data. Measures of scientific influence, like NIH’s Relative Citation Ratio and Digital Science’s Field Citation Ratio, have recently been developed and tested. These can be applied to measure scientific influence across the literature writ large or down to the individual article level. They can be used to compare funder portfolios, or to identify successes from different funding mechanisms that can be scaled up.
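
These article-level measures are straightforward to explore programmatically. Here is a minimal sketch of pulling Relative Citation Ratio values for a few articles; the endpoint, the pmids parameter, and the relative_citation_ratio field reflect the iCite API documentation at https://icite.od.nih.gov/api, but verify against the current documentation before relying on them:

```python
# Rough sketch: fetch Relative Citation Ratio (RCR) values from NIH's iCite
# API. Endpoint and field names are assumptions based on the public docs at
# https://icite.od.nih.gov/api; check them before use.
import requests

pmids = [11111111, 22222222]  # placeholder PubMed IDs; substitute your own

resp = requests.get(
    "https://icite.od.nih.gov/api/pubs",
    params={"pmids": ",".join(str(p) for p in pmids)},
    timeout=30,
)
resp.raise_for_status()

for record in resp.json().get("data", []):
    print(record["pmid"], record.get("relative_citation_ratio"))
```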

Metrics of applied outcomes can be developed as well. For the first time, there is a comprehensive index of every fundamental or applied biomedical research study that informed, and was cited by, published clinical research. There is enough structure in citation dynamics to build measures of present, and even future, knowledge transfer into clinical outcomes. Advances in natural language processing have enabled the partitioning and semantic comparison of whole topics of science, which can be used to detect (mis)alignment between the literature and the legacy organizational structure of funding agencies. The capacity to include both retrospective and prospective analytics in evaluation has emerged directly from the collision of data science with big (open) data.
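
To make the semantic-comparison idea concrete, consider the deliberately simplified sketch below, which compares a literature topic to a funded portfolio using TF-IDF centroids and cosine similarity. The texts are invented for illustration; published work in this area uses full abstracts and far richer language models and clustering:

```python
# Simplified illustration of semantic (mis)alignment detection: represent
# each corpus as the centroid of its TF-IDF vectors, then compare centroids
# with cosine similarity. Texts are invented placeholders.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

literature_topic = [
    "single-cell RNA sequencing reveals tumor microenvironment heterogeneity",
    "transcriptomic profiling of immune cells in solid tumors",
]
funded_portfolio = [
    "bulk gene expression profiling of cancer tissue samples",
    "biomarker discovery from tumor transcriptomes",
]

vectorizer = TfidfVectorizer(stop_words="english")
X = vectorizer.fit_transform(literature_topic + funded_portfolio)

# Average each group's TF-IDF vectors into a single centroid.
n = len(literature_topic)
topic_centroid = np.asarray(X[:n].mean(axis=0))
portfolio_centroid = np.asarray(X[n:].mean(axis=0))

# Cosine similarity near 1 suggests alignment; near 0 suggests divergence.
alignment = cosine_similarity(topic_centroid, portfolio_centroid)[0, 0]
print(f"semantic alignment: {alignment:.2f}")
```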

These advances in using data analytics to inform decision-making have augmented traditional expert-opinion-based decision systems in important ways. One limitation of opinion-based evaluation is that it may rely exclusively on a sparse sample of the information available. Nearly two million articles are published annually in biomedicine alone, making it impossible for any expert to read more than a small fraction of the literature. Comprehensive data, by contrast, facilitate comparative analysis of a global portfolio in evaluation. One example is the analysis of high-risk, high-reward science portfolios relative to standard funding mechanisms, yielding insights about comparative strengths and weaknesses that would not be visible without this kind of global analysis. Another strength of modern analytical approaches is that they can merge heterogeneous data to answer questions about the present and future scientific workforce under different policy scenarios. One such study found that a recent policy change was sufficient to stabilize career-stage trends, but not to reverse prior effects. Finally, global analysis of the risk of project failure identified factors associated with awards that produced no published research outputs whatsoever. This, alongside measures of scientific influence, has informed a data-driven decision-making process that merges analysis, evaluation, and expert opinion to reshape review groups at NIH. In a sense, newly acquired analytical approaches are beginning to reshape the research enterprise.
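
As a stylized example of the portfolio-comparison logic, a rank-based test can ask whether one portfolio's influence distribution differs from another's. The RCR values below are invented; a real analysis would draw them from a source like iCite for every article in each portfolio:

```python
# Stylized comparison of article-level influence across two portfolios.
# The RCR values are invented placeholders, not real data.
from scipy.stats import mannwhitneyu

portfolio_a_rcr = [0.4, 1.1, 2.3, 0.9, 1.7, 3.2]
portfolio_b_rcr = [0.3, 0.8, 1.0, 0.6, 1.2, 0.7]

# Rank-based test: does one portfolio tend to produce more influential
# articles? (RCR is benchmarked so that 1.0 is the NIH-funded median.)
stat, p = mannwhitneyu(portfolio_a_rcr, portfolio_b_rcr, alternative="two-sided")
print(f"Mann-Whitney U = {stat:.1f}, p = {p:.3f}")
```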


The American Evaluation Association is hosting Research, Technology and Development (RTD) TIG Week with our colleagues in the Research, Technology and Development Topical Interest Group. The contributions all this week to AEA365 come from our RTD TIG members. Do you have questions, concerns, kudos, or content to extend this AEA365 contribution? Please add them in the comments section for this post on the AEA365 webpage so that we may enrich our community of practice. Would you like to submit an AEA365 Tip? Please send a note of interest to AEA365@eval.org. AEA365 is sponsored by the American Evaluation Association and provides a Tip-a-Day by and for evaluators. The views and opinions expressed on the AEA365 blog are solely those of the original authors and other contributors. These views and opinions do not necessarily represent those of the American Evaluation Association, and/or any/all contributors to this site.
