Judy Savageau and Kathy Muhr on Working with Different Data Sources

Hi. We’re Judy Savageau and Kathy Muhr from the University of Massachusetts Medical School’s Center for Health Policy and Research. Within our Research and Evaluation Unit, we work on a number of projects using qualitative and quantitative methods as well as primary and secondary data sources. We’ve come to appreciate that different types of data from different sources need varying levels of data management and quality oversight.

One of our current projects is evaluating a screening program that requires primary care providers to screen children for potential behavioral health conditions. For a random sample of 4000 children seen for a well child visit during one of two study years, we collected data from both medical records (primary data source: quantitative data plus qualitative chart notes) and administrative/claims data (secondary data source: solely quantitative). Given the nature of the data from the two sources, we implemented different data quality checks for each, as well as cross-checks between them.

Lessons Learned:

  • Claims data comes from the insurance payer and has already gone through the payer’s own internal data cleaning and data management processes. However, much of the patient demographic data is collected at the time of insurance enrollment and is not updated at the time of a clinical visit. Some data elements, especially gender, race, ethnicity and primary language, are often incomplete and stay that way even after numerous clinical encounters. While a provider might ‘know’ this information when seeing a patient, it’s not necessarily updated in administrative datasets.
  • Many practices don’t necessarily collect demographic data in a uniform manner unless they’re required to report on this data. Primary care providers are well connected to their patients’ demographics in terms of needs for interpreters, cultural health beliefs, and age- or gender-specific anticipatory guidance needs. Unfortunately, the medical records often had nearly as much missing data as the administrative claims data did!
  • Cross-checking data between these two sources was an important step for us in this project, as we hypothesized that there might be differences across patient groups in screening children for behavioral health needs. Given the interest in vulnerable populations, assessing potential health service disparities was an important part of this evaluation.
  • While electronic medical records (EMRs) were in use in at least 60% of practices where charts were abstracted, it was no surprise to find that EMRs vary from practice to practice. It was clear that projects such as this one might need to use text-based data within the chart notes to obtain the information needed to assess potential disparities.
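
For readers who want to try a similar cross-check, here is a minimal Python sketch of one way to compare demographic fields between two sources. It is not the code used in our project. The function names, field names, the rule for what counts as "missing", and the three sample records are all assumptions made up for illustration.

```python
from collections import Counter

def is_missing(value):
    # Assumption: None, blank strings, and "unknown" all count as missing.
    return value is None or str(value).strip().lower() in {"", "unknown"}

def cross_check(chart, claims, fields):
    """Compare fields for children present in both sources.

    chart, claims: dicts mapping a child ID to a dict of field values.
    Returns one Counter per field with the tallies 'agree', 'conflict',
    'chart_only', 'claims_only', and 'missing_in_both'.
    """
    summary = {field: Counter() for field in fields}
    for child_id in chart.keys() & claims.keys():
        for field in fields:
            a = chart[child_id].get(field)
            b = claims[child_id].get(field)
            if is_missing(a) and is_missing(b):
                summary[field]["missing_in_both"] += 1
            elif is_missing(a):
                summary[field]["claims_only"] += 1
            elif is_missing(b):
                summary[field]["chart_only"] += 1
            elif str(a).strip().lower() == str(b).strip().lower():
                summary[field]["agree"] += 1
            else:
                summary[field]["conflict"] += 1
    return summary

# Made-up records for illustration only (not study data).
chart = {
    1: {"gender": "F", "language": "Spanish"},
    2: {"gender": "", "language": None},
    3: {"gender": "M", "language": "English"},
}
claims = {
    1: {"gender": "f", "language": ""},
    2: {"gender": "M", "language": None},
    3: {"gender": "F", "language": "english"},
}

result = cross_check(chart, claims, ["gender", "language"])
for field, counts in result.items():
    print(field, dict(counts))
```

A tally like this shows, field by field, where one source can fill gaps in the other ('chart_only', 'claims_only') and where the two disagree ('conflict') and may need a manual look.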

Hot Tip: Although data quality is key, find a balance between budgetary and personnel resources and the time required to cross-check data through multiple sources and/or impute missing data using a variety of techniques.

Do you have questions, concerns, kudos, or content to extend this aea365 contribution? Please add them in the comments section for this post on the aea365 webpage so that we may enrich our community of practice. Would you like to submit an aea365 Tip? Please send a note of interest to aea365@eval.org . aea365 is sponsored by the American Evaluation Association and provides a Tip-a-Day by and for evaluators.
