PD Presenters Week: Jennifer Ann Morrow on What to Do With “Dirty” Data – Steps For Getting Evaluation Data Clean and Useable

I’m Jennifer Ann Morrow, a faculty member in Evaluation, Statistics, and Measurement at the University of Tennessee. I created a 12-step process that evaluators can follow to ensure their data is clean before conducting analyses.

Hot Tip: Evaluators should follow these 12 steps prior to conducting analyses for evaluation reports:

1. Create a data codebook

a. Datafile names, variable names and labels, value labels, citations for instrument sources, and a project diary

2. Create a data analysis plan

a. General instructions, list of datasets, evaluation questions, variables used, and specific analyses and visuals for each evaluation question

3. Perform initial frequencies – Round 1

a. Conduct frequency analyses on every variable

4. Check for coding mistakes

a. Use the frequencies from Step 3 to compare all values with what is in your codebook. Double check to make sure you have specified missing values

5. Modify and create variables

a. Reverse code any variables that need it (e.g., flip a 1-to-5 scale so that 1 becomes 5 and 5 becomes 1), recode any variable values to match your codebook, and create any new variables (e.g., total score) that you will use in future analyses

6. Frequencies and descriptives – Round 2

a. Rerun frequencies on every variable and conduct descriptives (e.g., mean, standard deviation, skewness, kurtosis) on every continuous variable

7. Search for outliers

a. Define what an outlying score is and then decide whether or not to delete, transform, or modify outliers

8. Assess for normality

a. Check to ensure that your values for skewness and kurtosis are not too high and then decide on whether or not to transform your variable, use a non-parametric equivalent, or modify your alpha level for your analysis

9. Deal with missing data

a. Check for patterns of missing data and then decide if you are going to delete cases/variables or estimate missing data

10. Examine cell sample size

a. Check whether sample sizes are equal across the levels of each grouping variable

11. Frequencies and descriptives – The finale

a. Run your final versions of frequencies and descriptives

12. Assumption testing

a. Conduct the appropriate assumption analyses based on the specific inferential statistics that you will be conducting.
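
To make a few of these steps concrete, here is a minimal Python sketch of Steps 3 through 7 and Step 9. It is an illustration only, not the workflow taught in the workshop, and the steps work just as well in any statistics package. The dataset, the variable names, the `-99` missing-value code, and the z-score cutoff are all made-up assumptions; substitute whatever your own codebook specifies.

```python
from collections import Counter
from statistics import mean, pstdev

MISSING = -99  # hypothetical missing-value code; use the one in your codebook

# Hypothetical codebook: each variable maps to its set of valid values.
CODEBOOK = {
    "satisfaction": {1, 2, 3, 4, 5},
    "stress": {1, 2, 3, 4, 5},  # negatively worded item, a candidate for reverse coding
}

# Made-up responses for illustration only.
ROWS = [
    {"id": 1, "satisfaction": 4, "stress": 2},
    {"id": 2, "satisfaction": 5, "stress": 1},
    {"id": 3, "satisfaction": 55, "stress": 3},  # data-entry typo
    {"id": 4, "satisfaction": MISSING, "stress": 5},
    {"id": 5, "satisfaction": 3, "stress": 4},
]


def frequencies(rows, var):
    """Steps 3 and 6: count how often each value of `var` occurs."""
    return Counter(row[var] for row in rows)


def coding_mistakes(rows, codebook):
    """Step 4: list values that are neither valid codes nor the missing code."""
    return {
        var: sorted(v for v in frequencies(rows, var) if v != MISSING and v not in valid)
        for var, valid in codebook.items()
    }


def reverse_code(value, low=1, high=5):
    """Step 5: flip a score on a low-to-high scale, leaving the missing code alone."""
    return value if value == MISSING else low + high - value


def describe(values):
    """Step 6: mean, SD, skewness, and excess kurtosis from population moments."""
    m, sd = mean(values), pstdev(values)
    z = [(v - m) / sd for v in values]
    return {
        "mean": m,
        "sd": sd,
        "skewness": mean([x ** 3 for x in z]),
        "kurtosis": mean([x ** 4 for x in z]) - 3,
    }


def flag_outliers(values, z_cutoff=3.0):
    """Step 7: return values whose absolute z-score exceeds `z_cutoff` (an assumed rule)."""
    m, sd = mean(values), pstdev(values)
    return [v for v in values if abs(v - m) / sd > z_cutoff]


def count_missing(rows, var):
    """Step 9: count cases where `var` holds the missing-value code."""
    return sum(1 for row in rows if row[var] == MISSING)


print(frequencies(ROWS, "satisfaction"))
print(coding_mistakes(ROWS, CODEBOOK))
print(count_missing(ROWS, "satisfaction"))
```

The z-score rule is only one possible definition of an outlying score. Step 7 asks you to settle on your own definition first and then decide how to handle the cases it flags. Note also that `describe` and `flag_outliers` expect values with missing codes and coding mistakes already removed.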

Lesson Learned: One statistics course is not enough. Utilize all the great resources that AEA offers to gain additional training in data analysis.

Rad Resources:

Want to learn more from Jennifer? Register for her upcoming AEA eStudy: The twelve steps of data cleaning: Strategies for dealing with dirty data and her workshop Twelve Steps of Data Cleaning: Strategies for Dealing with Dirty Evaluation Data at Evaluation 2013 in Washington, DC.

This week, we’re featuring posts by people who will be presenting Professional Development workshops at Evaluation 2013 in Washington, DC. Click here for a complete listing of Professional Development workshops offered at Evaluation 2013. Do you have questions, concerns, kudos, or content to extend this aea365 contribution? Please add them in the comments section for this post on the aea365 webpage so that we may enrich our community of practice. Would you like to submit an aea365 Tip? Please send a note of interest to aea365@eval.org. aea365 is sponsored by the American Evaluation Association and provides a Tip-a-Day by and for evaluators.

