Our names are Lindsey Dunn and Lauren Fluegge and we are PhD. students at The University of North Carolina at Greensboro. Recently, we have been involved with a project in which the data come from many different sources and often include hundreds of thousands of cases. We used primarily three programs to organize and merge the data (Microsoft Excel 2007, Microsoft Access 2007, and PASW 18), and include here pros and cons of each program and information about the advantages of the statistical program R.
- Sorts/filters data efficiently
- Summarizes data with Microsoft PivotTable, which automatically sorts, counts, totals or averages the data
- Analyses excluded from the formula repertoire and results formatting must be entered by hand
- Has “behind the scenes” syntax out of the user’s control which makes it difficult to reproduce procedures on different data
Hot Tip: The text to column function of Excel is useful to reformat identifiers, such as names, to match between two data files you want to merge
- Relational nature of databases allows for flexible linking of data files (e.g. the user can match multiple data files by variables with different names; a smaller subset of variables of interest can be selected from the original data files to be in a newly merged data file)
- Access cannot analyze data
- Merging can be slow or freeze the program if done with large data files
- Syntax is out of the user’s control
Hot tip: Choose “Export” to move any data files to Excel for formatting and simple calculations
PASW 18 (SPSS):
- Can accomplish a plethora of analyses, decent graphics, and easy to read output
- Efficiently identifies duplicates, sorts, and filters data
- Can be controlled by user-written syntax
- Inflexible merging procedures
- Error messages are not always clear
Hot Tip: With SPSS, can now open files without a .sav with the “Open” command and retain the field names (rather than using the “New Query” command)
Hottest Tip: Use R statistical program
- R is more efficient than other programs because it allows you to do everything in one place!
- In R, you have control over what and how you merge (e.g. can merge multiple data files at once by several variables)
- You can manage and analyze data as effectively as the other programs in the same program
- R is free and once you have written a merging protocol it can be reused with other data files!
Useful Rad resources are listed below:
Do you have questions, concerns, kudos, or content to extend this aea365 contribution? Please add them in the comments section for this post on the aea365 webpage so that we may enrich our community of practice. Would you like to submit an aea365 Tip? Please send a note of interest to firstname.lastname@example.org. aea365 is sponsored by the American Evaluation Association and provides a Tip-a-Day by and for evaluators.