Guili Zhang on Longitudinal Data Management and Analysis

My name is Guili Zhang. I am an Assistant Professor of Research and Evaluation Methodology at East Carolina University. Over the last ten years, I have evaluated the National Science Foundation’s SUCCEED program and developed and analyzed the SUCCEED longitudinal database, which includes data from nine universities and spans 20 years. Our research team’s publications based on this database have received two Best Paper Awards, from the American Society for Engineering Education and the Frontiers in Education Conference. Today I’d like to share some information about longitudinal data management and analysis.

Lessons Learned: There are two very different organizations for longitudinal data—the “person-level” format and the “person-period” format. A person-level data set, also known as the multivariate format, has as many records as there are people in the sample. As additional waves of data are collected, the file gains new variables, not new cases. A person-period data set, also known as the univariate format, has multiple records for each person—one for each person-period combination. As additional waves of data are collected, the file gains new records, but not new variables.
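To make the two layouts concrete, here is a minimal sketch in plain Python. The data, the variable names (id, wave, score, score_w1, and so on), and the helper functions are all hypothetical and chosen only for illustration; they are not taken from the SUCCEED database.

```python
# Hypothetical data: test scores for two people measured at three waves.

# Person-level ("wide") format: one record per person, one variable per wave.
person_level = [
    {"id": 1, "score_w1": 50, "score_w2": 55, "score_w3": 60},
    {"id": 2, "score_w1": 42, "score_w2": 47, "score_w3": 49},
]

# Person-period ("long") format: one record per person-wave combination.
person_period = [
    {"id": 1, "wave": 1, "score": 50},
    {"id": 1, "wave": 2, "score": 55},
    {"id": 1, "wave": 3, "score": 60},
    {"id": 2, "wave": 1, "score": 42},
    {"id": 2, "wave": 2, "score": 47},
    {"id": 2, "wave": 3, "score": 49},
]


def shape(records):
    """Return (number of records, number of variables per record)."""
    return len(records), len(records[0])


def add_wave_person_level(records, wave, scores_by_id):
    """A new wave adds one variable to every existing record."""
    return [{**rec, f"score_w{wave}": scores_by_id[rec["id"]]} for rec in records]


def add_wave_person_period(records, wave, scores_by_id):
    """A new wave adds one new record per person."""
    new = [{"id": pid, "wave": wave, "score": s} for pid, s in scores_by_id.items()]
    return records + new


wave4 = {1: 66, 2: 52}  # hypothetical fourth-wave scores
print(shape(person_level), shape(add_wave_person_level(person_level, 4, wave4)))
# (2, 4) (2, 5)  -> same number of records, one more variable
print(shape(person_period), shape(add_wave_person_period(person_period, 4, wave4)))
# (6, 3) (8, 3)  -> two more records, same number of variables
```

The record and variable counts show the difference described above: the person-level file grows wider with each wave, while the person-period file grows longer.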

Besides the derived variable approach to longitudinal data analysis, which reduces the repeated measurements to a single summary variable, there are two classical approaches: ANOVA and MANOVA. Both are well-understood methodologies, and software for them is widely available. Unfortunately, both have limited use in longitudinal data analysis because of their restrictive and often unrealistic assumptions and because missing data affect the statistical properties of their estimates. Several alternative approaches now overcome the limitations of the traditional approaches. They are variously known as the mixed-effects regression model, the covariance pattern model, the generalized estimating equations model, the individual growth model, the multilevel model, the hierarchical linear model, the random regression model, survival analysis, event history analysis, failure time analysis, and the hazard model.
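As an illustration of the derived variable approach mentioned above, the sketch below reduces each person's repeated measurements to one summary variable, here an ordinary least-squares slope of score on wave. The choice of the slope as the summary, the data, and the function names are assumptions made for this example; other summaries (a mean, a change score) would fit the same pattern.

```python
from collections import defaultdict


def ols_slope(times, values):
    """Least-squares slope of values regressed on times."""
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, values))
    return sxy / sxx


def derive_slopes(person_period):
    """Collapse person-period records into one slope per person."""
    by_person = defaultdict(list)
    for rec in person_period:
        by_person[rec["id"]].append((rec["wave"], rec["score"]))
    slopes = {}
    for pid, observations in by_person.items():
        waves, scores = zip(*sorted(observations))
        slopes[pid] = ols_slope(waves, scores)
    return slopes


# Hypothetical person-period data: two people, three waves each.
records = [
    {"id": 1, "wave": 1, "score": 50},
    {"id": 1, "wave": 2, "score": 55},
    {"id": 1, "wave": 3, "score": 60},
    {"id": 2, "wave": 1, "score": 42},
    {"id": 2, "wave": 2, "score": 47},
    {"id": 2, "wave": 3, "score": 49},
]

print(derive_slopes(records))  # {1: 5.0, 2: 3.5}
```

The resulting one-value-per-person data set can then be analyzed with ordinary cross-sectional methods, at the cost of discarding the within-person detail that the alternative models retain.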

Hot Tip #1 – The person-period format most naturally supports meaningful analysis of change over time.

Hot Tip #2 – Most statistical software packages can convert a longitudinal data set from one format to the other. For example, in SAS, Singer (1998, 2001) provides simple code for the conversion; in Stata, the “reshape” command can be used.
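For readers without SAS or Stata, the conversion logic itself is short. The following is a minimal sketch in plain Python, not the SAS code or the Stata command mentioned above; the function names, the "score_w" prefix convention, and the data are hypothetical, and it assumes every person has a value at every wave.

```python
def to_person_period(wide, id_var="id", prefix="score_w", value_var="score"):
    """Person-level -> person-period: one output record per wave variable."""
    long = []
    for rec in wide:
        for name, value in rec.items():
            if name.startswith(prefix):
                wave = int(name[len(prefix):])  # "score_w3" -> 3
                long.append({id_var: rec[id_var], "wave": wave, value_var: value})
    return sorted(long, key=lambda r: (r[id_var], r["wave"]))


def to_person_level(long, id_var="id", prefix="score_w", value_var="score"):
    """Person-period -> person-level: one output record per person."""
    people = {}
    for rec in long:
        person = people.setdefault(rec[id_var], {id_var: rec[id_var]})
        person[f"{prefix}{rec['wave']}"] = rec[value_var]
    return [people[pid] for pid in sorted(people)]


# Hypothetical person-level data: two people, three waves.
wide = [
    {"id": 1, "score_w1": 50, "score_w2": 55, "score_w3": 60},
    {"id": 2, "score_w1": 42, "score_w2": 47, "score_w3": 49},
]

long = to_person_period(wide)
print(len(long))                      # 6
print(long[0])                        # {'id': 1, 'wave': 1, 'score': 50}
print(to_person_level(long) == wide)  # True
```

The round trip at the end is a useful sanity check after any reshape: converting to the other format and back should reproduce the original records.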

Rad Resources:

Two introductory books that I have found useful are:

Do you have questions, concerns, kudos, or content to extend this aea365 contribution? Please add them in the comments section for this post on the aea365 webpage so that we may enrich our community of practice. Would you like to submit an aea365 Tip? Please send a note of interest to aea365@eval.org. aea365 is sponsored by the American Evaluation Association and provides a Tip-a-Day by and for evaluators.

