Hi, I am Timothy S. Brophy, professor of music education and Director of Institutional Assessment at the University of Florida in Gainesville, where I manage assessment for more than 500 assessment units across the university. As a music educator and an evaluator, I am interested in exploring the fundamental measurement issues related to the assessment of arts performances and products.
Arts performances and products result from a complex combination of skills, techniques, and knowledge. Artistic performance or product creation is multidimensional, and assessment results vary with each individual’s perception of the dimension being measured. As a result, arts performance and product assessments are largely criterion-referenced and are measured using analytic or holistic rubrics. One constraint in analyzing the difficulty, discrimination, validity, and reliability of measures of arts performance and product achievement is that the theoretical model most often used for these analyses, Classical Test Theory (CTT), has limitations that preclude its usefulness for this purpose.
Recent work in Item Response Theory (IRT) has expanded statistical models to account for multidimensionality and hierarchical data in criterion-referenced assessments. Whereas CTT is designed for tests made up of items with a single correct response, IRT models focus on the performance of individual items. As the field of psychometrics extends IRT to increasingly complex data sets, such as those obtained during performance and product assessments, there is a real possibility that arts data could someday be analyzed using these new models. That would provide the profession with rigorous estimates of the difficulty, discrimination, reliability, and validity characteristics of performance and product assessments at a level of precision previously unavailable.
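To make the item-level focus of IRT more concrete, here is a minimal sketch of how one rubric criterion might be modeled. It uses the Rasch partial credit model, chosen here only as a simple, well-known polytomous IRT model; the post does not name a specific model, and the multidimensional and hierarchical models it refers to are more elaborate. The function names, the criterion name, and the step difficulty values are hypothetical and are not estimated from any real arts data.

```python
import math

def pcm_probabilities(theta, thresholds):
    """Probability of each rubric score (0..len(thresholds)) under the
    Rasch partial credit model, for a performer with ability `theta`.

    `thresholds` holds one step difficulty per transition between
    adjacent rubric levels (hypothetical values in this sketch).
    """
    # Running sums of (theta - step difficulty); score 0 has an empty sum of 0.
    cumulative = [0.0]
    for delta in thresholds:
        cumulative.append(cumulative[-1] + (theta - delta))
    # Subtract the largest value before exponentiating for numerical stability.
    peak = max(cumulative)
    weights = [math.exp(c - peak) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]

def expected_score(theta, thresholds):
    """Model-implied average rubric score at ability `theta`."""
    probs = pcm_probabilities(theta, thresholds)
    return sum(score * p for score, p in enumerate(probs))

# Hypothetical step difficulties for one criterion scored 0-3 on a rubric.
tone_quality_steps = [-1.0, 0.0, 1.5]

for theta in (-2.0, 0.0, 2.0):
    probs = pcm_probabilities(theta, tone_quality_steps)
    print(theta,
          [round(p, 3) for p in probs],
          round(expected_score(theta, tone_quality_steps), 3))
```

In this sketch each rubric criterion has its own step difficulties, so the model describes how likely each score level is at a given ability rather than summarizing performance with a single total score. In practice the step difficulties and abilities would be estimated from rating data with dedicated psychometric software, not set by hand.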
Music education assessment researchers are still in the early stages of developing and applying rigorous measurement and analysis procedures to arts performances and products. To date, no studies have used IRT techniques to analyze arts performance and product measurement data. I am happy to team up with interested evaluators to study IRT applications to arts assessment data analysis.
In addition to AEA, a number of organizations are potential partners for exploring this issue:
- The International Society for Music Education;
- The National Association for Music Education; and
- The International Symposia on Assessment in Music Education.
The following resources offer a starting point on IRT:
- Baker’s The Basics of Item Response Theory is a classic book on IRT and a great place to start.
- Janssen et al.’s (2000) article describes the extension of IRT modeling to the analysis of criterion-referenced assessments.
- Reckase’s Multidimensional Item Response Theory describes the commonly used MIRT models and their practical application.
The American Evaluation Association is celebrating Arts, Culture, and Audiences (ACA) TIG Week. The contributions all week come from ACA members. aea365 is sponsored by the American Evaluation Association and provides a Tip-a-Day by and for evaluators.