Towards a “post p < 0.05 era” by Tamara Young

I’m Tamara Young, an associate professor in Educational Evaluation and Policy Analysis at North Carolina State University. I teach evaluation theory and practice in education. Today, I’m going to discuss the American Statistical Association’s (ASA) Statement on p-values, which responds to the decades-old, highly contentious debate about null hypothesis statistical significance testing (NHSST). I also describe what the debate and the ASA’s response mean for the evaluation community.

The Debate

The NHSST process is flawed, and there are widespread “misconceptions and misuse” of NHSST. As Ronald Wasserstein and Nicole Lazar explain in their editorial on the Context, Process, and Purpose of the ASA statement on p-values, NHSST has faced serious critique for decades. In recent years, Tom Siegfried has called attention to the flaws of NHSST, describing the process as “science’s dirtiest secret” and concluding that “statistical techniques for testing hypotheses …have more flaws than Facebook’s privacy policies.” The journal Basic and Applied Social Psychology has even banned NHSST.

Hot Tip: The Current Resolution

In 2016, the American Statistical Association issued a statement delineating six principles (directly quoted below) that should guide the use and interpretation of p-values. The aim is to improve practice and move us into a “post p < 0.05 era”:

  1. “P-values can indicate how incompatible the data are with a specified statistical model.”
  2. “P-values do not measure the probability that the studied hypothesis is true, or the probability that the data were produced by random chance alone.”
  3. “Scientific conclusions and business or policy decisions should not be based only on whether a p-value passes a specific threshold.”
  4. “Proper inference requires full reporting and transparency.”
  5. “A p-value, or statistical significance, does not measure the size of an effect or the importance of a result.”
  6. “By itself, a p-value does not provide a good measure of evidence regarding a model or hypothesis.”
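
To make principle 5 concrete, here is a minimal Python sketch. It is not part of the ASA statement. It assumes a simple two-group z-test with equal group sizes and a known standard deviation of 1, and the function name `two_sided_p` and the numbers used are purely illustrative. The same tiny standardized effect is tested at two sample sizes:

```python
import math

def two_sided_p(effect_size: float, n_per_group: int) -> float:
    """Two-sided p-value for a two-group z-test on a standardized
    mean difference (assumes equal group sizes and a known SD of 1)."""
    # Standard error of the difference is sqrt(2 / n), so z = d * sqrt(n / 2).
    z = effect_size * math.sqrt(n_per_group / 2)
    # erfc keeps precision in the far tail, where 1 - cdf would round to 0.
    return math.erfc(abs(z) / math.sqrt(2))

effect = 0.05  # hypothetical, very small standardized difference

p_small = two_sided_p(effect, n_per_group=100)
p_large = two_sided_p(effect, n_per_group=100_000)

print(f"effect = {effect}, n = 100 per group:     p = {p_small:.3f}")
print(f"effect = {effect}, n = 100,000 per group: p = {p_large:.2e}")
```

Under these assumptions the effect is identical in both runs, yet only the larger sample crosses the conventional 0.05 threshold. The p-value here reflects sample size as much as it reflects the effect, which is why an effect size and its uncertainty should be reported alongside it.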

Implications for the Evaluation Community

Evaluators who use NHSST need to become more familiar with the debate about it and read the ASA’s six guiding principles. Instructors of quantitative methods need to discuss the debate and give students opportunities to reflect critically on the ASA’s principles and apply them to data analysis simulations. Additionally, the evaluation community, especially journal editors, needs to encourage the use of other methods (e.g., Bayesian methods) that can serve as alternatives or supplements to NHSST. Lastly, funders, decision-makers, and evaluators need to consider the ASA principles when designing studies and when interpreting and using results.
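
As one possible classroom simulation for principle 2, the sketch below generates many hypothetical studies and asks what share of the “significant” ones came from interventions with no real effect. Every input is an assumption chosen for illustration (10% of tested interventions truly work, a standardized effect of 0.5, 32 participants per group, a z-test with a 1.96 cutoff), and the function name `share_of_false_alarms` is hypothetical.

```python
import random

def share_of_false_alarms(n_studies=20_000, share_real=0.10,
                          effect_size=0.5, n_per_group=32,
                          z_crit=1.96, seed=1):
    """Simulate z-tests across many studies and return the fraction of
    significant results that came from studies with no real effect."""
    rng = random.Random(seed)
    # Expected z statistic when the effect is real: d * sqrt(n / 2).
    shift = effect_size * (n_per_group / 2) ** 0.5
    significant = 0
    false_alarms = 0
    for _ in range(n_studies):
        effect_is_real = rng.random() < share_real
        # Observed z is normal around the shift (or around 0 under the null).
        z = rng.gauss(shift if effect_is_real else 0.0, 1.0)
        if abs(z) > z_crit:
            significant += 1
            if not effect_is_real:
                false_alarms += 1
    return false_alarms / significant

fdr = share_of_false_alarms()
print(f"Share of significant results with no real effect: {fdr:.2f}")
```

With these assumed inputs, the share of significant findings that are false alarms comes out far above 5%, even though every test used the 0.05 threshold. Students can vary `share_real` and `n_per_group` to see how strongly the answer depends on context that the p-value alone does not capture.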

Rad Resources:

Statistical errors: P values, the ‘gold standard’ of statistical validity, are not as reliable as many scientists assume.

Odds Are, It’s Wrong: Science Fails to Face the Shortcomings of Statistics

The ASA’s Statement on p-Values: Context, Process, and Purpose, which includes the ASA statement, online supplemental materials related to NHSST, and alternatives to NHSST.

Do you have questions, concerns, kudos, or content to extend this aea365 contribution? Please add them in the comments section for this post on the aea365 webpage so that we may enrich our community of practice. Would you like to submit an aea365 Tip? Please send a note of interest. aea365 is sponsored by the American Evaluation Association and provides a Tip-a-Day by and for evaluators.


2 thoughts on “Towards a “post p < 0.05 era” by Tamara Young”

  1. Great post, Tamara! Evaluators need to understand the basics of p-values but also other things like avoiding p-hacking and not doing any other questionable research (evaluation!) practices (aka QRPs).
