My name is Steve Fleming, and I work for the National Center for Educational Achievement, a department of ACT, Inc. whose mission is to help people achieve education and workplace success. I also earned an M.S. in Statistics from the University of Texas at Austin.
I have been thinking a lot lately about how to explain statistical significance. Leaving aside the problem of overemphasizing statistical significance relative to the practical significance of results, my objective for this post is to provide a visual explanation of statistical significance testing and to suggest a display for the statistical significance of results.
Statistical significance testing begins with a null hypothesis, which we typically want to show is not true, and an alternative hypothesis. From sample data, a p-value is generated that summarizes the evidence against the null hypothesis. The p-value is compared to a fixed significance level, α. If the p-value is smaller than the significance level, the null hypothesis is rejected; otherwise we fail to reject it (often loosely described as accepting the null hypothesis).
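As a minimal sketch of that decision rule, the snippet below assumes a two-sided z-test with a standard normal null distribution. The test statistic `z = 2.1` is a made-up value, and the function names are hypothetical, chosen only for illustration.

```python
from statistics import NormalDist


def two_sided_p_value(z):
    """Two-sided p-value for a z statistic, assuming a standard normal null."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


def decide(p_value, alpha=0.05):
    """Compare the p-value to the significance level alpha."""
    return "reject" if p_value < alpha else "do not reject"


z = 2.1  # hypothetical test statistic, not from any real data
p = two_sided_p_value(z)
print(f"p = {p:.4f}")
print("alpha = 0.05:", decide(p, alpha=0.05))
print("alpha = 0.01:", decide(p, alpha=0.01))
```

The same p-value leads to different decisions at the two significance levels, which is why the level should be fixed before looking at the data.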
Hot Tip: What effect does choosing a different significance level have? In the following diagram the combined blue and red regions represent the possible sample data results if the null hypothesis is true. The blue regions show where we would accept the null hypothesis and the red regions where we would reject it. It is clear that smaller levels of α make it less likely that we reject the null hypothesis. In the language of errors, smaller levels of α offer more protection against false positives.
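To put numbers on the regions described here, the sketch below again assumes a two-sided test with a standard normal null distribution. It computes the cutoff beyond which a result falls in the rejection region for a few common significance levels; `critical_value` is a hypothetical helper name.

```python
from statistics import NormalDist


def critical_value(alpha):
    """Cutoff |z| beyond which a two-sided test rejects at level alpha."""
    return NormalDist().inv_cdf(1 - alpha / 2)


for alpha in (0.10, 0.05, 0.01):
    cutoff = critical_value(alpha)
    print(f"alpha = {alpha:.2f}: reject when |z| > {cutoff:.3f}")
```

As the significance level shrinks, the cutoff moves farther from zero, so the rejection region covers less of the null distribution and a false positive becomes less likely.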
Hot Tip: The APA style guide suggests using asterisks next to the sample estimates to indicate the p-value when space does not allow printing the p-value itself. Using increasing intensities of color as an alternative way to indicate the most significant results saves even more space. Consider:
Rad Resource: How do you choose a consistent set of colors of increasing intensity? I have found Color Brewer to be a good source for this information.
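One way to sketch the asterisk and color ideas together is a small lookup from p-value to marker. The thresholds (.05, .01, .001) are the conventional asterisk cutoffs and are an assumption here, as are the hex codes, which are intended to match a three-class sequential red scheme from Color Brewer and should be checked against the site before use. The function name and the sample estimates are made up for illustration.

```python
# (threshold, asterisks, fill color) ordered from most to least significant.
# Thresholds and hex codes are assumptions; adjust to your own conventions.
LEVELS = [
    (0.001, "***", "#de2d26"),  # darkest shade for the smallest p-values
    (0.01, "**", "#fc9272"),
    (0.05, "*", "#fee0d2"),     # lightest shade
]


def significance_marker(p_value):
    """Return (asterisks, hex color) for a p-value; blank and white if not significant."""
    for threshold, stars, color in LEVELS:
        if p_value < threshold:
            return stars, color
    return "", "#ffffff"


# Hypothetical estimates and p-values, for display only
results = [("Group A", 0.42, 0.0004), ("Group B", 0.17, 0.03), ("Group C", 0.05, 0.2)]
for label, estimate, p in results:
    stars, color = significance_marker(p)
    print(f"{label}: {estimate:.2f}{stars}  (cell color {color})")
```

In a table or chart, the color would be applied as the cell background, so the asterisks can be dropped when space is tight.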
What do you think? Does this vision clarify or obfuscate the meaning of statistical significance? I look forward to the discussion online.
Do you have questions, concerns, kudos, or content to extend this aea365 contribution? Please add them in the comments section for this post on the aea365 webpage so that we may enrich our community of practice. Would you like to submit an aea365 Tip? Please send a note of interest to firstname.lastname@example.org. aea365 is sponsored by the American Evaluation Association and provides a Tip-a-Day by and for evaluators.