When researchers find an association between a chemical and certain health effects, they then try to determine whether the finding is statistically significant. Statistical significance is an attempt to quantify the probability that the research findings were merely accidental rather than the result of a real relationship between two variables in the study. It is usually expressed as a decimal, and sometimes as a percent, and is referred to as the “p” value, with “p” standing for “probability.”
A “p” value of 1 percent, or p=0.01, suggests a 1 percent probability that the results occurred by chance, meaning the researcher is 99 percent confident in the results (a confidence level of 99 percent). Likewise, p=0.05 suggests a 5 percent probability that the results are due to mere chance and 95 percent confidence that they are correct, and so on. Studies with a p value less than 0.05 are usually considered significant, while those above that mark are not.
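To make the convention concrete, here is a minimal sketch in Python of how a p value is computed in practice. It uses SciPy’s standard two-sample t-test; the two groups and their values are hypothetical, invented purely for illustration, not taken from any real study.

```python
# A minimal sketch of how a p-value is computed in practice, using
# SciPy's two-sample t-test on made-up illustrative data (the numbers
# below are hypothetical, not from any real study).
from scipy import stats

exposed   = [4.1, 5.2, 4.8, 5.9, 5.5, 4.7, 5.3, 5.0]  # a health marker in an exposed group
unexposed = [4.0, 4.5, 4.2, 4.9, 4.4, 4.6, 4.3, 4.8]  # the same marker in a comparison group

t_stat, p_value = stats.ttest_ind(exposed, unexposed)
print(f"p = {p_value:.3f}")

# Under the usual convention, p < 0.05 is labeled "statistically
# significant" and p >= 0.05 is not.
print("significant at 0.05" if p_value < 0.05 else "not significant at 0.05")
```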
The fact that a study’s findings are statistically significant does not by itself establish a cause-and-effect relationship. Even statistically significant findings may result from mere chance, from a failure to control for confounding factors, from researcher bias, or from many other causes.
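One way to see how chance alone produces “significant” findings is to simulate it. The sketch below (a hypothetical setup using NumPy and SciPy) runs many studies in which there is truly no effect and counts how often the standard test still reports p < 0.05; the answer hovers around 5 percent, exactly the false-positive rate the 0.05 convention tolerates.

```python
# Simulate many studies in which there is truly no effect (both groups
# drawn from the same distribution) and count how often the t-test
# still reports p < 0.05. The expected share is about 5 percent.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_studies, n_per_group = 10_000, 30
false_positives = 0

for _ in range(n_studies):
    a = rng.normal(0, 1, n_per_group)  # no real difference between groups
    b = rng.normal(0, 1, n_per_group)
    if stats.ttest_ind(a, b).pvalue < 0.05:
        false_positives += 1

print(f"'significant' null results: {false_positives / n_studies:.1%}")  # ~5%
```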
In fact, some researchers point out that small changes in the p value can easily move a study from significant to insignificant.(1) One study title highlights an ironic conclusion: “The Difference Between ‘Significant’ and ‘Not Significant’ is not Itself Statistically Significant.”(2) Indeed, the 0.05 cutoff is itself arbitrary and prone to abuse. A study of psychology research papers, for example, showed that a disproportionately large share reported p values just below the 0.05 cutoff for significance, indicating that researchers regularly work the data to push their findings into the significance category, thus increasing their chances of publication in a peer-reviewed journal.(3)
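The point of the study cited in note 2 can be illustrated with a short calculation. In the hypothetical numbers below (effect estimates and standard errors invented for illustration), study A is “significant” and study B is not, yet the difference between the two studies is itself nowhere near significant.

```python
# A numeric sketch of the point cited in note 2: one study can be
# "significant" and another "not significant" even though the two
# studies do not differ significantly from each other. The effect
# sizes and standard errors here are hypothetical.
import math
from scipy import stats

def z_and_p(effect, se):
    z = effect / se
    p = 2 * stats.norm.sf(abs(z))  # two-sided p-value from a z-score
    return z, p

effect_a, se_a = 25.0, 10.0   # study A: p ~ 0.01, "significant"
effect_b, se_b = 10.0, 10.0   # study B: p ~ 0.32, "not significant"

for label, (e, se) in {"A": (effect_a, se_a), "B": (effect_b, se_b)}.items():
    _, p = z_and_p(e, se)
    print(f"study {label}: p = {p:.2f}")

# The difference between the two estimates is itself not significant:
diff = effect_a - effect_b
se_diff = math.sqrt(se_a**2 + se_b**2)  # standard errors add in quadrature
_, p = z_and_p(diff, se_diff)
print(f"A minus B: p = {p:.2f}")  # ~0.29, well above 0.05
```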
Philosophy professor Mark Battersby of Capilano University in British Columbia explains how using the term “significant” can confuse some people about the importance of study results:
It is unfortunate that statisticians chose such a loaded term as “significant” to describe what is merely a probabilistic judgment that a difference between the two sample groups is unlikely to be the result of chance. Many results that are statistically significant simply aren’t significant or important in any ordinary sense. And sometimes the lack of statistical significance is more medically or humanly important.(4)
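Battersby’s distinction can also be shown numerically. In the sketch below (a simulated example, not real data), the true difference between groups is a trivial 0.05 units, yet with 50,000 subjects per group the result comes out overwhelmingly “statistically significant” while remaining unimportant in any ordinary sense.

```python
# With a large enough sample, even a trivially small difference becomes
# "statistically significant". The simulated effect here (a 0.05-unit
# shift) is hypothetical and chosen to be too small to matter in any
# practical sense.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 50_000  # very large sample per group
a = rng.normal(0.00, 1, n)
b = rng.normal(0.05, 1, n)  # real but negligible difference

result = stats.ttest_ind(a, b)
print(f"mean difference: {b.mean() - a.mean():.3f}")  # tiny
print(f"p = {result.pvalue:.4f}")                     # likely far below 0.05
```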
(1) Ronald P. Carver, “The Case Against Statistical Significance Testing,” Harvard Educational Review 48, no. 3 (1978): 378-399.
(2) Andrew Gelman and Hal Stern, “The Difference Between ‘Significant’ and ‘Not Significant’ is not Itself Statistically Significant,” The American Statistician 60, no. 4 (November 2006): 328-331.
(3) E. J. Masicampo and D. R. Lalande, “A Peculiar Prevalence of p Values Just Below .05,” Quarterly Journal of Experimental Psychology 65, no. 11 (August 2, 2012): 2271-2279.
(4) Mark Battersby, Is That a Fact?: A Field Guide to Statistical and Scientific Information, revised edition (Peterborough, ON: Broadview Press, 2010).