
# Four Ways the p-value is Misunderstood and Misapplied.

Science and technology are ubiquitous in our lives. From mass torts to pharmaceuticals to beauty products and the equipment used to read this article, we are surrounded by the results of scientific and technological advancement. Naturally, this reality carries over to the law, which both reflects and seeks to regulate the intricacies of civil society. For instance, cases often feature “battles of the experts,” in which uniquely qualified individuals explain complex concepts to judges and juries. A 2016 study from the Maurer School of Law at Indiana University indicated that between 63% and 86% of cases involved experts, and a 2022 article reviewing expert witness fees estimated that approximately 8 out of 10 trials involve an expert witness. One recurring concept in litigation is the p-value. Often misunderstood and frequently misused, the p-value continues to appear in court decisions.

I. What is the p-value?

The p-value is perhaps one of the most misunderstood concepts in statistics, among lay people and lawyers alike. Let’s begin with its definition: a p-value is the probability of observing data as extreme as (or more extreme than) the data actually collected, assuming there is, in fact, no relationship between the variables. A reductive illustration is the following: a graph shows that per capita cheese consumption rises in proportion to the number of people who died by becoming tangled in their bedsheets. In other words, the graph shows a correlation between the two variables. But whether those two things are actually related to each other (i.e., one is causing the other) is not established; correlation is not causation. The p-value would tell us the probability of the data appearing this way (or being even more strongly correlated) if cheese eating and bedsheet deaths are not, in fact, causally related. In other words, it tells us how probable it is to see this graph if the pattern is just a random coincidence.
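
The idea can be made concrete with a short sketch. The yearly figures below are invented for illustration (they are not the real cheese-consumption or bedsheet-death statistics), and the permutation test shown is just one simple way to estimate a p-value: if the two variables are truly unrelated, any pairing of the two lists is equally likely, so we can shuffle one list many times and ask how often a correlation at least as strong as the observed one appears by chance.

```python
import random
import statistics

def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient between two lists."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical yearly figures (illustrative only, not real data):
cheese = [29.8, 30.1, 30.5, 30.7, 31.3, 31.7, 32.6, 33.1, 33.5, 34.3]
deaths = [327, 456, 509, 497, 596, 573, 661, 741, 809, 717]

observed = pearson_r(cheese, deaths)

# Permutation test: under the null hypothesis (no relationship),
# every re-pairing of the two lists is equally likely. Count how
# often shuffled data is at least as correlated as the real pairing.
random.seed(0)
trials = 10_000
extreme = sum(
    abs(pearson_r(cheese, random.sample(deaths, len(deaths)))) >= abs(observed)
    for _ in range(trials)
)

p_value = extreme / trials  # fraction of purely coincidental results this extreme
print(f"observed r = {observed:.3f}, permutation p-value = {p_value:.4f}")
```

A small p-value here would mean only that a coincidence this strong is rare under the no-relationship assumption; it says nothing about cheese causing bedsheet deaths.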

II. The p-value: rejecting a negative but not proving a positive!

First, in order to understand a p-value, one has to understand the concepts of the null and alternative hypotheses. A null hypothesis assumes that whatever we are trying to test is not, in fact, true. More specifically, the null hypothesis stands for the proposition that there is no statistically significant relationship between two variables. Using the cheese-and-bedsheets vignette discussed above, the null hypothesis would be that there is no statistically significant relationship between cheese consumption and bedsheet deaths. Attorneys, therefore, should take care to question why a particular null hypothesis was chosen, why it was phrased a certain way, and how that choice affected the calculations. Importantly, a null hypothesis is not “the opposite” of what is being tested; rather, it posits the absence of any relationship at all. This is a subtle but critically important distinction.

Second, and conversely, the hypothesis being tested is referred to as the “alternative hypothesis.” In this case, the alternative hypothesis would be that there is a statistically significant relationship between cheese consumption and bedsheet deaths. A common misconception is that the p-value can tell us the direction of the relationship (positive or negative). It cannot. Other tests, beyond the scope of this article, allow one to reach those conclusions. The p-value only allows one to reject, or fail to reject, the null hypothesis. Let’s see how this is done.

Once the null and alternative hypotheses have been crafted, data is collected and analyzed, resulting in a p-value. The p-value is a measure that helps us assess the strength of evidence against a null hypothesis. In other words, it tells us the probability of observing results as extreme as or more extreme than the ones we've obtained, if the null hypothesis is true. Herein lies yet another significant misconception: the p-value does not tell us whether we have proven the alternative hypothesis. Rather, the p-value tells us either to reject the null hypothesis (thereby leaving the alternative hypothesis standing) or to fail to reject the null hypothesis (thereby leaving the null hypothesis standing).

A low p-value (typically less than 0.05) suggests that the observed results are unlikely to have occurred by chance alone, leading us to question the validity of the null hypothesis. In other words, a sufficiently low p-value allows researchers and statisticians to reject the null hypothesis but not prove the alternative hypothesis.

III. A sufficiently low p-value: statistical significance.

Statistical significance is used to decide whether observed results are likely due to a real effect rather than random variation. Specifically, statistical significance is a shorthand way of saying that the p-value is lower than a pre-defined significance level. While that level is often 0.05, that is not always the case, and it varies by field. If the p-value is smaller than the significance level, we consider the results statistically significant. This means that the evidence supports rejecting the null hypothesis in favor of the alternative hypothesis, suggesting that there's a meaningful relationship or effect.
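
As a minimal sketch (the `decide` helper, the 0.05 threshold, and the sample p-values below are all invented for illustration), the significance decision reduces to a single comparison, and notably offers only two outcomes:

```python
ALPHA = 0.05  # pre-defined significance level; the conventional choice, but it varies by field

def decide(p_value: float, alpha: float = ALPHA) -> str:
    """Return the only two decisions a p-value supports.

    Note what is absent: there is no branch that "proves" the
    alternative hypothesis or measures the strength of the effect.
    """
    if p_value < alpha:
        return "statistically significant: reject the null hypothesis"
    return "not statistically significant: fail to reject the null hypothesis"

print(decide(0.01))  # below the threshold
print(decide(0.30))  # above the threshold
```

The binary nature of this rule is the point: everything a p-value licenses is contained in that one `if` statement.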

In essence, these concepts help us make informed decisions about the validity of our hypotheses and the reliability of the observed data by quantifying the probability of obtaining such results under different assumptions.

IV. The p-value does not indicate the “strength” of a relationship.

A common claim is that a p-value in a clinical trial represents the probability that the results establish a cause-and-effect relationship between a drug and a positive health benefit. That claim is not correct. The p-value is a statistical measure that assesses the strength of evidence against a null hypothesis, which states that there is no effect or relationship.

Specifically, a p-value indicates the probability of observing the obtained results (or more extreme results) if the null hypothesis were true. A small p-value (typically less than 0.05) suggests that the observed results are unlikely to have occurred due to random chance alone, which might lead researchers to reject the null hypothesis.

However, establishing a cause-and-effect relationship requires a comprehensive analysis that goes beyond just the p-value. Clinical trials involve various factors such as study design, sample size, control groups, blinding, and more, all of which contribute to determining whether a cause-and-effect relationship can be reasonably concluded. Therefore, while a small p-value can be suggestive of a relationship between a drug and a positive health benefit, it does not solely determine the presence or absence of such a relationship.