Statistical proof
Statistical proof is a rational demonstration of how likely a claim is, based on data from a statistical test. It serves two purposes: to persuade an audience and to make the reasoning behind a claim understandable, by reference to the hypothesis, the data, the test applied, and the probabilities involved.
The burden of proof rests on clearly applying the statistical method, stating the assumptions, and showing how the test relates to the real world. Statisticians differ in how they draw inferences from data: some follow Bayesian reasoning (updating prior beliefs with evidence), others rely on the likelihood of the data under a hypothesis, and there are broader disagreements between philosophies of science. These views shape how statistical proof is interpreted.
A common scientific standard for judging claims is the hypothetico-deductive approach: a claim is tested, and attempts are made to falsify it. Other modes of inference include inductive and abductive reasoning. Scientists do not seek absolute certainty; they test ideas in order to falsify or corroborate them, learning from error rather than claiming perfect truth.
Statistical proof also matters in law, where it can affect the weight given to evidence and determine when the burden of proof shifts.
Two kinds of axioms matter in probability. Some are conventions treated as true for practical reasons but not open to testing; others are hypotheses to be tested. Probability theory itself rests on axioms, with roots in 17th-century work on games of chance, that describe how random events behave. Data from experiments can never prove a hypothesis true; at best they support it or show it to be unlikely, through inductive reasoning.
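For illustration, the modern axioms of probability (Kolmogorov's 20th-century formalization of the older theory) can be sketched as follows; the notation is a standard presentation, not drawn from the text above:

    For a sample space \Omega, any event A \subseteq \Omega, and mutually
    exclusive events A_1, A_2, \ldots:
    \[
      P(A) \geq 0, \qquad P(\Omega) = 1, \qquad
      P\!\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} P(A_i).
    \]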
The word “proof” here descends from the Latin probare, meaning to test. In statistics, proof means a rational demonstration built on logic, mathematics, and evidence from tests. Tests rest on models that describe how the data should behave, using probability distributions such as the normal, binomial, and Poisson distributions.
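As a minimal sketch of how such model distributions are evaluated in practice (SciPy is one common choice; neither the library nor the numbers come from the text above):

    # Evaluate the three distributions named above; all values are illustrative.
    from scipy import stats

    # Normal: density of observing x = 1.2 under a standard normal model
    print(stats.norm.pdf(1.2, loc=0, scale=1))

    # Binomial: probability of 7 successes in 10 trials with p = 0.5
    print(stats.binom.pmf(7, n=10, p=0.5))

    # Poisson: probability of 3 events when the expected rate is 2.5
    print(stats.poisson.pmf(3, mu=2.5))

Each call returns the probability (or density) the model assigns to an outcome, which is the raw material a test compares against observed data.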
When a sample is tested against a null model, the question is whether the sample statistics differ from the model's predictions by more than chance would allow. The true values for a population (its parameters) are unknown, so researchers estimate them from samples, computing statistics such as the mean and the standard deviation. If the entire population could be sampled, the sample statistics would coincide with the population parameters.
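A minimal sketch of estimating those parameters from a sample, using Python's standard library; the data values are hypothetical:

    import statistics

    sample = [4.8, 5.1, 4.9, 5.3, 5.0, 4.7, 5.2]

    mean_est = statistics.mean(sample)   # estimate of the population mean
    sd_est = statistics.stdev(sample)    # sample standard deviation (n - 1 denominator)

    print(f"estimated mean = {mean_est:.3f}, estimated sd = {sd_est:.3f}")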
A common practice is to choose a significance level (often 0.05 or 0.10) before testing. If the observed difference would be less probable under the null model than that level allows, i.e. unlikely to have arisen by chance, the null hypothesis is rejected.
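A minimal sketch of this procedure as a one-sample t-test, assuming SciPy; the sample and the hypothesized mean of 5.0 are hypothetical:

    from scipy import stats

    alpha = 0.05                          # significance level chosen before testing
    sample = [4.8, 5.1, 4.9, 5.3, 5.0, 4.7, 5.2]

    # Null model: the population mean is 5.0
    t_stat, p_value = stats.ttest_1samp(sample, popmean=5.0)

    if p_value < alpha:
        print(f"p = {p_value:.3f} < {alpha}: reject the null hypothesis")
    else:
        print(f"p = {p_value:.3f} >= {alpha}: do not reject the null hypothesis")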
Bayesian statistics offer a different route. Bayes' theorem updates the probability of a parameter or hypothesis in light of the data, producing a posterior probability. The Bayes factor compares how well rival hypotheses explain the data. There is debate about how this relates to Popper's idea of falsification; some regard Bayesian updating as a form of corroboration rather than absolute proof.
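A minimal sketch of a Bayesian update using the standard beta-binomial conjugate pair, plus a simple Bayes factor for two point hypotheses; the prior, the data, and the hypothesized values are all hypothetical:

    from scipy import stats

    # Prior belief about a coin's heads probability: Beta(2, 2)
    prior_a, prior_b = 2, 2

    # Observed data: 7 heads in 10 flips
    heads, flips = 7, 10

    # Bayes' theorem, P(H|D) proportional to P(D|H) * P(H), gives a
    # Beta(prior_a + heads, prior_b + tails) posterior in this model.
    post_a = prior_a + heads
    post_b = prior_b + (flips - heads)
    print("posterior mean:", post_a / (post_a + post_b))

    # Bayes factor comparing H1 (p = 0.7) against H0 (p = 0.5):
    # the ratio of the likelihoods of the data under each hypothesis
    bf = stats.binom.pmf(heads, flips, 0.7) / stats.binom.pmf(heads, flips, 0.5)
    print("Bayes factor (H1 vs H0):", bf)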
In law, statistical evidence can be presented to show patterns of discrimination or other effects. Since Castaneda v. Partida (1977), a sufficiently large statistical disparity can serve as prima facie proof, shifting part of the burden to the defendant. But statistics cannot establish individual intent, and real-world proof often requires case-by-case examination.
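A minimal sketch of the kind of binomial disparity calculation used in such cases; the figures below are hypothetical, not the numbers from the case itself:

    import math

    p = 0.40          # group's share of the eligible population
    n = 900           # total number of people selected (e.g., for jury service)
    observed = 250    # members of the group actually selected

    expected = n * p
    sd = math.sqrt(n * p * (1 - p))           # binomial standard deviation
    disparity = (expected - observed) / sd    # shortfall in standard deviations

    print(f"expected {expected:.0f}, observed {observed}, "
          f"shortfall of {disparity:.1f} standard deviations")

In Castaneda the Court observed that a gap of more than two or three standard deviations between the expected and observed counts would cast doubt on the hypothesis that the selection was random.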