Heckman correction

Content sourced from Wikipedia, licensed under CC BY-SA 3.0.

The Heckman correction is a statistical method for correcting sample selection bias: the bias that arises when outcomes are observed only for a non-randomly selected subset of the population rather than for everyone. This is common in the social sciences, where studies rely on observational data and the sample you can study is often not representative of the whole population.

The basic idea is to model both how people end up in the observed group and the outcome you want to study. First, you estimate how likely each person is to be observed (the selection process) with a probit model. For example, if you want to study wages but only see wages for people who are employed, you model the probability of being employed using available variables such as age and education. From this model you compute a value called the inverse Mills ratio for each person: the standard normal density divided by the standard normal cumulative distribution function, both evaluated at that person’s fitted probit index. The ratio is larger for people the model predicts are unlikely to be observed, so it summarizes how much selection into the observed group is likely to matter for them.
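As a rough illustration of this first step, the sketch below fits a probit by Fisher scoring on simulated data and then computes the inverse Mills ratio for every observation. Everything specific in it is an assumption made for the example: the helper names (solve, fit_probit, inverse_mills), the two simulated covariates standing in for standardized education and age, and the coefficient values. In practice the probit would normally be fitted with a statistics package rather than by hand.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    k = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(k):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][k] / m[i][i] for i in range(k)]


def fit_probit(rows, selected, iterations=15):
    """Fit P(selected = 1 | z) = Phi(z . gamma) by Fisher scoring."""
    k = len(rows[0])
    gamma = [0.0] * k
    for _ in range(iterations):
        score = [0.0] * k
        info = [[0.0] * k for _ in range(k)]
        for z, d in zip(rows, selected):
            index = sum(g * v for g, v in zip(gamma, z))
            # Clamp the probability away from 0 and 1 to avoid division by zero.
            p = min(max(STD_NORMAL.cdf(index), 1e-10), 1 - 1e-10)
            f = STD_NORMAL.pdf(index)
            weight = f * f / (p * (1 - p))       # expected information weight
            resid = f * (d - p) / (p * (1 - p))  # score contribution
            for i in range(k):
                score[i] += resid * z[i]
                for j in range(k):
                    info[i][j] += weight * z[i] * z[j]
        step = solve(info, score)
        gamma = [g + s for g, s in zip(gamma, step)]
    return gamma


def inverse_mills(index):
    """Standard normal pdf over cdf; not guarded against extreme negative indices."""
    return STD_NORMAL.pdf(index) / STD_NORMAL.cdf(index)


# Simulated data: hypothetical intercept, education effect, and age effect.
rng = random.Random(0)
true_gamma = [0.3, 0.8, -0.5]
rows, employed = [], []
for _ in range(3000):
    z = [1.0, rng.gauss(0, 1), rng.gauss(0, 1)]
    latent = sum(g * v for g, v in zip(true_gamma, z)) + rng.gauss(0, 1)
    rows.append(z)
    employed.append(1 if latent > 0 else 0)

gamma_hat = fit_probit(rows, employed)
mills = [inverse_mills(sum(g * v for g, v in zip(gamma_hat, z))) for z in rows]
print("estimated probit coefficients:", [round(g, 3) for g in gamma_hat])
```

The resulting mills list holds one correction value per person. Only the values for the observed (employed) cases are used later, because the outcome regression is run on those cases alone.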

Second, you estimate the outcome of interest (such as wages) by linear regression using only the observed cases, but you include the inverse Mills ratio from the first step as an extra regressor. This correction term accounts for the fact that the observed sample is not a random subset of the population. If the unobserved factors that drive selection are correlated with the unobserved factors that drive the outcome, leaving the term out biases the estimates; including it gives consistent estimates of the outcome equation, provided the model’s assumptions hold.
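A minimal sketch of this second step on simulated data follows. To keep it short, the probit index is taken as known (the index used to simulate the data) instead of being estimated in a first step. The variable names, the coefficient values, and the small ols helper are hypothetical choices for the illustration. The sketch compares a regression that ignores selection with one that adds the inverse Mills ratio as a regressor.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def inverse_mills(index):
    """Standard normal pdf divided by cdf, evaluated at the probit index."""
    return STD_NORMAL.pdf(index) / STD_NORMAL.cdf(index)


def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    m = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * v for r, v in zip(rows, y))] for i in range(k)]
    for col in range(k):  # Gauss-Jordan elimination on the augmented matrix
        pivot = max(range(col, k), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(k):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    return [m[i][k] / m[i][i] for i in range(k)]


rng = random.Random(1)
rho = 0.8           # assumed correlation between selection and wage errors
beta = [1.0, 2.0]   # assumed true wage intercept and education slope

naive_rows, corrected_rows, wages = [], [], []
for _ in range(40000):
    educ = rng.gauss(0, 1)
    other = rng.gauss(0, 1)               # affects employment only
    index = 0.2 + 1.0 * educ + 1.0 * other  # probit index, treated as known
    u = rng.gauss(0, 1)                   # selection error
    eps = rho * u + (1 - rho ** 2) ** 0.5 * rng.gauss(0, 1)  # wage error
    if index + u > 0:                     # wage observed only if employed
        wages.append(beta[0] + beta[1] * educ + eps)
        naive_rows.append([1.0, educ])
        corrected_rows.append([1.0, educ, inverse_mills(index)])

naive = ols(naive_rows, wages)          # ignores selection
corrected = ols(corrected_rows, wages)  # adds the correction term
print("naive     [intercept, educ]:", [round(b, 3) for b in naive])
print("corrected [intercept, educ, mills]:", [round(b, 3) for b in corrected])
```

In this setup the naive regression overstates the intercept and understates the education slope, while the corrected regression should land close to the simulated values. The sketch shows point estimates only: it does not compute standard errors, and the ordinary least-squares ones would not be valid here.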

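The method relies on the assumption that the error terms of the two equations are jointly normally distributed, and it provides a way to test for selection bias. Under that assumption, the coefficient on the correction term is zero exactly when the two error terms are uncorrelated, that is, when there is no selection effect. A coefficient that is statistically distinguishable from zero is therefore evidence that selection would otherwise bias the results.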
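The logic of the test can be read off the conditional expectation implied by the model. The notation below is assumed for this sketch and does not appear in the text above: w is the wage, X and Z are the regressors of the outcome and selection equations with coefficients β and γ, D = 1 marks an observed case, ρ is the correlation between the two error terms, σ_u is the standard deviation of the wage error, and φ and Φ are the standard normal density and cumulative distribution function.

```latex
\begin{aligned}
E[\,w \mid X, Z, D = 1\,] &= X\beta + \rho\,\sigma_u\,\lambda(Z\gamma), \\
\lambda(Z\gamma) &= \frac{\phi(Z\gamma)}{\Phi(Z\gamma)} .
\end{aligned}
```

The second-step coefficient on the inverse Mills ratio λ therefore estimates the product ρσ_u. Since σ_u is positive, that product is zero exactly when ρ = 0, which is why a test on this single coefficient serves as a test for selection bias.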

The Heckman correction is typically implemented as a two-step procedure (a control-function approach) rather than by estimating both equations together. This makes it easier to compute, though it is generally less efficient than full maximum likelihood estimation of the two equations jointly. The ordinary least-squares standard errors from the second step are not valid, because the inverse Mills ratio is itself an estimate and the second-step errors are heteroskedastic. Correct standard errors can be obtained from asymptotic formulas or by bootstrapping.

James Heckman introduced this approach in the 1970s and received the Nobel Prize in Economic Sciences in 2000 for his work on selection models and related methods. Since then, the correction has become widely used to adjust for non-random samples in economics and other social sciences. The wage example is the standard illustration: wages are recorded only for people who are employed, and the employed may differ in important ways from those who are not, which is the bias the correction is designed to address.

This page was last edited on 3 February 2026, at 12:27 (CET).