Shape context

Content sourced from Wikipedia, licensed under CC BY-SA 3.0.

Shape context is a compact shape descriptor used for object recognition and shape matching. It was introduced by Serge Belongie and Jitendra Malik in 2000 as a way to measure shape similarity and recover point correspondences between shapes.

How it works
- Sample a set of points from the shape’s contour. For each point p_i, take the vectors from p_i to all other sampled points and build a coarse histogram of their relative positions. This histogram, the shape context of p_i, uses bins that are usually uniform in log-polar space, so nearby points are resolved more finely than distant ones.
- The full set of vectors from a point describes the rest of the shape, as seen from that point, in great detail. The histogram summarizes that distribution in a form that is compact, discriminative, and robust to small deformations.
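
The histogram construction can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `shape_context`, the bin counts (5 radial, 12 angular), and the radius limits are assumptions chosen for the example, and distances are assumed to be already normalized for scale.

```python
import math

def shape_context(points, i, n_r=5, n_theta=12, r_min=0.125, r_max=2.0):
    """Log-polar histogram of all other points relative to points[i].

    Bin counts and radius limits are illustrative assumptions.
    Returns a list of n_r rows, each holding n_theta counts.
    """
    px, py = points[i]
    hist = [[0] * n_theta for _ in range(n_r)]
    log_min, log_max = math.log(r_min), math.log(r_max)
    for j, (qx, qy) in enumerate(points):
        if j == i:
            continue
        dx, dy = qx - px, qy - py
        # Clamp the radius so very near and very far points land in the
        # innermost and outermost rings instead of being dropped.
        r = min(max(math.hypot(dx, dy), r_min), r_max)
        frac = (math.log(r) - log_min) / (log_max - log_min)
        r_bin = min(n_r - 1, int(frac * n_r))
        # Angle in [0, 2*pi), measured from the positive x-axis.
        theta = math.atan2(dy, dx) % (2 * math.pi)
        t_bin = min(n_theta - 1, int(theta / (2 * math.pi) * n_theta))
        hist[r_bin][t_bin] += 1
    return hist
```

For a shape sampled at n points, calling the function once per index gives n histograms, each counting the other n - 1 points.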

Invariances
- Translation invariance comes for free: every measurement is relative to a point on the shape, so shifting the shape changes nothing.
- Scale invariance is obtained by dividing all radial distances by the mean distance between all point pairs in the shape (the median can also be used).
- Rotation invariance is optional. Measuring angles relative to the tangent direction at each point makes the descriptor rotation invariant, but keeping the absolute orientation is sometimes more discriminative (e.g., distinguishing a 6 from a 9).
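
As a rough sketch of the scale and rotation normalizations, the helper below returns the (distance, angle) pair of every other point relative to a reference point. The names `mean_pairwise_distance` and `relative_positions` are hypothetical, and the tangent angle is assumed to be supplied by the caller (estimating it from the contour is not shown).

```python
import math
from itertools import combinations

def mean_pairwise_distance(points):
    """Mean Euclidean distance over all unordered point pairs."""
    pairs = list(combinations(points, 2))
    return sum(math.dist(p, q) for p, q in pairs) / len(pairs)

def relative_positions(points, i, tangent_angle=0.0):
    """(r, theta) of every other point as seen from points[i].

    r is divided by the mean pairwise distance (scale invariance).
    theta is measured from tangent_angle; passing the local tangent
    direction at points[i] makes the angles rotation invariant.
    """
    scale = mean_pairwise_distance(points)
    px, py = points[i]
    out = []
    for j, (qx, qy) in enumerate(points):
        if j == i:
            continue
        r = math.hypot(qx - px, qy - py) / scale
        theta = (math.atan2(qy - py, qx - px) - tangent_angle) % (2 * math.pi)
        out.append((r, theta))
    return out
```

Leaving `tangent_angle` at 0.0 keeps absolute orientation, which is the choice to make when rotation carries information.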

Matching shapes
- For two shapes, compute a shape-context cost for each pair of points p_i (on the first shape) and q_j (on the second) using the chi-squared distance between their normalized histograms. You can also add an appearance cost, such as tangent-angle dissimilarity, to capture local orientation differences.
- The total cost of matching two points is a weighted sum of the shape-context cost and the appearance cost. Build a cost matrix for all point pairs and find the one-to-one matching that minimizes the summed cost (often using the Hungarian method).
- To handle outliers, add dummy points with a constant, moderately large matching cost. A real point is then matched to a dummy whenever no real match is cheaper than that constant.
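
The matching step might look like the sketch below. Several things here are assumptions made for illustration: the function names, the default weight `beta`, and above all the solver. A brute-force search over permutations stands in for the Hungarian method so that the example needs only the standard library; it is usable for a handful of points only.

```python
from itertools import permutations

def chi2_cost(g, h):
    """Chi-squared distance between two non-empty histograms (flat lists of counts)."""
    sg, sh = sum(g), sum(h)
    total = 0.0
    for a, b in zip(g, h):
        a, b = a / sg, b / sh  # normalize each histogram to sum to 1
        if a + b > 0:          # skip bins that are empty in both
            total += (a - b) ** 2 / (a + b)
    return 0.5 * total

def combined_cost(shape_cost, appearance_cost, beta=0.1):
    """Weighted sum of the two costs; beta is a hypothetical weight."""
    return (1 - beta) * shape_cost + beta * appearance_cost

def match_with_dummies(cost, dummy_cost):
    """One-to-one matching of rows (p_i) to columns (q_j) with outlier handling.

    The n x m matrix is padded to (n + m) x (n + m) with dummy nodes.
    Brute force replaces the Hungarian method here (tiny inputs only).
    Returns the real (i, j) pairs; points left out were matched to dummies.
    """
    n, m = len(cost), len(cost[0])
    size = n + m

    def c(i, j):
        if i < n and j < m:
            return cost[i][j]
        if i >= n and j >= m:
            return 0.0         # dummy-to-dummy pairs are free
        return dummy_cost      # real point matched to a dummy

    best = min(permutations(range(size)),
               key=lambda perm: sum(c(i, perm[i]) for i in range(size)))
    return [(i, best[i]) for i in range(n) if best[i] < m]
```

With a small `dummy_cost`, poorly matching points are declared outliers; with a very large one, every point is forced into a real match.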

From correspondences to a shape transformation
- Once correspondences are established, estimate a transformation T that maps one shape to the other. Common choices are an affine model or a thin-plate spline (TPS) model. TPS is especially popular for shape contexts because it can smoothly warp one shape to align with the other.
- For noisy data, the transformation does not have to map every point exactly onto its match: a regularized fit trades exactness for smoothness. The estimate is typically improved by iterating between finding correspondences and refining the transformation.
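
To make the estimation step concrete, here is a minimal sketch of the simpler of the two models, a least-squares affine fit from matched point pairs. The function names are hypothetical, and the thin-plate spline model (which adds a radial-basis warp and a regularization term) is not shown. The fit assumes at least three matched points that are not collinear; otherwise the linear system is singular.

```python
def solve3(a, b):
    """Solve a 3x3 system a x = b by Gauss-Jordan elimination with pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]

def fit_affine(src, dst):
    """Least-squares affine map taking src[k] close to dst[k].

    Returns [[a, b, tx], [c, d, ty]] with
    x' = a*x + b*y + tx and y' = c*x + d*y + ty.
    """
    rows = [(x, y, 1.0) for x, y in src]
    # Normal equations (M^T M) w = M^T target, shared by both outputs.
    mtm = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    coeffs = []
    for k in range(2):
        mtb = [sum(r[i] * q[k] for r, q in zip(rows, dst)) for i in range(3)]
        coeffs.append(solve3(mtm, mtb))
    return coeffs

def apply_affine(coeffs, p):
    """Apply the fitted map to a single point."""
    (a, b, tx), (c, d, ty) = coeffs
    x, y = p
    return (a * x + b * y + tx, c * x + d * y + ty)
```

In an iterative scheme, the fitted map would be applied to the first shape, the correspondences recomputed on the warped points, and the fit repeated.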

Measuring shape distance
- The shape distance is a weighted sum of three terms: the shape-context matching cost, an appearance-based cost measured after warping, and a transformation cost (the bending energy of the TPS or the amount of affine deformation).
- Because the result is a single number per pair of shapes, it can be used directly with a nearest-neighbor classifier (k-NN).
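
A small sketch of how the combined distance could feed a k-NN classifier is given below. The function names and the default weights of 1.0 are placeholders chosen for illustration; in practice the weights would be tuned for the task.

```python
from collections import Counter

def shape_distance(d_sc, d_ac, d_be, w_ac=1.0, w_be=1.0):
    """Weighted sum of shape-context, appearance, and bending-energy terms.

    The default weights are placeholders, not recommended values.
    """
    return d_sc + w_ac * d_ac + w_be * d_be

def knn_label(distances, labels, k=3):
    """Majority label among the k reference shapes with smallest distance.

    distances[i] is the shape distance from the query to reference i.
    Ties in the vote go to the label of the nearer reference.
    """
    nearest = sorted(range(len(distances)), key=distances.__getitem__)[:k]
    return Counter(labels[i] for i in nearest).most_common(1)[0][0]
```

For example, a query digit would be compared against every labeled reference digit, and the resulting list of distances passed to `knn_label` together with the reference labels.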

Results and applications
- Shape contexts have been tested on several challenging databases, such as MNIST (handwritten digits), MPEG-7 shape silhouettes, and COIL-20 object views. They achieved strong results by using a larger set of sampled points and robust matching that tolerates rotations and outliers.
- The approach has also been used for tasks like trademark retrieval, where shape-context-based matching retrieved visually similar logos and, in the reported experiments, did not miss close matches.

In short, shape context summarizes how the rest of a shape is arranged around each sampled point in a simple, robust histogram. By comparing these histograms and optionally refining the result with a smooth transformation, it provides a versatile way to recognize and align shapes across different views and conditions.

This page was last edited on 3 February 2026, at 03:29 (CET).