CISC 472 - RANSAC

When fitting a model to a set of observations, there are often outlier observations that should not be used in the fitting.

For example, when computing the least squares fit of a sphere to points, such as with pivot calibration for a tracked stylus, the points are the observations and the sphere is the model. But some of the points may be outliers, such as points that are collected before the stylus is placed at its pivot point, or after the stylus is removed.

RANSAC is an algorithm to fit a model to observations in the presence of outliers. "RANSAC" means "RANndom SAmple Consensus".

RANSAC operates as follows:

Choose randomly a small set of the observations — just enough observations that the model can fit exactly — and fit the model to those observations.
Find which observations (from the complete set of observations) are consistent with the model. These are the "inliers".
If there are not enough inliers, go back to Step 1.
Finally, fit the model to the inliers from Step 3.

Usually, Steps 1-to-3 are iterated through at most a certain number of times and the largest set of inliers ever found in Step 2 is kept from iteration to iteration. If nothing satisfactory is found, this largest set of inliers is used in Step 4.

Example with line fitting

Problem: fit a line to $n$ points, $p_1, \; p_2, \ldots, p_n$ in 2D. We are happy once 90% of the points are within distance 0.01 of the line.

Note that the 90% and the 0.01 threshold must be chosen with some knowledge of many outliers are expected, and how well you expect the model to fit the points.

RANSAC does this:

Randomly choose two points, $(x_1,y_1)$ and $(x_2,y_2)$, from the $n$ points. Two points are chosen because this is the minimum number of points to define a line.
Fit a line to the points. The line is $n \cdot (x,y) - d = 0$ where
$n = { \large (y_2-y_1, x_1-x_2) \over \large | (y_2-y_1, x_1-x_2) |}$
and
$d = n \cdot (x_1,y_1)$.
Find the "inlier" points that are within distance 0.01 of the above line. Point $p_i$ is an inlier if
$| n \cdot p_i - d | \leq 0.01$
If the number of inlier points is less than $0.9 \; n$, go back to Step 1.
Finally, use least squares fit of a plane to points to find the best-fit line to the inliers from Step 3.