Segmentation is the act of classifying pixels according to the object to which they belong.
Usually, there are two classes: the object of interest (labelled as 1) and everything else (labelled as 0).
But there can be any number of classes.
We'll first look at fixed thresholding:
Idea: Partition based on pixel intensity with respect to a fixed (i.e. unchanging) threshold intensity.
$T$ = threshold intensity
$p_i$ = intensity of pixel $i$
foreground = $\{ p_i \mid p_i \geq T \}$
background = $\{ p_i \mid p_i < T \}$
$T$ could be chosen by looking at the histogram to find the value that best separates the pixels into two classes.
Problems:
From Gonzales and Woods, Figure 10.36
The remaining methods attempt to automatically pick a good threshold:
Algorithm:
This algorithm minimizes the "within-class sum of squares" (WCSS):
Recall that $\sigma^2 = {1 \over N} \sum_{x_i} (x_i - \mu)^2$. If
Then
The WCSS is the "intra-class variance".
Objective: Find $k$ clusters: $S_1, S_2, \ldots, S_k$ to minimize
where $\mu_i$ is the mean of cluster $S_i$ and $d()$ is a distance metric, typically the (Euclidean) 2-norm.
This minimizes the distances of the points to their respective cluster centres.
With images, we usually use only intensities, so the clustering is in 1D, not in 2D as shown above.
Algorithm:
See also www.cs.cmu.edu/afs/cs/Web/People/awm/tutorials/kmeans.html
Problems:
This is a method to choose "good" initial $\mu_i$.
Algorithm:
Each iteration picks a next mean which has higher probability of being distant from the existing means, so the means get distributed well.