Segmentation is the act of classifying pixels according to the object to which they belong.
Usually, there are two classes: the object of interest (labelled as 1) and everything else (labelled as 0).
But there can be any number of classes.
We'll first look at fixed thresholding, and then at methods that try to pick a good threshold automatically:
- 2-means
- k-means
- k-means++
- Otsu's method
Idea: Partition based on pixel intensity with respect to a fixed (i.e. unchanging) threshold intensity.
$T$ = threshold intensity
$p_i$ = intensity of pixel $i$
foreground = $\{ p_i \mid p_i \geq T \}$
background = $\{ p_i \mid p_i < T \}$
$T$ could be chosen by looking at the histogram to find the
value that best separates the pixels into two classes.
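As a concrete illustration, here is a minimal NumPy sketch of fixed thresholding; the placeholder image and the value of $T$ are assumptions made only for this example:

```python
# Minimal sketch of fixed thresholding.
# "image" and the value of T are illustrative assumptions.
import numpy as np

image = np.random.randint(0, 256, size=(4, 4))   # placeholder grey-level image
T = 128                                          # fixed threshold intensity

labels = (image >= T).astype(np.uint8)           # 1 = foreground, 0 = background
print(labels)
```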
Problems:
- A single $T$ cannot perfectly distinguish an object.
- Noise results in mis-classification.
From Gonzalez and Woods, Figure 10.36
The remaining methods attempt to automatically pick a good threshold:
Two-means Method
Algorithm:
- Pick $T'$ arbitrarily
- do
- Set $T = T'$
- Compute $\mu_1$ = mean of $\{ p_i \mid p_i < T \}$
- Compute $\mu_2$ = mean of $\{ p_i \mid p_i \geq T \}$
- Set $T' = {\mu_1 + \mu_2 \over 2}$
- while $| T - T' | > \epsilon \qquad$ (for $\epsilon$ a small number)
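The loop is easy to express directly. Below is a minimal sketch; the function name, the choice of initial $T'$ (here the overall mean), and `eps` are illustrative assumptions, not part of the notes:

```python
# Sketch of the two-means (iterative) threshold selection described above.
import numpy as np

def two_means_threshold(pixels, eps=0.5):
    # pixels: flat float array of intensities
    # (assumes both classes stay non-empty during the iteration)
    T_new = pixels.mean()                  # arbitrary initial T'
    while True:
        T = T_new
        mu1 = pixels[pixels < T].mean()    # mean of the class below T
        mu2 = pixels[pixels >= T].mean()   # mean of the class at or above T
        T_new = (mu1 + mu2) / 2            # new T' halfway between the means
        if abs(T - T_new) <= eps:          # stop once T barely changes
            return T_new
```

For an image, one might call `two_means_threshold(image.ravel().astype(float))`.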
This algorithm minimizes the "within-class sum of squares" (WCSS):
WCSS = $\displaystyle \sum_{p_i < T} (p_i - \mu_1)^2 + \sum_{p_i \geq T} (p_i - \mu_2)^2$
Recall that $\sigma^2 = {1 \over N} \sum_{x_i} (x_i - \mu)^2$. If
$\displaystyle N_1 = | \; \{ p_i \mid p_i < T \} \; | \qquad \textrm{and} \qquad N_2 = | \; \{ p_i \mid p_i \geq T \} \; |$
Then
WCSS = $\displaystyle N_1 \sigma_1^2 + N_2 \sigma_2^2$, where $\sigma_1^2$ and $\sigma_2^2$ are the variances of the two classes.
The WCSS is the "intra-class variance".
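The identity is easy to check numerically. The sketch below uses random intensities and an arbitrary threshold (both are assumptions for illustration) and NumPy's population variance (`ddof=0`), matching the $\frac{1}{N}$ definition above:

```python
# Numerical check that WCSS = N1*sigma1^2 + N2*sigma2^2.
import numpy as np

pixels = np.random.randint(0, 256, size=1000).astype(float)   # placeholder intensities
T = 128                                                       # arbitrary threshold

lo, hi = pixels[pixels < T], pixels[pixels >= T]
wcss = ((lo - lo.mean()) ** 2).sum() + ((hi - hi.mean()) ** 2).sum()
print(np.isclose(wcss, lo.size * lo.var() + hi.size * hi.var()))   # prints True
```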
K-means Clustering
Objective: Find $k$ clusters: $S_1, S_2, \ldots, S_k$ to minimize
$\displaystyle \sum_{i=1}^k \sum_{x_j \in S_i} d( \: x_j, \; \mu_i \: )$
where $\mu_i$ is the mean of cluster $S_i$ and $d()$ is a distance metric, typically the (Euclidean) 2-norm.
This minimizes the distances of the points to their respective cluster centres.
With images, we usually cluster on intensity alone, so the clustering is
in 1D rather than the 2D point clouds typically used to illustrate k-means.
Algorithm:
- Place $k$ cluster centres arbitrarily. Call them
$\mu_1, \mu_2, \ldots, \mu_k$, even though these are not
yet cluster means.
- Put each sample point, $x_j$, in the cluster, $S_i$, with the closest $\mu_i$:
$\displaystyle S_i = \{ x_j \; | \; d( x_j, \mu_i ) < d( x_j, \mu_m )\ \textrm{for all}\ m \neq i \}$
- Move the "means" to the cluster centroids:
$\displaystyle \mu_i = { \displaystyle\sum_{x_j \in S_i} x_j \over | S_i | }$
- Repeat the assignment and centroid-update steps until the means don't change very much.
See also www.cs.cmu.edu/afs/cs/Web/People/awm/tutorials/kmeans.html
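A minimal sketch of these steps on 1-D intensities follows; the function name, $k=3$, the tolerance, and the random initialization are illustrative assumptions:

```python
# Sketch of the k-means iteration above, clustering 1-D intensities.
import numpy as np

def kmeans_1d(pixels, k=3, tol=1e-3, rng=np.random.default_rng(0)):
    mu = rng.choice(pixels, size=k, replace=False)   # arbitrary initial "means"
    while True:
        # assignment: put each pixel in the cluster with the closest mu_i
        labels = np.argmin(np.abs(pixels[:, None] - mu[None, :]), axis=1)
        # update: move each "mean" to its cluster centroid
        # (empty clusters are not handled in this sketch)
        new_mu = np.array([pixels[labels == i].mean() for i in range(k)])
        if np.max(np.abs(new_mu - mu)) < tol:        # means barely moved
            return new_mu, labels
        mu = new_mu
```

For an image, it could be called on `image.ravel().astype(float)`.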
Problems:
- Have to choose $k$ beforehand.
- Different initial $\mu_1, \mu_2, \ldots, \mu_k$ will give different clusters.
K-means++
This is a method to choose "good" initial $\mu_i$.
Algorithm:
- Pick one of the $x_j$, uniformly at random, as $\mu_1$.
- For $i$ in $2 \ldots k$:
- Pick $\mu_i$ randomly from among the remaining $x_j$, with
$\displaystyle P( \textrm{pick}\ x_j ) = { D(x_j)^2 \over \sum_m D(x_m)^2 }$
where $D(x_j)$ is the distance to the closest already-picked mean.
Each iteration is more likely to pick a new mean that is far from the
already-chosen means, so the initial means end up well spread out.
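A minimal sketch of this seeding procedure on 1-D intensities (the function name and RNG are assumptions for illustration):

```python
# Sketch of k-means++ seeding: each new centre is a data point chosen with
# probability proportional to its squared distance from the closest centre
# already picked.
import numpy as np

def kmeans_pp_init(pixels, k, rng=np.random.default_rng(0)):
    centres = [rng.choice(pixels)]            # first centre: a uniformly random point
    for _ in range(1, k):
        # D(x_j)^2: squared distance from each point to its closest chosen centre
        d2 = np.min((pixels[:, None] - np.array(centres)[None, :]) ** 2, axis=1)
        probs = d2 / d2.sum()                 # P(pick x_j) proportional to D(x_j)^2
        centres.append(rng.choice(pixels, p=probs))
    return np.array(centres)
```

The resulting centres can then be used as the initial means for the k-means iteration above.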