The Rendering Equation describes how light reflects off a surface at a point $x$ with surface normal $N$:
$L(\oo,x) = \int_\Omega f(x,\oi,\oo) \; \cos\theta_\textrm{in} \; L(\oi,x) \; d\oi$
where $f(x,\oi,\oo)$ is the BRDF at $x$, $L(\oi,x)$ is the radiance arriving at $x$ from direction $\oi$, $\theta_\textrm{in}$ is the angle between $\oi$ and the normal $N$, and $\Omega$ is the hemisphere of directions above the surface.
Radiance is measured in Watts/m$^2$/steradian, where a steradian is a measure of solid (i.e. 3D) angle.
For more information, please see the supplementary notes on Introduction to Illumination.
Note that the position, $x$, and the outgoing direction, $\oo$, are constants inside the integral.
Ideally, this would be evaluated at each ray/surface intersection during raytracing. However, integrating over all of $\Omega$ by sampling many directions, $\oi \in \Omega$, is very expensive, since each sample requires a recursive call to the raytracer.
Below, we'll see how to do this more efficiently. That is, we'll see how to choose directions, $\oi$, to reduce the variance of our estimate of $L(\oo,x)$.
The central problem is to evaluate a generic integral like
$F = \int_D f(x) \; dx$
This $f$ has no relation to the BRDF above; it's just a generic function to be integrated over a domain, $D$.
Monte Carlo integration estimates $F$ by evaluating $f(x)$ at randomly chosen places, $x \in D$.
The simplest Monte Carlo estimator uses $N$ samples drawn uniformly at random from $D$:
$\widehat{F} = |D| \; {1 \over N} \sum_i f(x_i)$
where $|D| = \int_D \; dx$ is the size of the domain.
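As a concrete sketch, here is the uniform estimator in Python (the function name `mc_uniform` and the test integrand are ours, for illustration):

```python
import random

def mc_uniform(f, a, b, n):
    """Estimate the integral of f over [a, b] with n uniform samples."""
    total = 0.0
    for _ in range(n):
        x = random.uniform(a, b)
        total += f(x)
    return (b - a) * total / n   # |D| * (1/N) * sum_i f(x_i)

# Example: integrate x^2 over [0, 1]; the exact value is 1/3.
random.seed(0)
estimate = mc_uniform(lambda x: x * x, 0.0, 1.0, 100_000)
```

With $10^5$ samples the estimate is typically within a few thousandths of the true value; the error shrinks like $1/\sqrt{N}$.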
This estimator is unbiased because its expected value is the true integral:
$\begin{array}{rcll} E(\widehat{F}) & = & E( |D| \; {1 \over N} \sum_i f(x_i) ) \\ & = & |D| \; {1 \over N} \sum_i E( f(x_i) ) & \textrm{by linearity of expectation}\\ & = & |D| \; E( f(x) ) & \textrm{since all of these expected values are equal}\\ & = & |D| \; \int_D f(x) \; p(x) \; dx & \textrm{by the definition of expectation $E()$}\\ & = & |D| \; \int_D f(x) \; {1 \over |D|} \; dx & \textrm{since we are doing uniform sampling}\\ & = & \int_D f(x) \; dx \\ \end{array}$
With Importance Sampling, the samples are chosen according to a non-uniform distribution. The non-uniform distribution gives more probability to samples that contribute more to the integral. These samples are more important, hence the name.
If we sample $x_i$ according to a probability density function, $p(x)$, the estimator is:
$\widehat{F_p} = {1 \over N} \sum_i {f(x_i) \over p(x_i)}$
Note that uniform sampling has $p(x) = {1 \over |D|}$, so $\widehat{F_p}$ is a generalization of the uniformly sampled estimator, $\widehat{F}$, above.
This estimator is unbiased:
$\begin{array}{rcll} E(\widehat{F_p}) & = & E( {1 \over N} \sum_i {f(x_i) \over p(x_i)} ) \\ & = & {1 \over N} \sum_i E( {f(x_i) \over p(x_i)} ) & \textrm{by linearity of expectation}\\ & = & E( {f(x) \over p(x)} ) & \textrm{since all of these expected values are equal} \\ & = & \int_D {f(x) \over p(x)} \; p(x) \; dx & \textrm{from the definition of expectation $E()$} \\ & = & \int_D f(x) \; dx \\ \end{array}$
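A minimal Python sketch of this estimator, assuming the caller supplies a sampler and its PDF (the names `sample_p` and `pdf_p` are ours):

```python
import random

def mc_importance(f, sample_p, pdf_p, n):
    """Importance-sampled estimate: (1/N) * sum_i f(x_i) / p(x_i),
    where the x_i are drawn from the distribution with density pdf_p."""
    total = 0.0
    for _ in range(n):
        x = sample_p()
        total += f(x) / pdf_p(x)
    return total / n

# Example: integrate f(x) = 3x^2 over [0, 1]; the exact value is 1.
# We sample with p(x) = 2x, which puts more samples where f is large.
# Inverse-CDF sampling: CDF(x) = x^2, so x = sqrt(u) for u ~ U(0,1).
random.seed(0)
est = mc_importance(lambda x: 3 * x * x,
                    sample_p=lambda: random.random() ** 0.5,
                    pdf_p=lambda x: 2 * x,
                    n=10_000)
```

Here the ratio $f(x)/p(x) = 1.5x$ varies much less than $f(x)$ itself over $[0,1]$, so the estimate converges faster than uniform sampling with the same $N$.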
With importance sampling, the sampling PDF, $p(x)$, is chosen to reduce the variance of $\widehat{F_p}$ by reducing the variance of the integrand, ${f(x) \over p(x)}$.
For example, if $f(x)$ is known, we could choose $p(x)$ proportional to $f(x)$, i.e. $p(x) = {f(x) \over F}$. Then the integrand, ${f(x) \over p(x)} = F$, would be constant, so the variance would be zero.
In this case, a single sample would be sufficient to estimate $\int f(x) \; dx$ exactly. However, normalizing $p(x)$ this way already requires knowing $F$, and sampling according to $p(x)$ requires knowing the cumulative distribution function of $p(x)$. For a one-dimensional domain:
$\textrm{CDF}_p( x ) = \int_{-\infty}^x p(t) \; dt$
and this is as hard as the original problem, $\int f(x) \; dx$.
So we can only sample according to PDFs for which we can easily build the CDF.
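When the CDF has no closed-form inverse but the PDF is cheap to evaluate, a common practical approach is to tabulate the CDF numerically and invert it with a binary search. A Python sketch (function names are ours, not from the notes):

```python
import bisect
import random

def make_tabulated_sampler(pdf, a, b, m=1024):
    """Build an approximate inverse-CDF sampler for pdf on [a, b]
    by tabulating the CDF at m cells and inverting with binary search."""
    dx = (b - a) / m
    cdf = [0.0]
    for i in range(m):
        x_mid = a + (i + 0.5) * dx       # midpoint rule per cell
        cdf.append(cdf[-1] + pdf(x_mid) * dx)
    total = cdf[-1]                      # normalizing constant
    cdf = [c / total for c in cdf]       # now a proper CDF ending at 1

    def sample():
        u = random.random()
        i = max(1, bisect.bisect_left(cdf, u))  # first cell with CDF >= u
        lo, hi = cdf[i - 1], cdf[i]
        t = (u - lo) / (hi - lo) if hi > lo else 0.0
        return a + (i - 1 + t) * dx      # interpolate within the cell
    return sample

# Sample p(x) proportional to x on [0, 1]; this distribution's mean is 2/3.
random.seed(0)
draw = make_tabulated_sampler(lambda x: x, 0.0, 1.0)
mean = sum(draw() for _ in range(20_000)) / 20_000
```

Note that tabulation also sidesteps the normalization problem: the table only needs the PDF up to a constant, since the cumulative sums are normalized at the end.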
The integrand of the rendering equation is
$f(x,\oi,\oo) \; \cos\theta_\textrm{in} \; L(\oi,x)$
Recall that this is integrated over incoming directions, $\oi \in \Omega$, so the position, $x$, and outgoing direction, $\oo$, are constants.
Determining $L(\oi,x)$ for all $\oi$ requires many expensive raytracing steps, so it is not feasible to base $p(\oi)$ on $L(\oi,x)$.
But $\cos\theta_\textrm{in}$ is easy to evaluate, and $f(x,\oi,\oo)$ is sometimes easy to evaluate. For example, with purely diffuse shading,
$f(x,\oi,\oo) = {\rho \over \pi}$
for surface albedo $\rho$. This is constant because it does not depend on the incoming direction, $\oi$.
So we could choose $p(\oi) = {\cos\theta_\textrm{in} \over \pi}$ as the sampling PDF. (The factor ${1 \over \pi}$ normalizes the PDF, since $\int_\Omega \cos\theta_\textrm{in} \; d\oi = \pi$.) This is good because the $\cos\theta_\textrm{in}$ factor cancels in
${f(x,\oi,\oo) \; \cos\theta_\textrm{in} \; L(\oi,x) \over p(\oi)} = \pi \; f(x,\oi,\oo) \; L(\oi,x)$
which in general has less variance than the original integrand
$f(x,\oi,\oo) \; \cos\theta_\textrm{in} \; L(\oi,x)$
sampled uniformly (although this depends on $L(\oi,x)$).
Sampling according to $\cos\theta_\textrm{in}$ gives higher probability to directions near the surface normal. Radiance coming from these directions makes a larger contribution to the integral over $\Omega$ because of the $\cos\theta_\textrm{in}$ term in the integrand.
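Cosine-weighted directions can be generated by sampling a unit disk uniformly and projecting up onto the hemisphere (Malley's method). A Python sketch in the local frame where the normal is $+z$ (the function name is ours):

```python
import math
import random

def sample_cosine_hemisphere():
    """Draw a unit direction in the local frame (normal = +z) with
    pdf p(w) = cos(theta) / pi, by uniformly sampling a disk and
    projecting up onto the hemisphere (Malley's method)."""
    u1, u2 = random.random(), random.random()
    r = math.sqrt(u1)                    # disk radius
    phi = 2.0 * math.pi * u2             # disk angle
    x = r * math.cos(phi)
    y = r * math.sin(phi)
    z = math.sqrt(max(0.0, 1.0 - u1))    # cos(theta), since x^2 + y^2 = u1
    return (x, y, z)

# With p = cos/pi and a diffuse BRDF, each sample's contribution reduces to
# (rho/pi) * cos * L / (cos/pi) = rho * L : the cosine cancels entirely.
random.seed(0)
dirs = [sample_cosine_hemisphere() for _ in range(50_000)]
mean_cos = sum(d[2] for d in dirs) / len(dirs)  # E[cos theta] = 2/3 under this pdf
```

A quick sanity check on such a sampler: every returned direction should be unit length, and the sample mean of $\cos\theta_\textrm{in}$ should approach $\int_\Omega \cos\theta \; {\cos\theta \over \pi} \; d\oi = {2 \over 3}$.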