$$\int f(x) dx \sim \sum_i^k f(x) \Delta x$$
$$
\begin{align*}
\Delta x & = \int \dot{x}(t) dt \\
& \sim \dot{x}(t) \Delta t
\end{align*}
$$
These are all special cases of "Deterministic Quadratures", where the integration step size $\Delta x$ is non-random ("deterministic") and we are summing up a bunch of quadrilateral pieces ("Quadratures") to approximate areas.
This post interprets the numerical integration as a special case of stratified sampling, and shows that deterministic quadrature rules are statistically biased estimators.
Suppose we want to compute $E_X[f(X)] = \int_x p(x)\cdot f(x) dx$ via a Monte Carlo estimator.
A plain Monte Carlo Estimator is given by $\mu = \frac{1}{n}\sum_i^n f(X_i)$, and has variance $\frac{1}{n}\sigma^2$ where $\sigma^2=Var[X_i]$.
Stratified sampling (see previous blog post) introduces a stratification variable $Y$ with a discrete distribution of $k$ strata with probability densities $p(y_1),p(y_2),...,p(y_k)$, respectively. A common choice is to assign each $p(y_1) = p(y_2) = ... = p(y_k) = 1/k$, and arrange the strata in a grid. In this case, $y_i$ correspond to the corners of the strata and $P(X|Y=y_i)$ corresponds to a uniform distribution over that square ($X_i = Y_i + 0.1*\text{rand}()$).
The variance of this estimator is $\sum_i^k p(y_i)^2 \cdot V_i$, where $V_i$ is the variance of $\mathbb{E}[X|Y=y_i]$ for strata $i$.
Suppose we let $\mathbb{E}[X|Y=y_i]$ as $y_i$ - that is, we just sample at the corners where $y_i$ sampels are.
The good news is that $V_i = 0$, and we've reduced the variance of the total estimator to zero. The bad news is that the estimator is now biased because our per-strata estimators $\mathbb{E}[X|Y=y_i]$ are biased (our estimator never uses information from the interiors/edges of the squares). The estimator is also inconsistent for finite strata.
Does this figure remind you of anything? This estimator behaves identically to a Riemann Summation in 2D!
What does this mean?
- If you ever take a Riemann sum with fixed step size, your integral is biased!
- If you are monitoring mean ocean temperatures and place your sensors in an regular grid across the water (like above), your estimator is biased!
- If you are recording video at a too-low frequency and pick up aliasing artifacts, your estimates are biased!
- If you use a fixed timestep in a physics simulation, such as Euler or Runge-Kutta methods, your space integrals are biased!
The takeaway here is that *all* deterministic quadrature rules are statistically biased. that's okay though - the expected value of biased estimators *do* converge to the true expectation as you crank the number of strata to infinity (e.g. $\Delta x \to 0$).
A "less-biased" physics simulator ought to incorporate random time steps (for instance, sampled between 0 and MAX_TIME_STEP). To reduce variance, one might generate multiple samples, compute a nonlinear update, then average out the integration.
Fairly obvious in hindsight, but I found it interesting to think about the relationship between stratified sampling and deterministic quadrature :)
I was just trying, to sum up, my idea of physics simulation and now this. After reading this, I got me more confused and I thought I have already an ample idea about the simulation. Simulation and summation are categorically contrasted in my point of view.
ReplyDeletehttp://www.lordofpapers.com/term-papers-help