We have now begun to introduce some of the key concepts that bridge the estimation problem to nonlinear models.
Rather than the usual Gauss-Markov model, we may generally consider a system of equations
\[ \begin{align} \pmb{x}_k &= \mathcal{M}_k (\pmb{x}_{k-1}) + \pmb{w}_k \\ \pmb{y}_k &= \mathcal{H}_k (\pmb{x}_k) + \pmb{v}_k \end{align} \] where \( \mathcal{M}_k \) is a (possibly nonlinear) process model, \( \mathcal{H}_k \) is a (possibly nonlinear) observation model, and \( \pmb{w}_k \), \( \pmb{v}_k \) are the model error and observation error respectively.
Note that even if the error distributions are Gaussian, the nonlinearity of the process model and observation model will deform the forecast and posterior distributions, so that they are generally no longer Gaussian.
While we have seen that the Gauss-Markov model can be used as an approximation within certain restrictions, we now turn to the estimation problem for the fully nonlinear model.
Recall the highly general hidden Markov model, without the simplification of the linear-Gaussian restriction.
\[ \begin{align} \pmb{x}_k &= \mathcal{M}_k (\pmb{x}_{k-1}) + \pmb{w}_k \\ \pmb{y}_k &= \mathcal{H}_k (\pmb{x}_k) + \pmb{v}_k \end{align} \]
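To fix ideas, the following is a minimal sketch of simulating such a model; the particular scalar choices of \( \mathcal{M} \), \( \mathcal{H} \), and the error variances are hypothetical and purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

# hypothetical scalar process and observation models, for illustration only
def M(x):
    return 0.5 * x + 25.0 * x / (1.0 + x**2)

def H(x):
    return x**2 / 20.0

n_steps, q, r = 100, 1.0, 1.0          # model and observation error variances
x = np.zeros(n_steps)                  # hidden states x_k
y = np.zeros(n_steps)                  # observations y_k
x[0] = rng.normal()
for k in range(1, n_steps):
    x[k] = M(x[k - 1]) + np.sqrt(q) * rng.normal()   # x_k = M_k(x_{k-1}) + w_k
    y[k] = H(x[k]) + np.sqrt(r) * rng.normal()       # y_k = H_k(x_k) + v_k
```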
It is possible to define pure Bayesian estimators for the general configuration, with very few assumptions.
These types of estimators are very robust in terms of estimating the nonlinear evolution, and can identify highly-skewed and multi-modal distributions.
The high generality of this type of estimator means that there is very little bias introduced by its construction; however, this comes at the cost of a high variance in the estimator.
This tradeoff means that in order to gain robust estimates as above, the sample size needs to be very large.
However, for systems of large dimension, the computation required by standard particle methods becomes infeasible unless additional forms of bias are introduced into the estimator.
Recall that in our discussion of the Kalman filter, we derived that
\[ \begin{align} p(\pmb{x}_k|\pmb{y}_{k:1})\propto p(\pmb{y}_k|\pmb{x}_k) p(\pmb{x}_k | \pmb{y}_{k-1:1}), \end{align} \]
so that we interpret the recursion in the following steps:
\[ \begin{align} p(\pmb{x}_{k}|\pmb{y}_{k-1:1}) = \int p(\pmb{x}_k|\pmb{x}_{k-1}) p(\pmb{x}_{k-1}|\pmb{y}_{k-1:1})\mathrm{d}\pmb{x}_{k-1} \end{align} \] is computed to obtain the forecast density for \( \pmb{x}_{k|k-1} \);
the forecast density is then weighted by the likelihood of the incoming observation,
\[ \begin{align} p(\pmb{y}_k|\pmb{x}_k) p(\pmb{x}_k | \pmb{y}_{k-1:1}); \end{align} \] and normalizing this product yields the posterior density \( p(\pmb{x}_k|\pmb{y}_{k:1}) \).
All of these steps are implicitly encoded in the Kalman filter equations for the recursive conditional mean and covariance.
However, the above derivation never actually made use of any linear-Gaussian model assumptions; the recursion holds just as well for the general hidden Markov model above.
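To illustrate, for a scalar model the forecast and update steps can be carried out numerically by discretizing the state space. The sketch below assumes Gaussian model and observation errors; the particular model functions, error variances, and grid are hypothetical choices for illustration.

```python
import numpy as np
from scipy.stats import norm

grid = np.linspace(-10.0, 10.0, 401)   # discretization of a scalar state space
dx = grid[1] - grid[0]

def forecast_density(prior, M, q):
    """Chapman-Kolmogorov step: integrate the transition density against the prior."""
    # trans[i, j] = p(x_k = grid[i] | x_{k-1} = grid[j]) under Gaussian model error
    trans = norm.pdf(grid[:, None], loc=M(grid[None, :]), scale=np.sqrt(q))
    return trans @ prior * dx

def posterior_density(fcst, y, H, r):
    """Bayes update: weight the forecast by the likelihood and renormalize."""
    likelihood = norm.pdf(y, loc=H(grid), scale=np.sqrt(r))
    unnormalized = likelihood * fcst
    return unnormalized / (unnormalized.sum() * dx)

# example usage with hypothetical model functions
M = lambda x: 0.5 * x + 2.0 * np.sin(x)
H = lambda x: x**2 / 20.0
prior = norm.pdf(grid, loc=0.0, scale=1.0)
fcst = forecast_density(prior, M, q=1.0)
post = posterior_density(fcst, y=0.8, H=H, r=0.5)
```

A direct grid evaluation like this is only practical in very low dimension, which is one reason to turn to sample-based approximations of these densities.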
Sequential importance sampling (SIS) particle filters are extremely flexible and make few assumptions on the form of the problem whatsoever; however, over many assimilation cycles the importance weights tend to degenerate, with nearly all of the weight concentrating on a single particle.
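A rough sketch of one such cycle is given below, using the transition density as the proposal (a common but not the only choice); the model functions, error variances, and seed are hypothetical illustrations.

```python
import numpy as np

rng = np.random.default_rng(1)

def sis_step(particles, weights, y, M, H, q, r):
    """One SIS cycle with the transition density used as the proposal."""
    Ne = particles.shape[0]
    # sample the forecast: push each particle through the process model plus noise
    particles = M(particles) + np.sqrt(q) * rng.normal(size=Ne)
    # reweight by the likelihood of the new observation (log-scale for stability)
    log_lik = -0.5 * (y - H(particles)) ** 2 / r
    weights = weights * np.exp(log_lik - log_lik.max())
    weights = weights / weights.sum()   # renormalize to sum to one
    return particles, weights
```

Iterating this reweighting, the normalized weights typically concentrate on fewer and fewer particles.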
This is one example of a concept more broadly known in nonlinear filtering as ensemble collapse / filter divergence.
In effect, the empirical estimate becomes overly self-certain, and will no longer be receptive to new data.
Because a single point mass cannot represent the spread of the data, the degenerate ensemble also cannot represent any of the uncertainty in the estimate.
Finding a method for handling the degeneracy of the weights is explicitly the motivation for the bootstrap particle filter, and implicitly one of the motivations for the ensemble Kalman filter.
We will return to the idea of the ensemble Kalman filter later, but for now consider the classical particle filter rectification.
The basic resampling algorithm simply applies the inverse-CDF transformation to uniform samples.
In particular, the cumulative sums of the weights \( \tilde{w}^i_k \) at any given time define an empirical CDF for the posterior.
We draw a uniform \( u \) on \( [0,1] \) and find the first index \( i \) for which the cumulative sum of the weights up through index \( i \) meets or exceeds the realized value \( u \).
This is repeated until we have \( N_e \) total ensemble members once again, all given equal weights to restart the algorithm.
Notice that assigning the map of \( u \) to the associated particle \( \pmb{x}^{i}_k \) is precisely the inverse empirical CDF map.
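A minimal sketch of this inverse-CDF (multinomial) resampling step, assuming the weights are already normalized, might read:

```python
import numpy as np

def multinomial_resample(particles, weights, rng):
    """Resample by inverting the empirical CDF defined by the weights."""
    Ne = len(weights)
    cdf = np.cumsum(weights)             # empirical CDF values c^i
    cdf[-1] = 1.0                        # guard against floating-point round-off
    u = rng.uniform(size=Ne)             # Ne independent uniforms on [0, 1]
    idx = np.searchsorted(cdf, u)        # first index i with c^i >= u
    # replicate the selected particles and reset all weights to 1/Ne
    return particles[idx], np.full(Ne, 1.0 / Ne)
```

Here np.searchsorted returns, for each uniform draw, the first index at which the cumulative weights meet or exceed the draw, which is exactly the inverse empirical CDF map described above.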
In particular, under generic convergence conditions, and in the limit of the sample size \( N_e \rightarrow \infty \), the empirical distribution of the resampled ensemble converges to the true posterior.
More commonly, the standard technique for the bootstrap particle filter is the systematic resampling algorithm.
Firstly, we draw \( u^1 \) uniform on \( [0, N_e^{-1}] \), i.e., on the restricted range up to one over the ensemble size.
The first draw selects exactly one “representative” replicate from among the particles whose cumulative weight lies in the range \( [0,N_e^{-1}] \).
From this point, a new \( u^j \) is defined as \( u^j= u^1 + N_e^{-1}(j-1) \) for \( j = 2, \ldots, N_e \), where the same replication strategy follows:
With this strategy, we are guaranteed to draw exactly one “representative” replicate among the particles for which the empirical CDF value \( c^i \) falls in the range \( [(j-1)/N_e, j/N_e] \).
In particular, \( u^j \) is uniform on \( [(j-1)/N_e, j/N_e] \), and this decides which particular particle will be replicated for this weight interval.
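Putting the pieces together, a sketch of systematic resampling under the same assumptions (normalized weights, particles stored along the first array axis) could look like:

```python
import numpy as np

def systematic_resample(particles, weights, rng):
    """Systematic resampling: one stratified draw per interval [(j-1)/Ne, j/Ne]."""
    Ne = len(weights)
    u1 = rng.uniform(0.0, 1.0 / Ne)          # u^1 uniform on [0, 1/Ne]
    u = u1 + np.arange(Ne) / Ne              # u^j = u^1 + (j-1)/Ne
    cdf = np.cumsum(weights)                 # empirical CDF values c^i
    cdf[-1] = 1.0                            # guard against round-off
    idx = np.searchsorted(cdf, u)            # first i with c^i >= u^j
    # replicate the selected particles and reset all weights to 1/Ne
    return particles[idx], np.full(Ne, 1.0 / Ne)
```

Compared with drawing \( N_e \) independent uniforms, the stratified draws reduce the additional variance introduced by the resampling step itself.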