Use the left and right arrow keys to navigate the presentation forward and backward respectively. You can also use the arrows at the bottom right of the screen to navigate with a mouse.

FAIR USE ACT DISCLAIMER:

This site is for educational purposes only. This website may contain copyrighted material, the use of which has not been specifically authorized by the copyright holders. The material is made available on this website as a way to advance teaching, and copyright-protected materials are used to the extent necessary to make this class function in a distance learning environment. The Fair Use Copyright Disclaimer is under section 107 of the Copyright Act of 1976, allowance is made for “fair use” for purposes such as criticism, comment, news reporting, teaching, scholarship, education and research.

- The following topics will be covered in this lecture:
- The ensemble Kalman filter
- The classic ensemble Kalman smoother

We have seen now how the

**linear-Gaussian analysis**can be**extended at first order to nonlinear systems**in several classical ways.Specifically,

**3D-VAR and the extended Kalman filter**both provide a means to produce an**approximate filtering analysis in the space of perturbations**.**4D-VAR**extends the analysis of 3D-VAR, using a static “climatological” background covariance, to a**smoothing formulation over an entire time series**.The primary issue with th

**3D- / 4D-VAR**approach is that the**static background covariance**doesn't capture the spread of the forecast as this changes in time.- Rather, these approaches can be considered to predict from a persistence model, fitting the outcome to an observation or a time series of observations of the current process.

On the other hand, the

**extended Kalman**filter provides a means to**update the background covariance**, but the propagation of the covariance in the tangent-linear model is often unstable / unfeasible.An

**alternative formulation**arises if we consider the**sample-based estimates**as with the particle filter and Metropolis-Hastings.

In particular, let's recall our construction of the

**ensemble matrix**\( \mathbf{E}\in\mathbb{R}^{N_x \times N_e} \):- We will suppose that we have a random sample \( \pmb{X}_j \) following a parent distribution \( \pmb{X}\sim P \);
- The ensemble matrix is given such that \( \mathbf{E}^j = \pmb{X}_j \) for all \( j = 1,\cdots,N_e \).

Moreover, the

**sample mean**can be computed from the row-average of the ensemble matrix as\[ \hat{\pmb{X}} = \mathbf{E} \pmb{1} \frac{1}{N_e}. \]

We can thus define the

**sample covariance matrix**in a way analogously to how we define the sample mean.Particularly, if we follow the matrix multiplication with the transpose, we find that

\[ \begin{align} \mathbf{E}\pmb{1}\pmb{1}^\top \frac{1}{N_e} = \begin{pmatrix} \hat{X}_1 & \cdots & \hat{X}_{1} \\ \vdots & \ddots & \vdots \\ \hat{X}_{N_x} & \cdots &\hat{X}_{N_x} \end{pmatrix}\in\mathbb{R}^{N_x \times N_e} \end{align} \]

Particularly, this can be written column-wise as

\[ \mathbf{E}\pmb{1}\pmb{1}^\top \frac{1}{N_e} = \begin{pmatrix}\hat{\pmb{X}}, \cdots, \hat{\pmb{X}}\end{pmatrix} \]

Using element-wise subtraction with the last identity, this says that,

\[ \begin{align} \mathbf{E} - \mathbf{E}\pmb{1}\pmb{1}^\top \frac{1}{N_e} = \begin{pmatrix} X_{1,1} - \hat{X}_1 & \cdots &X_{1,n}- \hat{X}_1 \\ \vdots & \ddots & \vdots \\ X_{N_x,1} - \hat{X}_{N_X} & \cdots & X_{N_X,N_e} - \hat{X}_{N_x} \end{pmatrix} \end{align} \]

With a re-normalization, we will define the matrix of perturbations or anomalies of the ensemble about the mean.

We define the

**(normalized) anomaly matrix**of the ensemble as\[ \begin{align} \mathbf{X} :&= \left(\mathbf{E} - \mathbf{E}\pmb{1}\pmb{1}^\top \frac{1}{N_e} \right)\frac{1}{\sqrt{N_e -1}}\\ &=\mathbf{E}\left( \mathbf{I} - \pmb{1}\pmb{1}^\top \frac{1}{N_e} \right)\frac{1}{\sqrt{N_e -1}} \end{align} \]

The anomalies have the property

\[ \begin{align} \mathbf{P} :&= \mathbf{X} \mathbf{X}^\top \\ &= \mathbf{E}\left( \mathbf{I} - \pmb{1}\pmb{1}^\top \frac{1}{N_e} \right)\frac{1}{N_e -1}\left( \mathbf{I} - \pmb{1}\pmb{1}^\top \frac{1}{N_e} \right)\mathbf{E}^\top\\ &=\mathbf{E}\left( \mathbf{I} - \pmb{1}\pmb{1}^\top \frac{1}{N_e} \right)\mathbf{E}^\top\frac{1}{N_e -1} \end{align} \]

where \[ \begin{align} \mathbf{P}_{i,j} = \begin{cases} \hat{\sigma}^2_{i} &\text{ for }i=j\\ \hat{\sigma}_{i,j} &\text{ for }i\neq j \end{cases} \end{align} \]

Rather than dealing with the

**numerical challenges**of**propagating the background covariance**\( \mathbf{B}_k^\mathrm{filt} \) through the tangent-linear model to form the next \( \mathbf{B}^\mathrm{fore}_k \) as with the extended Kalman filter,- we can take the
**sampling approach**of the particle filter and**treat replicates as point volumes but with equal weights**.

- we can take the
Forming such a

**particle cloud / ensemble**this**gives an estimator for the background**\( \mathbf{P}_k \approx \mathbf{B}_k \) when the first order linear-Gaussian approximation is appropriate.However, the linear-Gaussian assumption actually leads to a

**biased estimator**, but which (by construction)**eliminates the extremely high-variance of the particle filter weights**.This approach is the basis of the

**ensemble Kalman filter (EnKF)**.

One can see the

**EnKF**to be a**hybridization**of the extended Kalman filter, variational cost function with a particle filter, using a sample-based covariance and sample-based mean estimate.The resulting

**EnKF cost function**can thus be written as\[ \begin{align} \mathcal{J}_{\mathrm{EnKF}}(\pmb{x}) := \frac{1}{2} \parallel \hat{\pmb{x}}_k^\mathrm{fore} - \pmb{x} \parallel_{\mathbf{P}_k^\mathrm{fore}}^2 + \frac{1}{2} \parallel \pmb{y}_k - \mathcal{H}(\pmb{x})\parallel_{\mathbf{R}_k}^2, \end{align} \] where we take the ensemble-based, empirical mean and covariance as

- \( \hat{\pmb{x}}_k^\mathrm{fore} := \mathbf{E}_k^\mathrm{fore}\pmb{1} / N_e \),
- \( \mathbf{P}_k^\mathrm{fore} := \left(\mathbf{X}_k^\mathrm{fore}\right)\left(\mathbf{X}_k^\mathrm{fore}\right)^\top \), and
- \( \mathbf{X}^\mathrm{fore}_k :=\mathbf{E}^\mathrm{fore}_k\left( \mathbf{I} - \pmb{1}\pmb{1}^\top \frac{1}{N_e} \right)\frac{1}{\sqrt{N_e -1}} \).

The

**columns of the ensemble matrix**are given by**propagating the sample through the nonlinear model**, so that if \( \pmb{x}_k^{i,\mathrm{filt}} \) is a replicate of the model state from the filtering density,\[ \begin{align} \pmb{x}_k^{i,\mathrm{fore}} := \mathcal{M}_k\left(\pmb{x}_k^{i,\mathrm{filt}}\right) + \pmb{w}^i_k \end{align} \]

Therefore,

**sampling the forecast density**is**performed with the fully nonlinear state space model**like the particle filter;- the key of the method is in how one efficiently can resample the (approximate) filtering density given the maximum-a-posteriori analysis.

If we

**re-write the state vector**as a**linear combination of the replicates**, we can devise this in the anomalies as\[ \begin{align} \pmb{x} := \hat{\pmb{x}}_k + \mathbf{X}_k^\mathrm{fore} \pmb{w}. \end{align} \]

Notice that \( \pmb{w}\in \mathbb{R}^{N_e} \) so that this is an optimization over the ensemble dimension.

- If \( N_e \leq N_x \), then we should note that \( \parallel \cdot \parallel_{\mathbf{P}_k} \) refers to a
**pseudo-norm with respect to the pseudo-inverse**of the anomaly matrix.

- If \( N_e \leq N_x \), then we should note that \( \parallel \cdot \parallel_{\mathbf{P}_k} \) refers to a
Revising the cost function, we can linearize the observation operator with Taylor's theorem as

\[ \begin{align} \mathcal{J}_{\mathrm{EnKF}}(\pmb{w}) := \frac{1}{2} \parallel \pmb{w} \parallel^2 + \frac{1}{2} \parallel \pmb{y}_k - \mathcal{H}\left(\hat{\pmb{x}}_k^\mathrm{fore}\right) - \mathbf{H}_k\mathbf{X}_k^\mathrm{fore} \pmb{w} \parallel_{\mathbf{R}_k}^2, \end{align} \] where we

**define the analysis**with the**linear approximation through the Hessian**\[ \begin{align} \mathbf{T}:= \mathbf{H}_{\mathcal{J}}^{-\frac{1}{2}} & & \mathbf{X}_k^\mathrm{filt} := \mathbf{X}_k^\mathrm{fore} \mathbf{T}. \end{align} \]

With the update to the anomalies defined as above, and the update to the mean defined for the optimal weights \( \overline{\pmb{w}} \) as

\[ \begin{align} \hat{\pmb{x}}^\mathrm{filt}_k := \hat{\pmb{x}}^\mathrm{fore}_k + \mathbf{X}_k^\mathrm{fore} \overline{\pmb{w}}, \end{align} \]

we can

**resample the entire ensemble**from the**approximate, best-fit Gaussian**as\[ \begin{align} \mathbf{E}_k^\mathrm{filt} := \hat{\pmb{x}}^\mathrm{filt}_k \pmb{1}^\top + \mathbf{X}_k^\mathrm{filt}\sqrt{N_e -1 }. \end{align} \]

In this formalism, we can appropriately define an

**ensemble right-transform**\( \boldsymbol{\Psi}_k \) such that for any \( t_k \),\[ \begin{align} \mathbf{E}^\mathrm{filt}_k = \mathbf{E}^\mathrm{fore}_k \boldsymbol{\Psi}_k \end{align} \] where in the above we would say that \[ \begin{align} \mathbf{E}^\mathrm{filt}_k &\sim p(\pmb{x}_k \vert \pmb{y}_{1:k}) \\ \mathbf{E}^\mathrm{fore}_k &\sim p(\pmb{x}_k \vert \pmb{y}_{1:k-1}) \end{align} \]

We will associate \( \mathbf{E}^\mathrm{filt}_k \equiv \mathbf{E}^\mathrm{smth}_{k|k} \);

- under the
**linear-Gaussian model**, we furthermore have that

\[ \begin{align} \mathbf{E}^\mathrm{smth}_{k|L} = \mathbf{E}^\mathrm{smth}_{k|L-1}\boldsymbol{\Psi}_{L} & & \mathbf{E}^\mathrm{smth}_{k|K} \sim p(\pmb{x}_k \vert \pmb{y}_{1:K}). \end{align} \]

- under the
Then we can perform a

**retrospective smoothing analysis**on all past states stored in memory by**using the latest right-transform update from the filtering step**.This form of

**retrospective analysis**is the basis of the**ensemble Kalman smoother (EnKS)**.

- The EnKS takes advantage of the simple form of the
**retrospective, right-transform analysis**by including an**additional, inner loop of the filtering cycle**.