The ensemble Kalman filter and smoother, part II

Outline

  • The following topics will be covered in this lecture:
    • The iterative ensemble Kalman smoother
    • The single-iteration ensemble Kalman smoother

Motivation

  • In the last lecture, we saw how the ensemble Kalman filter (EnKF) can be used to propagate covariance estimates through the nonlinear model, while making the linear-Gaussian approximation at first order.

  • Rather than analytically computing a large covariance matrix and its evolution under the tangent-linear approximation, a sample is evolved through the nonlinear model and a sample-based estimate is formed for the covariance.

  • This sample-based covariance thus forms the background weights for the optimization of the nonlinear filtering cost function (see the sketch at the end of this slide).

  • This can be extended, like the smoothing problem in 4D-VAR, to a global analysis over a time series.

    • Such an approach is often known as an ensemble-variational (EnVAR) technique, in which a 4D iterative optimization is performed over the initial condition,
    • but the weights and the estimate are constructed with the ensemble.
  • This approach can thus be considered to extend 4D-VAR to include a time-varying background covariance based on the ensemble estimates.

    • Often, however, due to the small ensemble size that is feasible to simulate with the nonlinear model, the ensemble-based background covariance is interpolated with a climatological covariance to regularize the problem.
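
  • As a point of reference, the sample-based estimate above can be written in a few lines of NumPy; the following is a minimal sketch, assuming only that the ensemble is stored as a matrix whose columns are model states (all names are illustrative, not from any particular library).

```python
import numpy as np

def ensemble_stats(E):
    """Sample mean and covariance from an ensemble matrix E of shape (Nx, Ne)."""
    Ne = E.shape[1]
    x_hat = E.mean(axis=1, keepdims=True)    # ensemble mean, shape (Nx, 1)
    X = (E - x_hat) / np.sqrt(Ne - 1)        # normalized anomaly (perturbation) matrix
    P = X @ X.T                              # sample-based covariance estimate
    return x_hat, X, P
```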

Hybrid EnVAR in the EnKF analysis

  • The ensemble-variational approach forms the basis of the iterative ensemble Kalman filter / smoother (IEnKF/S).

  • This technique seeks to perform an ensemble analysis, like the square root ETKF, by defining the ensemble estimates and the weight vector in the ensemble span; a sketch of evaluating the resulting cost function is given at the end of this slide:

    \[ \begin{alignat}{2} & & {\color{#d95f02} {\widetilde{\mathcal{J}} (\pmb{w})} } &= {\color{#d95f02} {\frac{1}{2} \parallel \hat{\pmb{x}}_{0|L-S}^\mathrm{smth} - \left( \hat{\pmb{x}}^\mathrm{smth}_{0|L-S} + \mathbf{X}^\mathrm{smth}_{0|L-S} \pmb{w} \right) \parallel_{\mathbf{P}^\mathrm{smth}_{0|L-S}}^2} } + {\color{#7570b3} {\sum_{k=L-S+1}^L \frac{1}{2} \parallel \pmb{y}_k - \mathcal{H}_k\circ {\color{#1b9e77} { \mathcal{M}_{k:1}\left( {\color{#d95f02} { \hat{\pmb{x}}^\mathrm{smth}_{0|L-S} + \mathbf{X}^\mathrm{smth}_{0|L-S} \pmb{w} } } \right)}}\parallel_{\mathbf{R}_k}^2 } }\\ \Leftrightarrow & & {\color{#d95f02} {\widetilde{\mathcal{J}} (\pmb{w})} } &= {\color{#d95f02} { \frac{1}{2} \parallel \pmb{w}\parallel^2} } + {\color{#7570b3} {\sum_{k=L-S+1}^L \frac{1}{2} \parallel \pmb{y}_k - \mathcal{H}_k\circ {\color{#1b9e77} { \mathcal{M}_{k:1}\left( {\color{#d95f02} { \hat{\pmb{x}}^\mathrm{smth}_{0|L-S} + \mathbf{X}^\mathrm{smth}_{0|L-S} \pmb{w} } } \right)}}\parallel_{\mathbf{R}_k}^2 } } \end{alignat} \]

  • The cost is measured as the discrepancy of the observations from the nonlinear evolution of the perturbed ensemble mean,

    \[ \begin{align} \hat{\pmb{x}}^\mathrm{smth}_{0|L-S} + \mathbf{X}^\mathrm{smth}_{0|L-S} \pmb{w} \end{align} \] combined with the size of the perturbation relative to the ensemble spread.

  • The key, once again, is how the gradient of the above cost function is computed.
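
  • As a concrete illustration, the following is a hedged sketch of evaluating the ensemble-space cost function above, assuming for simplicity that every time in the window is observed and that `model` advances a state by one observation interval; all names are hypothetical stand-ins rather than a reference implementation.

```python
import numpy as np

def ienks_cost(w, x_hat, X, model, obs_ops, obs_err_inv, obs):
    """Evaluate the 4D ensemble-variational cost function J~(w).

    x_hat       -- smoothed ensemble mean at time 0, shape (Nx,)
    X           -- smoothed ensemble anomaly matrix, shape (Nx, Ne)
    model       -- nonlinear map advancing a state one observation interval
    obs_ops     -- sequence of observation operators H_k
    obs_err_inv -- sequence of inverse error covariances R_k^{-1}
    obs         -- sequence of observations y_k
    """
    J = 0.5 * w @ w                  # background term in ensemble coordinates
    x = x_hat + X @ w                # iterate reconstructed in the ensemble span
    for H, R_inv, y in zip(obs_ops, obs_err_inv, obs):
        x = model(x)                 # nonlinear evolution toward time k
        d = y - H(x)                 # innovation y_k - H_k(M_{k:1}(x_hat + X w))
        J += 0.5 * d @ R_inv @ d     # observation term weighted by R_k^{-1}
    return J
```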

Hybrid EnVAR in the EnKF analysis

  • The gradient of the ensemble-based cost function is given by,

    \[ \begin{align} {\color{#d95f02} {\nabla_{\pmb{w}} \widetilde{\mathcal{J}} } }:= {\color{#d95f02} {\pmb{w}}} - {\color{#7570b3} {\sum_{k=L-S+1}^L \widetilde{\mathbf{Y}}_k^\top \mathbf{R}^{-1}_k\left[\pmb{y}_k - \mathcal{H}_k \circ {\color{#1b9e77} { \mathcal{M}_{k:1}\left({\color{#d95f02} {\hat{\pmb{x}}_{0|L-S}^\mathrm{smth} + \mathbf{X}_{0|L-S}^\mathrm{smth} \pmb{w}} }\right) } } \right]}}, \end{align} \]

  • where \( {\color{#7570b3} { \widetilde{\mathbf{Y}}_k } } \) represents a directional derivative of the observation and state models,

    \[ \begin{align} {\color{#7570b3} { \widetilde{\mathbf{Y}}_k } }:= {\color{#d95f02} {\nabla\vert_{\hat{\pmb{x}}^\mathrm{smth}_{0|L-S}} } } {\color{#7570b3} {\left[\mathcal{H}_k \circ {\color{#1b9e77} {\mathcal{M}_{k:1} } } \right] } } {\color{#d95f02} {\mathbf{X}^\mathrm{smth}_{0|L-S}} }. \end{align} \]

  • In order to avoid constructing the tangent-linear and adjoint models, the “bundle” version explicitly approximates this directional derivative with finite differences of the ensemble

    \[ \begin{align} {\color{#7570b3} { \widetilde{\mathbf{Y}}_k } }\approx& {\color{#7570b3} { \frac{1}{\epsilon} \mathcal{H}_k \circ {\color{#1b9e77} {\mathcal{M}_{k:1} \left( {\color{#d95f02} { \hat{\pmb{x}}_{0|L-S}^\mathrm{smth} \pmb{1}^\top + \epsilon \mathbf{X}_{0|L-S}^\mathrm{smth} } }\right) } } \left(\mathbf{I}_{N_e} - \pmb{1}\pmb{1}^\top / N_e \right)} }, \end{align} \] for a small constant \( \epsilon \).

  • The scheme produces an iterative estimate using a Gauss-Newton or, e.g., Levenberg-Marquardt optimization; a sketch of one such iteration is given at the end of this slide.

  • A similar scheme, used more commonly in reservoir modeling, is the ensemble randomized maximum likelihood estimator (EnRML).
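
  • Putting the pieces together, the following is a hedged sketch of one Gauss-Newton iteration using the bundle approximation, under the same hypothetical interfaces as the cost-function sketch; the matrix \( \mathbf{I} + \sum_k \widetilde{\mathbf{Y}}_k^\top \mathbf{R}_k^{-1} \widetilde{\mathbf{Y}}_k \) is the standard Gauss-Newton approximation of the Hessian for this least-squares form.

```python
import numpy as np

def bundle_gauss_newton_step(w, x_hat, X, model, obs_ops, obs_err_inv, obs, eps=1e-4):
    """One Gauss-Newton update of w, with Y~_k built from bundle finite differences."""
    Ne = X.shape[1]
    grad = w.copy()                               # gradient of the background term
    hess = np.eye(Ne)                             # I + sum_k Y~_k^T R_k^-1 Y~_k
    x0 = x_hat + X @ w                            # current iterate of the mean
    E = x0[:, None] + eps * X                     # bundle of epsilon-scaled perturbations
    for H, R_inv, y in zip(obs_ops, obs_err_inv, obs):
        E = np.apply_along_axis(model, 0, E)      # propagate each bundle member
        Z = np.apply_along_axis(H, 0, E)          # map members through the obs operator
        z_bar = Z.mean(axis=1)                    # bundle mean ~ H_k(M_{k:1}(x0)) for small eps
        Y = (Z - z_bar[:, None]) / eps            # finite-difference approximation of Y~_k
        d = y - z_bar                             # innovation at the current iterate
        grad -= Y.T @ (R_inv @ d)                 # accumulate the gradient
        hess += Y.T @ R_inv @ Y                   # accumulate the Gauss-Newton Hessian
    return w - np.linalg.solve(hess, grad)        # Gauss-Newton update of the weights
```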

The single-iteration ensemble transform Kalman smoother (SIEnKS)

  • While accuracy increases with iterations in the 4D-MAP estimate, every iteration comes at the cost of the model forecast \( {\color{#1b9e77} { \mathcal{M}_{L:1} } } \).

  • In synoptic meteorology, the linear-Gaussian approximation of the evolution of the densities is in fact adequate;

    • iterating over the nonlinear dynamics may not be justified by the improvement in the forecast statistics.
  • However, the iterative optimization over a nonlinear observation operator \( \mathcal{H}_k \) or hyper-parameters in the filtering step of the classical EnKS can be run without the additional cost of model forecasts.

    • This can be performed similarly to the IEnKS with the maximum likelihood ensemble filter (MLEF) analysis.
  • Subsequently, the retrospective analysis, in the form of the filtering right-transform, can be applied to condition the initial ensemble (see the sketch at the end of this slide)

    \[ \begin{align} \mathbf{E}^\mathrm{smth}_{0|L} = \mathbf{E}_{0|L-1}^\mathrm{smth} \boldsymbol{\Psi}_L. \end{align} \]

  • As with the 4D cost function, one can initialize the next DA cycle in terms of the retrospective analysis, and gain the benefit of the improved initial estimate.

  • This scheme is the single-iteration ensemble Kalman smoother (SIEnKS).
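
  • The following is a minimal sketch of this retrospective analysis, assuming that the filtering analysis at time \( L \) has already produced the right-transform \( \boldsymbol{\Psi}_L \) (e.g., from an ETKF- or MLEF-style update); the same transform conditions every lagged ensemble stored over the window.

```python
def retrospective_analysis(E_lagged, Psi):
    """Condition the stored smoothed ensembles on the newest observation.

    E_lagged -- list of ensemble matrices E^smth_{k|L-1} over the lagged window
    Psi      -- filtering right-transform from the analysis at time L
    """
    return [E @ Psi for E in E_lagged]    # E^smth_{k|L} = E^smth_{k|L-1} @ Psi_L
```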

The single-iteration ensemble Kalman smoother (SIEnKS)

  • Compared to the classical EnKS, this adds an outer loop to the filtering cycle to produce the posterior analysis.
Figure: diagram of the filter observation-analysis-forecast cycle.
  • Information flows backward in time from the filtered state via the retrospective analysis.
  • This re-analyzed state becomes the initialization for the next cycle over the shifted DAW, carrying this information forward.
  • The iterative cost function is solved only in the filtering estimate, for the new observations entering the DAW.
    • Combined with the retrospective analysis, this comes without the cost of iterating the model forecast over the DAW.
  • When the tangent-linear approximation is adequate, this has been shown to be an accurate and highly efficient approach to sequential DA, as outlined in the sketch below.
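
  • Schematically, one cycle of the SIEnKS might then be organized as in the following sketch, assuming a shift of one observation time and the hypothetical helpers from the earlier slides; this is an outline, not a reference implementation.

```python
def sienks_cycle(E_window, model, filter_analysis, y_new):
    """One SIEnKS cycle over a shifted data assimilation window (DAW).

    E_window        -- list of ensemble matrices over the current window, newest last
    model           -- nonlinear forecast over one observation interval, applied
                       column-wise to an ensemble matrix
    filter_analysis -- filtering update (e.g., an ETKF/MLEF step) returning the
                       analyzed ensemble and its right-transform Psi
    y_new           -- the new observation entering the DAW
    """
    # 1. A single forecast of the newest ensemble to the incoming observation time.
    E_fcst = model(E_window[-1])

    # 2. Filtering analysis of the new observation; iterative only over the
    #    observation operator, with no further model forecasts required.
    E_filt, Psi = filter_analysis(E_fcst, y_new)

    # 3. Retrospective analysis: condition the lagged states in the window.
    E_window = [E @ Psi for E in E_window]

    # 4. Shift the DAW: drop the oldest state and append the new filtered
    #    ensemble, so the re-analyzed states initialize the next cycle.
    return E_window[1:] + [E_filt]
```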