We saw in the last lecture how we can formally extend the linear-Gaussian model for data assimilation into a nonlinear system.
The primary difference in how these estimators perform lies in how they treat the background weights in a least-squares-style optimization.
3D-VAR can be viewed as a recursive least-squares estimate in which the model state is taken as a random draw from the invariant, long-time-average statistics of the dynamics.
The extended Kalman filter seeks to include this time-dependent information by making a first-order approximation of the evolution of the background covariance in time.
While this approximation can be very successful,
the extended Kalman filter has typically not seen widespread use, due to the numerical cost and stability issues of the estimator.
Another classic approach to extend linear-Gaussian methods to nonlinear estimation follows the motivation of 3D-VAR.
Rather than fitting the model state to the data relative to the time-independent, long-time background weights, 4D-VAR optimizes an initial condition so that the resulting model trajectory fits a time series of observations.
This follows directly from formally extending the 4D-smoothing cost function of the linear-Gaussian analysis with a locally linearized, quadratic cost function approximating nonlinear least-squares.
4D-VAR is one of the most important scalable data assimilation algorithms due to its strong performance, and it forms the basis of many widely used operational data assimilation systems.
4D-VAR refers to extending the “three dimensional” state-space cost function to include the time variable, and performing a global analysis of a time series to optimize an initial condition.
Recall that when we introduced the extended Kalman filter cost function, we derived this by the local linearization of the nonlinear cost function:
\[ \begin{align}
\mathcal{J}_{\mathrm{EKF}}(\pmb{w}) &:=\frac{1}{2}\parallel \pmb{w}\parallel^2 + \frac{1}{2}\parallel \pmb{y} - \mathcal{H}\left(\overline{\pmb{x}}_k^\mathrm{fore}\right) - \mathbf{H}_k \boldsymbol{\Sigma}_k^\mathrm{fore}\pmb{w} \parallel_{\mathbf{R}_k}^2,\\
\end{align} \]
which is actually quadratic in \( \pmb{w} \), as \( \mathcal{H}\left(\overline{\pmb{x}}_k^\mathrm{fore}\right) \) is a constant with respect to the optimization.
Therefore, this represents a fully linearized system, performing an approximate conditional Gaussian analysis in the space of perturbations.
If we take a constant \( \mathbf{B}_0 \) as with 3D-VAR, and define the matrix factor again
\[ \begin{align} \mathbf{B}_0 := \boldsymbol{\Sigma}_0\boldsymbol{\Sigma}_0^\top & & \pmb{x}_0 := \overline{\pmb{x}}_0 + \boldsymbol{\Sigma}_{0}\pmb{w} \end{align} \]
we can apply the same tangent-linear approximation as with the extended Kalman filter to optimize the initial state versus a time series of observations globally at-once.
Making the approximation of the tangent-linear model,
\[ \begin{align} &\frac{\mathrm{d}}{\mathrm{d}t} \pmb{x} \approx \pmb{f}(\overline{\pmb{x}}) + \nabla_{\pmb{x}}\pmb{f}(\overline{\pmb{x}})\pmb{\delta}\\ \\ \Rightarrow&\int_{t_{k-1}}^{t_k}\frac{\mathrm{d}}{\mathrm{d}t}\pmb{x}\mathrm{d}t \approx \int_{t_{k-1}}^{t_k} \pmb{f}(\overline{\pmb{x}}) \mathrm{d}t + \int_{t_{k-1}}^{t_k}\nabla_{\pmb{x}}\pmb{f}(\overline{\pmb{x}})\pmb{\delta}\mathrm{d}t \\ \\ \Rightarrow & \pmb{x}_{k} \approx \mathcal{M}_{k}\left(\overline{\pmb{x}}_{k-1}\right) + \mathbf{M}_k\pmb{\delta}_{k-1} \end{align} \] where \( \mathbf{M}_k \) is the resolvent of the tangent-linear model.
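For concreteness, the resolvent of the tangent-linear model can be approximated numerically by integrating the tangent-linear equation jointly with the nonlinear reference solution. The following is a minimal Python sketch of this idea; the function names, the explicit Euler discretization, and the Lorenz-63 example are illustrative assumptions, not part of the lecture:

```python
import numpy as np

def tangent_linear_resolvent(f, jac_f, x, dt, n_steps):
    """Approximate the resolvent M of the tangent-linear model over n_steps of
    length dt by integrating it jointly with the nonlinear model (Euler scheme).

    f     -- the vector field f(x) of the nonlinear model
    jac_f -- the Jacobian of f evaluated at x
    x     -- the reference (background) state the linearization follows
    """
    M = np.eye(x.size)                 # resolvent starts at the identity
    for _ in range(n_steps):
        M = M + dt * (jac_f(x) @ M)    # tangent-linear step along the reference solution
        x = x + dt * f(x)              # nonlinear step of the reference solution
    return x, M

# Illustrative use with the Lorenz-63 equations (standard parameter values):
def lorenz63(x, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    return np.array([sigma * (x[1] - x[0]),
                     x[0] * (rho - x[2]) - x[1],
                     x[0] * x[1] - beta * x[2]])

def lorenz63_jacobian(x, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    return np.array([[-sigma, sigma, 0.0],
                     [rho - x[2], -1.0, -x[0]],
                     [x[1], x[0], -beta]])

x_next, M_k = tangent_linear_resolvent(lorenz63, lorenz63_jacobian,
                                       np.ones(3), dt=0.001, n_steps=100)
```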
Because Gaussians are closed under affine transformations, we can approximate the distribution of the state evolved under the tangent-linear model as
\[ \begin{align} \pmb{x}_{k} \sim N\left(\mathcal{M}_k\left(\overline{\pmb{x}}_{k-1}\right), \mathbf{M}_k \mathbf{B}_{k-1}\mathbf{M}^\top_{k}\right) \end{align} \]
Therefore, the 4D-quadratic cost function is approximated by an incremental linearization along the background mean
\[ \begin{alignat}{2} & & {\color{#d95f02} {\mathcal{J} (\pmb{w})} } &= {\color{#d95f02} {\frac{1}{2} \parallel \overline{\pmb{x}}_0 - \overline{\pmb{x}}_0 - \boldsymbol{\Sigma}_0 \pmb{w} \parallel^2_{\mathbf{B}_0}} } + {\color{#7570b3} {\sum_{k=1}^L \frac{1}{2} \parallel \pmb{y}_k - \mathcal{H}_k\circ {\color{#1b9e77} { \mathcal{M}_{k:1} \left( {\color{#d95f02} {\overline{\pmb{x}}_{0} } }\right)}} - \mathbf{H}_k {\color{#1b9e77} {\mathbf{M}_{k:1}}} {\color{#d95f02} {\boldsymbol{\Sigma}_{0} \pmb{w} } } \parallel_{\mathbf{R}_k}^2 } } \end{alignat} \] describing an approximate linear-Gaussian model / cost function in the space of perturbations.
The incremental 4D-VAR cost function from the last slide is composed as follows:
\[ \begin{alignat}{2} & & {\color{#d95f02} {\mathcal{J} (\pmb{w})} } &= {\color{#d95f02} {\frac{1}{2} \parallel \overline{\pmb{x}}_0 - \overline{\pmb{x}}_0 - \boldsymbol{\Sigma}_0 \pmb{w} \parallel^2_{\mathbf{B}_0}} } + {\color{#7570b3} {\sum_{k=1}^L \frac{1}{2} \parallel \pmb{y}_k - \mathcal{H}_k\circ {\color{#1b9e77} { \mathcal{M}_{k:1} \left( {\color{#d95f02} {\overline{\pmb{x}}_{0} } }\right)}} - \mathbf{H}_k {\color{#1b9e77} {\mathbf{M}_{k:1}}} {\color{#d95f02} {\boldsymbol{\Sigma}_{0} \pmb{w} } } \parallel_{\mathbf{R}_k}^2 } }\\ & & &= {\color{#d95f02} {\frac{1}{2} \parallel \pmb{w} \parallel^2} } + {\color{#7570b3} {\sum_{k=1}^L \frac{1}{2} \parallel \pmb{y}_k - \mathcal{H}_k\circ {\color{#1b9e77} { \mathcal{M}_{k:1} \left( {\color{#d95f02} {\overline{\pmb{x}}_{0} } }\right)}} - \mathbf{H}_k {\color{#1b9e77} {\mathbf{M}_{k:1}}} {\color{#d95f02} {\boldsymbol{\Sigma}_{0} \pmb{w} } } \parallel_{\mathbf{R}_k}^2 } } \end{alignat} \] where the first term penalizes the perturbation weights relative to the background, the sum penalizes the innovations of the observations against the incrementally linearized evolution of the background mean, and the second equality follows from the definition of the matrix factor \( \mathbf{B}_0 := \boldsymbol{\Sigma}_0 \boldsymbol{\Sigma}_0^\top \).
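As a minimal numerical sketch, the incremental cost function above can be evaluated directly in the space of perturbations once the nonlinear innovations and the resolvents have been computed; all names below are illustrative assumptions:

```python
import numpy as np

def incremental_cost(w, Sigma0, innovations, M_list, H_list, R_list):
    """Evaluate the incremental 4D-VAR cost function in perturbation space.

    w           -- perturbation weights
    Sigma0      -- matrix factor of the background covariance, B0 = Sigma0 @ Sigma0.T
    innovations -- list of d_k = y_k - H_k(M_{k:1}(x0_bar)), the nonlinear innovations
    M_list      -- list of resolvents M_{k:1} of the tangent-linear model
    H_list      -- list of linearized observation operators H_k
    R_list      -- list of observation error covariances R_k
    """
    cost = 0.5 * w @ w                                   # background term, ||w||^2 / 2
    for d_k, M_k, H_k, R_k in zip(innovations, M_list, H_list, R_list):
        r_k = d_k - H_k @ M_k @ Sigma0 @ w               # linearized residual at time k
        cost += 0.5 * r_k @ np.linalg.solve(R_k, r_k)    # observation term in the R_k norm
    return cost
```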
This extends the locally quadratic objective function derived earlier to include the derivative of the dynamical model with respect to the model state.
This is precisely where the adjoint is used to compute the gradient, as discussed with variational least squares.
The incremental linearization along the background provides the approximation of the gradient with the adjoint.
In particular, the adjoint variables are defined by a backward-in-time solution to the linear equation
\[ \begin{align} \frac{\mathrm{d}}{\mathrm{d}t} \tilde{\pmb{\delta}} = -\left(\nabla_{\pmb{x}} \pmb{f}(\overline{\pmb{x}})\right)^\top \tilde{\pmb{\delta}}, \end{align} \] with the underlying dependence on the nonlinear solution over the time interval.
Therefore, in incremental 4D-VAR, one constructs the gradient of the objective function, using the derivative of the nonlinear model, as
\[ \begin{align} \nabla_{\pmb{w}} \mathcal{J} = \pmb{w} - \sum_{k=1}^L \boldsymbol{\Sigma}_0^\top \mathbf{M}_{k:1}^\top \mathbf{H}_k^\top \mathbf{R}_k^{-1}\left(\pmb{y}_k - \mathcal{H}_k\circ \mathcal{M}_{k:1}\left(\overline{\pmb{x}}_{0}\right) - \mathbf{H}_k \mathbf{M}_{k:1}\boldsymbol{\Sigma}_{0}\pmb{w}\right), \end{align} \] where the products with \( \mathbf{M}_{k:1}^\top \) are computed by the backward-in-time solution of the adjoint model rather than by forming the resolvent matrices explicitly.
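The following minimal Python sketch illustrates the structure of this backward accumulation, here with explicit one-step resolvent matrices for simplicity (all names are illustrative assumptions); applying their transposes in reverse order plays the role of the adjoint model:

```python
import numpy as np

def incremental_gradient(w, Sigma0, innovations, M_steps, H_list, R_list):
    """Gradient of the incremental cost function, accumulated with the adjoint.

    M_steps     -- one-step resolvents M_k, so that M_{k:1} = M_k ... M_1;
                   multiplying by M_k.T is one backward step of the adjoint model
    innovations -- nonlinear innovations d_k = y_k - H_k(M_{k:1}(x0_bar))
    """
    L = len(innovations)
    # Forward sweep: linearized residuals r_k = d_k - H_k M_{k:1} Sigma0 w.
    dx = Sigma0 @ w
    residuals = []
    for k in range(L):
        dx = M_steps[k] @ dx
        residuals.append(innovations[k] - H_list[k] @ dx)
    # Backward (adjoint) sweep: lam accumulates sum_k M_{k:1}^T H_k^T R_k^{-1} r_k.
    lam = np.zeros(Sigma0.shape[0])
    for k in reversed(range(L)):
        forcing = H_list[k].T @ np.linalg.solve(R_list[k], residuals[k])
        lam = M_steps[k].T @ (lam + forcing)
    return w - Sigma0.T @ lam
```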
This is a very effective and efficient solution, but it relies on the construction of the tangent-linear and adjoint models for the dynamics.
The above discussion is the basis of the traditional incremental 4D-VAR, though modern formulations of 4D-VAR seek to include the effect of model error in the estimation.
The standard approach to include the dependence of model error is known as weak-constraint 4D-VAR.
The idea behind weak-constraint 4D-VAR is to use the form of the hidden Markov model with additive noise to allow for a non-exact model evolution.
In particular, we consider that
\[ \begin{align} & \pmb{x}_k = \mathcal{M}_k(\pmb{x}_{k-1}) + \pmb{w}_k \\ \Leftrightarrow & \pmb{x}_k - \mathcal{M}_{k}(\pmb{x}_{k-1}) = \pmb{w}_k. \end{align} \]
If we again suppose a linear-Gaussian approximation is appropriate at first order, we can say that the transition density is given by
\[ \begin{align} p\left(\pmb{x}_k | \pmb{x}_{k-1}\right) = N\left(\pmb{x}_k - \mathcal{M}_k (\pmb{x}_{k-1}) | \pmb{0}, \mathbf{Q}_k \right), \end{align} \] where the above refers to the Gaussian density with mean zero and covariance \( \mathbf{Q}_k \).
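Spelling this out, the negative logarithm of this transition density (up to an additive constant) is exactly the model-error penalty that appears in the weak-constraint objective function below,
\[ \begin{align} -\log p\left(\pmb{x}_k | \pmb{x}_{k-1}\right) = \frac{1}{2}\parallel \pmb{x}_k - \mathcal{M}_k\left(\pmb{x}_{k-1}\right) \parallel^2_{\mathbf{Q}_k} + \mathrm{constant}. \end{align} \]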
The fully nonlinear weak-constraint 4D-VAR objective function is then given as,
\[ \begin{align} \mathcal{J}(\pmb{x}_{L:0}) := \frac{1}{2}\parallel \overline{\pmb{x}}_0 - \pmb{x}_0 \parallel^2_{\mathbf{B}_0} + \frac{1}{2} \sum_{k=1}^L \left\{ \parallel \pmb{x}_k - \mathcal{M}_k(\pmb{x}_{k-1}) \parallel^2_{\mathbf{Q}_k} + \parallel \pmb{y}_k - \mathcal{H}_k(\pmb{x}_k) \parallel^2_{\mathbf{R}_k} \right\}, \end{align} \] where we extend the former objective function by simultaneously minimizing the differences between the evolution of a past state and the next optimized state relative to the model uncertainty.
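As a final minimal sketch, the fully nonlinear weak-constraint objective can be evaluated along a candidate trajectory as follows; the names are illustrative assumptions, and the covariance-weighted norms are taken with the inverse covariances as above:

```python
import numpy as np

def weak_constraint_cost(traj, x0_bar, B0, models, obs, H_ops, Q_list, R_list):
    """Evaluate the weak-constraint 4D-VAR cost for a candidate trajectory x_{0:L}.

    traj   -- list [x_0, x_1, ..., x_L] of candidate states
    x0_bar -- background mean for the initial state
    B0     -- background covariance
    models -- list of nonlinear maps M_k taking x_{k-1} to the forecast at time t_k
    obs    -- list of observations y_1, ..., y_L
    H_ops  -- list of (possibly nonlinear) observation operators H_k
    Q_list -- list of model-error covariances Q_k
    R_list -- list of observation-error covariances R_k
    """
    d0 = x0_bar - traj[0]
    cost = 0.5 * d0 @ np.linalg.solve(B0, d0)            # background term
    for k in range(1, len(traj)):
        dq = traj[k] - models[k - 1](traj[k - 1])        # model-error increment
        dr = obs[k - 1] - H_ops[k - 1](traj[k])          # observation innovation
        cost += 0.5 * dq @ np.linalg.solve(Q_list[k - 1], dq)
        cost += 0.5 * dr @ np.linalg.solve(R_list[k - 1], dr)
    return cost
```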
The weak-constraint cost function is likewise then typically approximated with a locally quadratic cost function via incremental linearization.