Presented at the Joint ICTP-IUGG Workshop on Data Assimilation and Inverse Problems in Geophysical Sciences
19 October 2021
Outline
The following topics will be covered in this lecture:
Observation-analysis-forecast cycles
Hidden Markov models and Bayesian analysis
Bayesian maximum-a-posteriori (MAP) analysis in Gaussian models
The ensemble Kalman filter and smoother (EnKF / EnKS)
The 4D cost function and VAR approaches
Sequential EnVAR estimators in the EnKF analysis:
the iterative ensemble Kalman smoother (IEnKS)
the single-iteration ensemble Kalman smoother (SIEnKS)
Motivation
In applications such as short-range weather prediction, data assimilation (DA) provides a means to sequentially and recursively update forecasts with newly incoming information.
The Bayesian approach to DA has become widely adopted because it provides a unified treatment of tools from statistical estimation, nonlinear optimization, and machine learning for handling this problem.
Let’s suppose that the dynamical physical states can be written in a vector, \( \pmb{x}_k \in \mathbb{R}^{N_x} \), where \( k \) corresponds to some time \( t_k \).
Abstractly, we will represent the time-evolution of these states with the nonlinear map \( \mathcal{M} \),
\[ \begin{align}
\pmb{x}_k = \mathcal{M}_{k} \left(\pmb{x}_{k-1}, \boldsymbol{\lambda}\right) + \boldsymbol{\eta}_k,
\end{align} \]
where
\( \pmb{x}_{k-1} \) is the vector of physical states at an earlier time \( t_{k-1} \);
\( \boldsymbol{\lambda} \) is a vector of uncertain physical parameters on which the time evolution depends;
\( \boldsymbol{\eta}_k \) is an additive, stochastic noise term, representing errors in our model for the physical process.
We will estimate the random vector \( {\color{#d95f02} {\pmb{x}_k } } \) using a prior distribution on \( \left(\pmb{x}_{k-1}, \boldsymbol{\lambda}\right) \), knowledge of \( \mathcal{M}_{k} \), and knowledge of how \( \boldsymbol{\eta}_k \) is distributed.
At time \( t_{k-1} \), we will make a forecast for the distribution of \( \pmb{x}_k \) with our prior knowledge, including the physics-based model.
For the rest of this lecture, we will restrict ourselves to the case in which \( \boldsymbol{\lambda} \) is a known constant and the forecast model is perfect,
\[ \begin{align}
\pmb{x}_k = \mathcal{M}_{k} \left(\pmb{x}_{k-1}\right),
\end{align} \]
for simplicity.
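To make this setup concrete, the following is a minimal Python sketch of iterating a perfect forecast model, using the Lorenz-96 system as an illustrative stand-in for \( \mathcal{M}_k \); the model choice, step size, and dimension here are assumptions for illustration only, not part of the lecture.

```python
import numpy as np

def lorenz96(x, F=8.0):
    # Lorenz-96 tendencies: dx_i/dt = (x_{i+1} - x_{i-2}) x_{i-1} - x_i + F,
    # used here only as an illustrative stand-in for the dynamics.
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + F

def M(x, dt=0.05):
    # One application of the (perfect) forecast map M_k, discretized
    # with a fourth-order Runge-Kutta step of size dt.
    k1 = lorenz96(x)
    k2 = lorenz96(x + 0.5 * dt * k1)
    k3 = lorenz96(x + 0.5 * dt * k2)
    k4 = lorenz96(x + dt * k3)
    return x + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

rng = np.random.default_rng(0)
x = rng.standard_normal(40)  # state vector with N_x = 40
for _ in range(100):         # x_k = M_k(x_{k-1}), iterated forward in time
    x = M(x)
```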
Motivation
Suppose that we are also given real-world observations \( {\color{#7570b3} {\pmb{y}_k\in\mathbb{R}^{N_y}} } \) related to the physical states by,
\[ {\color{#7570b3} { \pmb{y}_k = \mathcal{H}_k \left(\pmb{x}_k\right) + \boldsymbol{\epsilon}_k } } \]
where
\( {\color{#7570b3} {\mathcal{H}_k:\mathbb{R}^{N_x} \rightarrow \mathbb{R}^{N_y} } } \) is a nonlinear map relating the states we wish to estimate \( \pmb{x}_k \) to the values that are observed \( {\color{#7570b3} { \pmb{y}_k} } \).
Typically \( N_y \ll N_x \), so this information is sparse and the observations are not \( 1:1 \) with the unobserved states (a small sketch of such an observation process follows this list).
\( \boldsymbol{\epsilon}_k \) is an additive, stochastic noise term representing errors in the measurements.
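Continuing the sketch above, one might generate a sparse, noisy observation as follows; the linear observation operator (sampling every other state component) and the Gaussian noise level are hypothetical choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

N_x, N_y = 40, 20               # typically N_y << N_x
H = np.eye(N_x)[::2]            # hypothetical linear H_k: observe every other component
sigma = 1.0                     # assumed observation-error standard deviation

x_k = rng.standard_normal(N_x)  # stand-in for the true state at time t_k
eps = sigma * rng.standard_normal(N_y)
y_k = H @ x_k + eps             # y_k = H_k(x_k) + epsilon_k
```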
Therefore, at time \( t_k \) we will have a forecast distribution for the states \( \pmb{x}_k \) generated by our prior on \( \pmb{x}_{k-1} \) and our physics-based model \( \mathcal{M} \).
We will also have an observation \( \pmb{y}_k \) with uncertainty.
We wish to find a posterior distribution for \( \pmb{x}_k \) conditioned on \( \pmb{y}_k \).
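By Bayes' law, this posterior combines the forecast prior with the observation likelihood,
\[ \begin{align}
p\left(\pmb{x}_k \vert \pmb{y}_k\right) \propto p\left(\pmb{y}_k \vert \pmb{x}_k\right) p\left(\pmb{x}_k\right),
\end{align} \]
where \( p\left(\pmb{x}_k\right) \) is the forecast distribution and the likelihood \( p\left(\pmb{y}_k \vert \pmb{x}_k\right) \) is determined by \( \mathcal{H}_k \) and the distribution of \( \boldsymbol{\epsilon}_k \).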