Presented at the Joint ICTP-IUGG Workshop on Data Assimilation and Inverse Problems in Geophysical Sciences

19 October, 2021

- The following topics will be covered in this lecture:
  - Observation-analysis-forecast cycles
  - Hidden Markov models and Bayesian analysis
  - Bayesian maximum-a-posteriori (MAP) analysis in Gaussian models
  - The ensemble Kalman filter and smoother (EnKF / EnKS)
  - The 4D cost function and VAR approaches
  - Sequential EnVAR estimators in the EnKF analysis:
    - the iterative ensemble Kalman smoother (IEnKS)
    - the single-iteration ensemble Kalman smoother (SIEnKS)

- In applications such as **short-range weather prediction**, data assimilation provides a means to **sequentially and recursively update forecasts** with newly incoming information.
- The **Bayesian approach to DA** has become widely adopted because it provides a **unified treatment of the tools from statistical estimation, nonlinear optimization, and machine learning** for handling such a problem.
- Let's suppose that the **dynamical physical states** can be written in a vector, \( \pmb{x}_k \in \mathbb{R}^{N_x} \), where \( k \) corresponds to some time \( t_k \).
- Abstractly, we will represent the **time-evolution** of these states with the **nonlinear map \( \mathcal{M} \)**,
  \[
  \begin{align}
  \pmb{x}_k = \mathcal{M}_{k} \left(\pmb{x}_{k-1}, \boldsymbol{\lambda}\right) + \boldsymbol{\eta}_k
  \end{align}
  \]
  where
  - \( \pmb{x}_{k-1} \) is the **vector of physical states** at an **earlier time** \( t_{k-1} \);
  - \( \boldsymbol{\lambda} \) is a vector of **uncertain physical parameters** on which the **time evolution depends**;
  - \( \boldsymbol{\eta}_k \) is an **additive, stochastic noise** term, representing **errors in our model for the physical process**.
- We will **estimate the random vector** \( {\color{#d95f02} {\pmb{x}_k } } \) with a prior distribution on \( \left(\pmb{x}_{k-1}, \boldsymbol{\lambda}\right) \), together with knowledge of \( \mathcal{M}_{k} \) and of how \( \boldsymbol{\eta}_k \) is distributed.
- At time \( t_{k-1} \), we will make a **forecast for the distribution of \( \pmb{x}_k \)** with our prior knowledge, including the physics-based model.
- For the rest of this lecture, we will restrict to the case that \( \boldsymbol{\lambda} \) is a known constant and the forecast model is perfect,
  \[
  \begin{align}
  \pmb{x}_k = \mathcal{M}_{k} \left(\pmb{x}_{k-1}\right),
  \end{align}
  \]
  for simplicity.
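As a concrete illustration of the perfect-model forecast step \( \pmb{x}_k = \mathcal{M}_k(\pmb{x}_{k-1}) \), the following minimal sketch uses the Lorenz-96 system as a stand-in for \( \mathcal{M}_k \). The Lorenz-96 model and the Euler discretization are illustrative assumptions here, not part of the lecture's notation; any nonlinear map would serve the same role.

```python
import numpy as np

def lorenz96_step(x, dt=0.05, F=8.0):
    """One Euler step of the Lorenz-96 model, a standard DA test system.
    Plays the role of the nonlinear map M_k in the perfect-model case."""
    # dx_i/dt = (x_{i+1} - x_{i-2}) * x_{i-1} - x_i + F, with cyclic indices
    dxdt = (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + F
    return x + dt * dxdt

rng = np.random.default_rng(0)
x_prev = rng.standard_normal(40)      # state x_{k-1} in R^{N_x}, N_x = 40
x_next = lorenz96_step(x_prev)        # forecast state x_k = M_k(x_{k-1})
```

In a full forecast cycle one would propagate an ensemble of such states through \( \mathcal{M}_k \) to approximate the forecast distribution.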

- Suppose that we are also given **real-world observations** \( {\color{#7570b3} {\pmb{y}_k\in\mathbb{R}^{N_y}} } \) related to the physical states by
  \[
  {\color{#7570b3} { \pmb{y}_k = \mathcal{H}_k \left(\pmb{x}_k\right) + \boldsymbol{\epsilon}_k } }
  \]
  where
  - \( {\color{#7570b3} {\mathcal{H}_k:\mathbb{R}^{N_x} \rightarrow \mathbb{R}^{N_y} } } \) is a nonlinear map relating the **states we wish to estimate \( \pmb{x}_k \)** to the **values that are observed** \( {\color{#7570b3} { \pmb{y}_k} } \).
    - Typically \( N_y \ll N_x \), so this information is sparse and the observations are not \( 1:1 \) with the unobserved states.
  - \( \boldsymbol{\epsilon}_k \) is an **additive, stochastic noise** term representing **errors in the measurements**.
- Therefore, at time \( t_k \) we will have a **forecast distribution** for the states \( \pmb{x}_k \), **generated by our prior on \( \pmb{x}_{k-1} \) and our physics-based model \( \mathcal{M} \)**.
- We will also have an **observation \( \pmb{y}_k \)** with uncertainty.
- We wish to find a **posterior distribution for \( \pmb{x}_k \)**, **conditioned on \( \pmb{y}_k \)**.
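The conditioning step can be sketched in the simplest tractable setting: a linear observation operator \( \mathcal{H}_k \) and Gaussian forecast and observation errors, where the posterior mean is the classical Kalman analysis. The dimensions, the every-fourth-component observation pattern, and the covariances below are illustrative assumptions, not values from the lecture.

```python
import numpy as np

rng = np.random.default_rng(1)
N_x, N_y = 40, 10                     # sparse observations: N_y << N_x

# Forecast (prior) for x_k: mean x_f and covariance B, assumed Gaussian
x_f = rng.standard_normal(N_x)
B = np.eye(N_x)

# Linear observation operator H_k observing every 4th state component
H = np.zeros((N_y, N_x))
H[np.arange(N_y), np.arange(0, N_x, 4)] = 1.0
R = 0.25 * np.eye(N_y)                # observation-error covariance

# Synthetic observation y_k = H x_k + eps_k
y = H @ x_f + rng.multivariate_normal(np.zeros(N_y), R)

# Gaussian MAP / Kalman analysis: posterior mean conditioned on y_k
K = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)   # Kalman gain
x_a = x_f + K @ (y - H @ x_f)                  # analysis state
```

In the nonlinear, ensemble setting covered later in the lecture, the EnKF replaces \( \mathbf{B} \) with a sample covariance estimated from the forecast ensemble.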