EnsembleKalmanSchemes

There are currently four families of ensemble Kalman estimators available in this package, which define the outer-loop of the data assimilation cycle. Particularly, these define how the sequential data assimilation cycle will pass over a time series of observations, with more details in the SmootherExps documents.

Ensemble filters only produce analyses forward-in-time. The classic lag-shift smoother runs identically to the filter in its forecast and filter steps, but includes an additional retrospective analysis to past ensemble states stored in memory. The single iteration smoother follows the same convention as the classic smoother, except in that new cycles are initiated from a past, reanalyzed ensemble state. The Gauss-Newton iterative smoothers are 4D smoothers, which iteratively optimize the initial condition at the beginning of a data assimilation cycle, and propagate this initial condition to initialize the subsequent cycle. A full discussion of these methods can be found in Grudzien et al. 2022.

For each outer-loop method defining the data assimilation cycle, different types of analyses can be specified within their arguments. Likewise, these outer-loop methods require arguments such as the ensemble state or the range of ensemble states to analyze, an observation to assimilate or a range of observations to assimilate, as the observation operator and observation error covariance and key word arguments for running the underlying dynamical state model. Examples of the syntax are below:

ensemble_filter(analysis::String, ens::ArView(T), obs::VecA(T), obs_cov::CovM(T),
    kwargs::StepKwargs) where T <: Float64

ls_smoother_classic(analysis::String, ens::ArView(T), obs::ArView(T), obs_cov::CovM(T),
    kwargs::StepKwargs) where T <: Float64

ls_smoother_single_iteration(analysis::String, ens::ArView(T), obs::ArView(T),
    kwargs::StepKwargs) where T <: Float64

ls_smoother_gauss_newton(analysis::String, ens::ArView(T), obs::ArView(T), obs_cov::CovM(T),
    kwargs::StepKwargs; ϵ::Float64=0.0001, tol::Float64=0.001,
    max_iter::Int64=10) where T <: Float64

with conventions defined as follows:

  • analysis - string name of the analysis scheme;
  • ens - ensemble matrix defined by the array with columns given by the replicates of the model state;
  • obs - observation vector for the current analysis in ensemble_filter / array with columns given by the observation vectors for the ordered sequence of analysis times in the current smoothing window;
  • H_obs - observation model mapping state vectors and ensembles into observed variables;
  • obs_cov - observation error covariance matrix;
  • kwargs - keyword arguments for inflation, parameter estimation or other functionality, including integration parameters for the state model in smoothing schemes.

The analysis string is passed to the DataAssimilationBenchmarks.EnsembleKalmanSchemes.transform_R or the DataAssimilationBenchmarks.EnsembleKalmanSchemes.ens_gauss_newton methods below to produce a specialized analysis within the outer-loop controlled by the above filter and smoother methods. Observations for the filter schemes correspond to information available at a single analysis time giving an observation of the state vector of type VecA. The ls (lag-shift) smoothers require an array of observations of type ArView corresponding to all analysis times within the data assimilation window (DAW). Observation covariances are typed as CovM for efficiency. State covariance multiplicative inflation and extended state parameter covariance multiplicative inflation can be specified in kwargs. Utility scripts to generate observation operators, analyze ensemble statistics, etc, are included in the below.

Methods

DataAssimilationBenchmarks.EnsembleKalmanSchemes.analyze_ensMethod
analyze_ens(ens::ArView(T), truth::VecA(T)) where T <: Float64

Computes the ensemble state RMSE as compared with truth twin, and the ensemble spread.

return rmse, spread

Note: the ensemble ens should only include the state vector components to compare with the truth twin state vector truth, without replicates of the model parameters. These can be passed as an ArView for efficient memory usage.

DataAssimilationBenchmarks.EnsembleKalmanSchemes.analyze_ens_paramMethod
analyze_ens_param(ens::ArView(T), truth::VecA(T)) where T <: Float64

Computes the ensemble parameter RMSE as compared with truth twin, and the ensemble spread.

return rmse, spread

Note: the ensemble ens should only include the extended state vector components consisting of model parameter replicates to compare with the truth twin's governing model parameters truth. These can be passed as an ArView for efficient memory usage.

DataAssimilationBenchmarks.EnsembleKalmanSchemes.ens_gauss_newtonMethod
ens_gauss_newton(analysis::String, ens::ArView(T), obs::VecA(T),
                 H_obs::Function, obs_cov::CovM(T), kwargs::StepKwargs;
                 conditioning::ConM(T)=1000.0I,
                 m_err::ArView(T)=(1.0 ./ zeros(1,1)),
                 tol::Float64 = 0.0001,
                 j_max::Int64=40,
                 Q::CovM(T)=1.0I) where T <: Float64

Computes the ensemble estimated gradient and Hessian terms for nonlinear least-squares

return ∇_J, Hess_J

m_err, tol, j_max, Q are optional arguments depending on the analysis, with default values provided.

Serves as an auxilliary function for IEnKS(-N), where "analysis" is a string which determines the method of transform update ensemble Gauss-Newton calculation. The observation error covariance obs_cov is of type CovM, the conditioning matrix conditioning is of type ConM, the keyword arguments dictionary kwargs is of type StepKwargs and the model error covariance matrix Q is of type CovM.

Currently validated analysis options:

  • analysis == "ienks-bundle" || "ienks-n-bundle" || "ienks-transform" || "ienks-n-transform" computes the weighted observed anomalies as per the bundle or transform version of the IEnKS, described in Bocquet et al. 2013, Grudzien et al. 2022. Bundle versus tranform versions of the scheme are specified by the trailing analysis string as -bundle or -transform. The bundle version uses a small uniform scalar ϵ, whereas the transform version uses a matrix square root inverse as the conditioning operator. This form of analysis differs from other schemes by returning a sequential-in-time value for the cost function gradient and Hessian, which will is utilized within the iterative smoother optimization. A finite-size inflation scheme, based on the EnKF-N above, can be utilized by appending additionally a -n to the -bundle or -transform version of the IEnKS scheme specified in analysis.
DataAssimilationBenchmarks.EnsembleKalmanSchemes.ens_update_RT!Method
ens_update_RT!(ens::ArView(T), transform::TransM(T)) where T <: Float64

Updates forecast ensemble to the analysis ensemble by right transform (RT) method.

return ens

Arguments include the ensemble of type ArView and the 3-tuple including the right transform for the anomalies, the weights for the mean and the random, mean-preserving orthogonal matrix, type TransM.

DataAssimilationBenchmarks.EnsembleKalmanSchemes.ensemble_filterMethod
ensemble_filter(analysis::String, ens::ArView(T), obs::VecA(T), H_obs::Function,
                obs_cov::CovM(T), kwargs::StepKwargs) where T <: Float64

General filter analysis step, wrapping the right transform / update, and inflation steps. Optional keyword argument includes state_dim for extended state including parameters. In this case, a value for the parameter covariance inflation should be included in addition to the state covariance inflation.

return Dict{String,Array{Float64,2}}("ens" => ens)
DataAssimilationBenchmarks.EnsembleKalmanSchemes.inflate_param!Method
inflate_param!(ens::ArView(T), inflation::Float64, sys_dim::Int64,
               state_dim::Int64) where T <: Float64

Applies multiplicative covariance inflation to parameter replicates in the ensemble matrix.

return ens

The first index of the ensemble matrix ens corresponds to the length sys_dim (extended) state dimension while the second index corresponds to the ensemble dimension. Dynamic state variables are assumed to be in the leading state_dim rows of ens, while extended state parameter replicates are after. Multiplicative inflation is performed only in the trailing state_dim + 1: state_dim components of the ensemble anomalies from the ensemble mean, in-place in memory.

DataAssimilationBenchmarks.EnsembleKalmanSchemes.inflate_state!Method
inflate_state!(ens::ArView(T), inflation::Float64, sys_dim::Int64,
               state_dim::Int64) where T <: Float64

Applies multiplicative covariance inflation to the state components of the ensemble matrix.

return ens

The first index of the ensemble matrix ens corresponds to the length sys_dim (extended) state dimension while the second index corresponds to the ensemble dimension. Dynamic state variables are assumed to be in the leading state_dim rows of ens, while extended state parameter replicates are after. Multiplicative inflation is performed only in the leading components of the ensemble anomalies from the ensemble mean, in-place in memory.

DataAssimilationBenchmarks.EnsembleKalmanSchemes.ls_smoother_classicMethod
ls_smoother_classic(analysis::String, ens::ArView(T), obs::ArView(T), H_obs::Function,
                    obs_cov::CovM(T),  kwargs::StepKwargs) where T <: Float64

Lag-shift ensemble Kalman smoother analysis step, classical version.

Classic EnKS uses the last filtered state for the forecast, different from the iterative schemes which use the once or multiple-times re-analized posterior for the initial condition for the forecast of the states to the next shift.

Optional argument includes state dimension for extended state including parameters. In this case, a value for the parameter covariance inflation should be included in addition to the state covariance inflation.

return Dict{String,Array{Float64}}(
                                   "ens" => ens,
                                   "post" =>  posterior,
                                   "fore" => forecast,
                                   "filt" => filtered
                                  )
DataAssimilationBenchmarks.EnsembleKalmanSchemes.ls_smoother_gauss_newtonMethod
ls_smoother_gauss_newton(analysis::String, ens::ArView(T), obs::ArView(T),
                         H_obs::Function, obs_cov::CovM(T),
                         kwargs::StepKwargs; ϵ::Float64=0.0001,
                         tol::Float64=0.001, max_iter::Int64=5) where T <: Float64

This implements a lag-shift Gauss-Newton IEnKS analysis step as in algorithm 4 of Bocquet et al. 2014. The IEnKS uses the final re-analyzed initial state in the data assimilation window to generate the forecast, which is subsequently pushed forward in time from the initial conidtion to shift-number of observation times. Optional argument includes state dimension for an extended state including parameters. In this case, a value for the parameter covariance inflation should be included in addition to the state covariance inflation.

return Dict{String,Array{Float64}}(
                                   "ens" => ens,
                                   "post" =>  posterior,
                                   "fore" => forecast,
                                   "filt" => filtered
                                  )
DataAssimilationBenchmarks.EnsembleKalmanSchemes.ls_smoother_single_iterationMethod
ls_smoother_single_iteration(analysis::String, ens::ArView(T),
                             H_obs::Function, obs::ArView(T), obs_cov::CovM(T),
                             kwargs::StepKwargs) where T <: Float64

Lag-shift, single-iteration ensemble Kalman smoother (SIEnKS) analysis step.

Single-iteration EnKS uses the final re-analyzed posterior initial state for the forecast, which is pushed forward in time to shift-number of observation times. Optional argument includes state dimension for an extended state including parameters. In this case, a value for the parameter covariance inflation should be included in addition to the state covariance inflation.

return Dict{String,Array{Float64}}(
                                   "ens" => ens,
                                   "post" =>  posterior,
                                   "fore" => forecast,
                                   "filt" => filtered
                                  )
DataAssimilationBenchmarks.EnsembleKalmanSchemes.square_rootMethod
square_root(M::CovM(T)) where T <: Real

Computes the square root of covariance matrices with parametric type.

Multiple dispatches for the method are defined according to the sub-type of CovM, where the square roots of UniformScaling and Diagonal covariance matrices are computed directly, while the square roots of the more general class of Symmetric covariance matrices are computed via the singular value decomposition, for stability and accuracy for close-to-singular matrices.

return S
DataAssimilationBenchmarks.EnsembleKalmanSchemes.square_root_invMethod
square_root_inv(M::CovM(T); sq_rt::Bool=false, inverse::Bool=false,
                full::Bool=false) where T <: Real

Computes the square root inverse of covariance matrices with parametric type.

Multiple dispatches for the method are defined according to the sub-type of CovM, where the square root inverses of UniformScaling and Diagonal covariance matrices are computed directly, while the square root inverses of the more general class of Symmetric covariance matrices are computed via the singular value decomposition, for stability and accuracy for close-to-singular matrices. This will optionally return a computation of the inverse and the square root itself all as a byproduct of the singular value decomposition for efficient numerical computation of ensemble analysis / update routines.

Optional keyword arguments are specified as:

  • sq_rt=true returns the matrix square root in addition to the square root inverse
  • inverse=true returns the matrix inverse in addition to the square root inverse
  • full=true returns the square root and the matrix inverse in addition to the square root inverse

and are evaluated in the above order.

Output follows control flow:

if sq_rt
    return S_inv, S
elseif inverse
    return S_inv, M_inv
elseif full
    return S_inv, S, M_inv
else
    return S_inv
end
DataAssimilationBenchmarks.EnsembleKalmanSchemes.transform_RMethod
transform_R(analysis::String, ens::ArView(T), obs::VecA(T), H_obs::Function,
            obs_cov::CovM(T), kwargs::StepKwargs; conditioning::ConM=1000.0I,
            m_err::ArView(T)=(1.0 ./ zeros(1,1)),
            tol::Float64 = 0.0001,
            j_max::Int64=40,
            Q::CovM(T)=1.0I) where T <: Float64

Computes the ensemble transform and related values for various flavors of ensemble Kalman schemes. The output type is a tuple containing a right transform of the ensemble anomalies, the weights for the mean update and a random orthogonal transformation for stabilization:

return trans, w, U

where the tuple is of type TransM. m_err, tol, j_max, Q are optional arguments depending on the analysis, with default values provided.

Serves as an auxilliary function for EnKF, ETKF(-N), EnKS, ETKS(-N), where "analysis" is a string which determines the type of transform update. The observation error covariance obs_cov is of type CovM, the conditioning matrix conditioning is of type ConM, the keyword arguments dictionary kwargs is of type StepKwargs and the model error covariance matrix Q is of type CovM.

Currently validated analysis options:

  • analysis=="etkf" || analysis=="etks" computes the deterministic ensemble transform as in the ETKF described in Grudzien et al. 2022.
  • analysis[1:7]=="mlef-ls" || analysis[1:7]=="mles-ls" computes the maximum likelihood ensemble filter transform described in Grudzien et al. 2022, optimizing the nonlinear cost function with Newton-based line searches.
  • analysis[1:4]=="mlef" || analysis[1:4]=="mles" computes the maximum likelihood ensemble filter transform described in Grudzien et al. 2022, optimizing the nonlinear cost function with simple Newton-based scheme.
  • analysis=="enkf-n-dual" || analysis=="enks-n-dual" computes the dual form of the EnKF-N transform as in Bocquet et al. 2015 Note: this cannot be used with the nonlinear observation operator. This uses the Brent method for the argmin problem as this has been more reliable at finding a global minimum than Newton optimization.
  • analysis=="enkf-n-primal" || analysis=="enks-n-primal" computes the primal form of the EnKF-N transform as in Bocquet et al. 2015, Grudzien et al. 2022. This differs from the MLEF/S-N in that there is no approximate linearization of the observation operator in the EnKF-N, this only handles the approximation error with respect to the adaptive inflation. This uses a simple Newton-based minimization of the cost function for the adaptive inflation.
  • analysis=="enkf-n-primal-ls" || analysis=="enks-n-primal-ls" computes the primal form of the EnKF-N transform as in Bocquet et al. 2015, Grudzien et al. 2022. This differs from the MLEF/S-N in that there is no approximate linearization of the observation operator in the EnKF-N, this only handles the approximation error with respect to the adaptive inflation. This uses a Newton-based minimization of the cost function for the adaptive inflation with line searches.