Elementary numerical solution to ODEs and SDEs Part II


FAIR USE ACT DISCLAIMER:
This site is for educational purposes only. This website may contain copyrighted material, the use of which has not been specifically authorized by the copyright holders. The material is made available on this website as a way to advance teaching, and copyright-protected materials are used to the extent necessary to make this class function in a distance-learning environment. This disclaimer is made under Section 107 of the Copyright Act of 1976, which makes allowance for “fair use” for purposes such as criticism, comment, news reporting, teaching, scholarship, education, and research.

Outline

  • The following topics will be covered in this lecture:
    • The second-order Taylor Scheme for autonomous ODEs
    • The four-stage Runge-Kutta scheme for ODEs
    • The four-stage Runge-Kutta scheme for SDEs
    • Discrete maps from a continuous-time model

The second-order Taylor Scheme for autonomous ODEs

  • Recall the ODE initial value problem,

    \[ \begin{align} \frac{\mathrm{d}}{\mathrm{d}t}\pmb{x} := \pmb{f}(t, \pmb{x}), & & \pmb{x}(0):= \pmb{x}_0. \end{align} \]

  • As noted before, the Euler scheme arises from the first-order Taylor expansion of the initial value problem

    \[ \begin{align} \hat{\pmb{x}}(t_{i+1}) := \hat{\pmb{x}}(t_{i}) + \pmb{f}(t_i, \hat{\pmb{x}}(t_i)) h. \end{align} \]

  • It is reasonable thus to consider taking a higher-order Taylor expansion to get a better approximation of the integral – for simplicity, let's consider the case where \( \pmb{f} \) doesn't depend on time \( t \).

  • This is what is known as an autonomous dynamical model – for an exact solution we expand at second order as

    \[ \begin{align} \pmb{x}(t_{i+1}) &= \pmb{x}(t_i) + \frac{\mathrm{d}}{\mathrm{d}t}\Big|_{t_i} \pmb{x}\, h + \frac{1}{2} \frac{\mathrm{d}^2}{\mathrm{d}t^2}\Big|_{t_i} \pmb{x}\, h^2 + \mathcal{O}\left(h^3\right)\\ &= \pmb{x}(t_i) + \pmb{f}(\pmb{x}(t_i)) h + \frac{1}{2} \frac{\mathrm{d}}{\mathrm{d}t}\Big|_{t_i} \pmb{f}(\pmb{x}(t))h^2 + \mathcal{O}\left(h^3\right)\\ &=\pmb{x}(t_i) + \pmb{f}(\pmb{x}(t_i)) h + \frac{1}{2} \left(\nabla_{\pmb{x}} \pmb{f}(\pmb{x}(t_i))\right)\pmb{f}(\pmb{x}(t_i))h^2 + \mathcal{O}\left(h^3\right) \end{align} \]

  • where the final line is obtained using the multivariate chain rule, differentiating \( \pmb{f} \) with respect to \( t \) implicitly through its dependence on \( \pmb{x} \).

The second-order Taylor Scheme for autonomous ODEs

  • Truncating the last expression at terms of second order, we obtain the second-order Taylor scheme for autonomous ODEs.
Second-order Taylor scheme for autonomous ODEs
Let \( \pmb{x}(t) \) satisfy the initial value problem where \( \pmb{f} \) is a function of \( \pmb{x} \) alone. The approximation of this solution by the second-order Taylor scheme is defined recursively as \[ \begin{align} \hat{\pmb{x}}(t_{i+1}) := \hat{\pmb{x}}(t_{i}) + \pmb{f}(\hat{\pmb{x}}(t_i)) h +\frac{1}{2} \left(\nabla_{\pmb{x}}\pmb{f}(\hat{\pmb{x}}(t_i)) \right)\pmb{f}(\hat{\pmb{x}}(t_i)) h^2. \end{align} \] This scheme has order \( 2.0 \) convergence to the true solution.
  • Once again, although the truncation error is at third order, propagation error from the approximate solution means that the total error is at second order.

  • This does make a significant improvement on the Euler scheme;

    • however, the issue is that this approach becomes increasingly complicated when \( \pmb{f} \) also depends on time, and the Jacobian may not be available in closed form.
  • Furthermore, this approach becomes extremely difficult when one uses Itô-Taylor expansions, as one then needs to resolve higher-order stochastic integrals.
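
  • As a concrete illustration of the scheme above, the following is a minimal sketch in Python, assuming an autonomous linear test model (a harmonic oscillator) whose Jacobian is known in closed form; the function names and the test problem are illustrative choices, not part of the lecture materials.

```python
import numpy as np

# A minimal sketch of the second-order Taylor scheme for an autonomous ODE.
# The test model f(x) = A x (a harmonic oscillator) is an illustrative
# choice with a Jacobian known in closed form.

A = np.array([[0.0, 1.0],
              [-1.0, 0.0]])

def f(x):
    return A @ x

def jacobian(x):
    return A  # for a linear vector field, the Jacobian is constant

def taylor2_step(x, h):
    """One step of the second-order Taylor scheme."""
    fx = f(x)
    return x + fx * h + 0.5 * (jacobian(x) @ fx) * h ** 2

# Integrate over one period [0, 2*pi]; the exact solution returns to x0.
h = 0.01
x = np.array([1.0, 0.0])
for _ in range(round(2 * np.pi / h)):
    x = taylor2_step(x, h)
print(x)  # approximately the initial condition [1, 0]
```

  • Because the vector field here is linear, the Jacobian is exact; for a general \( \pmb{f} \), one must derive or numerically approximate \( \nabla_{\pmb{x}} \pmb{f} \), which is precisely the complication noted above.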

The four-stage Runge-Kutta scheme for ODEs

  • An alternative approach that is widely used is to evaluate \( \pmb{f}(t, \pmb{x}) \) at more points, while attempting to retain the accuracy of the higher-order Taylor approximation.

  • These methods are known as Runge-Kutta methods, taking a general form of

    \[ \begin{align} \hat{\pmb{x}}(t_{i+1}) := \hat{\pmb{x}}(t_i) + h\pmb{g}(t_i, \hat{\pmb{x}}(t_i), h), \end{align} \] where \( \pmb{g} \) above is constructed as a kind of “average slope” of the solution on the interval \( [t_i, t_{i+1}] \).

  • For methods of order 2, we generally choose

    \[ \begin{align} \pmb{g}(t, \hat{\pmb{x}}, h):= b_1 \pmb{f}(t, \hat{\pmb{x}}) + b_2 \pmb{f}(t + \alpha h, \hat{\pmb{x}} + h \beta \pmb{f}(t, \hat{\pmb{x}})) \end{align} \]

  • The constant coefficients \( \{\alpha, \beta, b_1, b_2 \} \) are determined by appropriately matching terms in the Taylor expansion, such that the local truncation error satisfies

    \[ \pmb{x}(t_{i+1}) -\left[ \pmb{x}(t_i) + h\pmb{g}(t_i, \pmb{x}(t_i), h)\right] \equiv \mathcal{O}\left( h^3\right), \] where \( \pmb{x}(t) \) denotes the exact solution.
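
  • For example, one standard choice satisfying these matching conditions is \( b_1 = b_2 = \tfrac{1}{2} \) with \( \alpha = \beta = 1 \), which yields Heun's method,

    \[ \begin{align} \hat{\pmb{x}}(t_{i+1}) := \hat{\pmb{x}}(t_i) + \frac{h}{2}\left[\pmb{f}\left(t_i, \hat{\pmb{x}}(t_i)\right) + \pmb{f}\left(t_i + h, \hat{\pmb{x}}(t_i) + h \pmb{f}\left(t_i, \hat{\pmb{x}}(t_i)\right)\right)\right]. \end{align} \]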

The four-stage Runge-Kutta scheme for ODEs

  • This methodology can be generalized for ODEs to match higher-order expansions.

  • An explicit Runge-Kutta formula with \( s \) total stages has the following form:

    \[ \begin{align} \pmb{z}_1 &:= \hat{\pmb{x}}_i, \\ \pmb{z}_2 &:= \hat{\pmb{x}}_i + a_{2,1} h \pmb{f}(t_i, \pmb{z}_1), \\ \pmb{z}_3 &:= \hat{\pmb{x}}_i + h\left[a_{3,1}\pmb{f}(t_i,\pmb{z}_1)+a_{3,2}\pmb{f}(t_i + c_2 h,\pmb{z}_2)\right],\\ &\quad \vdots \\ \pmb{z}_s &:= \hat{\pmb{x}}_i + h\left[a_{s,1}\pmb{f}(t_i,\pmb{z}_1)+a_{s,2}\pmb{f}(t_i + c_2 h,\pmb{z}_2) +\cdots + a_{s,s-1}\pmb{f}(t_i+c_{s-1}h, \pmb{z}_{s-1})\right], \end{align} \]

  • where we define the next step as,

    \[ \begin{align} \hat{\pmb{x}}_{i+1} := \hat{\pmb{x}}_i + h\sum_{j=1}^s b_j \pmb{f}(t_i + c_j h,\pmb{z}_j). \end{align} \]

  • The process thus approximates the time-evolution by taking multiple sample points of the tangent line approximation in space and time.

  • These sample points \( (t_i + c_j h, \pmb{z}_j) \) are computed recursively in terms of the coefficients \( a_{l,j},c_{j} \), while the average over their slopes is weighted by the \( b_j \).
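
  • To make the recursion concrete, below is a minimal sketch in Python of a generic explicit Runge-Kutta step driven by coefficient arrays for \( a_{l,j}, b_j, c_j \); the function name and array layout are illustrative choices, and the tableau shown encodes the classical four-stage scheme discussed next.

```python
import numpy as np

# A minimal sketch of a generic explicit Runge-Kutta step from its
# coefficients: a (strictly lower-triangular), b (weights), c (nodes),
# matching the s-stage form above.

def explicit_rk_step(f, t, x, h, a, b, c):
    """One step of an explicit s-stage Runge-Kutta scheme."""
    s = len(b)
    k = np.zeros((s, len(x)))
    for j in range(s):
        z_j = x + h * (a[j, :j] @ k[:j])   # stage value z_j from prior slopes
        k[j] = f(t + c[j] * h, z_j)        # stage slope at (t + c_j h, z_j)
    return x + h * (b @ k)                 # weighted average of the slopes

# Butcher tableau of the classical four-stage scheme, as an illustration.
a = np.array([[0.0, 0.0, 0.0, 0.0],
              [0.5, 0.0, 0.0, 0.0],
              [0.0, 0.5, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])
b = np.array([1/6, 1/3, 1/3, 1/6])
c = np.array([0.0, 0.5, 0.5, 1.0])

f = lambda t, x: -x                        # simple test problem
x = explicit_rk_step(f, 0.0, np.array([1.0]), 0.1, a, b, c)
print(x)  # one step toward exp(-0.1)
```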

The four-stage Runge-Kutta scheme for ODEs

  • For orders \( \gamma \leq 4 \), we can match the global error of order \( \gamma \) with exactly \( s = \gamma \) total stages of the approximation.

    • However, for orders \( \gamma > 4 \), this begins to require \( s > \gamma \) total stages in the approximation.
  • For this reason, a popular and widely used scheme is the four-stage Runge-Kutta scheme;

    • this balances both accuracy and simplicity, and is widely applicable for generic, autonomous and non-autonomous ODEs.
Four-stage Runge-Kutta scheme for ODEs
Let \( \pmb{x}(t) \) satisfy the initial value problem above; the approximation of this solution by the four-stage Runge-Kutta scheme is defined recursively as \[ \begin{align} \pmb{\kappa}_1 &:= \pmb{f}\left(t_i, \hat{\pmb{x}}(t_i) \right)h ,\\ \pmb{\kappa}_2 &:= \pmb{f}\left(t_i + \frac{h}{2}, \hat{\pmb{x}}(t_i)+ \frac{\pmb{\kappa}_1}{2}\right) h, \\ \pmb{\kappa}_3 &:= \pmb{f}\left(t_i + \frac{h}{2},\hat{\pmb{x}}(t_i) + \frac{\pmb{\kappa}_2}{2} \right)h,\\ \pmb{\kappa}_4 &:= \pmb{f}\left(t_i + h, \hat{\pmb{x}}(t_i) + \pmb{\kappa}_3 \right)h,\\ \hat{\pmb{x}}(t_{i+1})& := \hat{\pmb{x}}(t_i) + \frac{1}{6}\left[\pmb{\kappa}_1 + 2\pmb{\kappa}_2 + 2 \pmb{\kappa}_3+ \pmb{\kappa}_4 \right]. \end{align} \] The scheme converges to the true solution with order \( \mathcal{O}\left(h^4\right) \).
  • Notice that an enormous benefit of the above four-stage scheme is that it does not require computing fourth-order Taylor expansions, which become impractical for almost all models.
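
  • Below is a minimal sketch of the scheme above in Python; the test problem \( \mathrm{d}x/\mathrm{d}t = -x \) and the function names are illustrative choices.

```python
import numpy as np

# A minimal sketch of one four-stage Runge-Kutta step, following the
# definition above; the test problem dx/dt = -x is an illustrative choice.

def rk4_step(f, t, x, h):
    """One step of the classical four-stage Runge-Kutta scheme."""
    k1 = f(t, x) * h
    k2 = f(t + h / 2, x + k1 / 2) * h
    k3 = f(t + h / 2, x + k2 / 2) * h
    k4 = f(t + h, x + k3) * h
    return x + (k1 + 2 * k2 + 2 * k3 + k4) / 6

f = lambda t, x: -x
x, t, h = np.array([1.0]), 0.0, 0.1
for _ in range(10):
    x = rk4_step(f, t, x, h)
    t += h
print(x, np.exp(-1.0))  # the RK4 estimate closely matches exp(-1)
```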

The four-stage Runge-Kutta scheme for ODEs

  • The benefits of using the four-stage Runge-Kutta scheme over the Euler scheme are huge.

  • We can generically consider the error bounds for a step size \( h \) as

    \[ \begin{align} \parallel \pmb{x}(t) - \hat{\pmb{x}}_\mathrm{RK}(t) \parallel&\leq C_\mathrm{RK} h^{4}, \\ \parallel \pmb{x}(t) - \hat{\pmb{x}}_\mathrm{E}(t) \parallel&\leq C_\mathrm{E} h, \end{align} \] for all \( t \in [0,T] \), for \( h \) sufficiently small.

  • Note, however, that floating-point arithmetic introduces additional errors, so that for \( h\ll 1 \) these bounds no longer appear linear on a \( \log_{10} \) scale of error versus step size.

  • Nonetheless, the four-stage Runge-Kutta is the recommended out-of-the-box solver for nearly any well-conditioned problem.

    • However, additional considerations apply if the equation is “stiff”, i.e., if the solution evolves on widely separated time scales, so that explicit schemes require extremely small step sizes to remain stable.
  • An additional benefit of the four-stage Runge-Kutta scheme is that, with only minor modifications, it becomes a statistically robust solver for SDEs, and works widely out-of-the-box in these systems as well.
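
  • As a rough empirical check of these orders (an illustrative experiment on the assumed test problem \( \mathrm{d}x/\mathrm{d}t = -x \)), halving the step size should reduce the Euler error by about a factor of \( 2 \) and the Runge-Kutta error by about a factor of \( 16 \):

```python
import numpy as np

# An illustrative convergence check: on the smooth test problem dx/dt = -x,
# the Euler and four-stage Runge-Kutta errors at t = 1 should scale roughly
# as h and h**4 respectively.

def euler_step(f, t, x, h):
    return x + f(t, x) * h

def rk4_step(f, t, x, h):
    k1 = f(t, x) * h
    k2 = f(t + h / 2, x + k1 / 2) * h
    k3 = f(t + h / 2, x + k2 / 2) * h
    k4 = f(t + h, x + k3) * h
    return x + (k1 + 2 * k2 + 2 * k3 + k4) / 6

f = lambda t, x: -x
for h in [0.1, 0.05, 0.025]:
    for step in (euler_step, rk4_step):
        x, t = 1.0, 0.0
        for _ in range(round(1.0 / h)):
            x = step(f, t, x, h)
            t += h
        print(step.__name__, h, abs(x - np.exp(-1.0)))
# Halving h cuts the Euler error by about 2 and the RK4 error by about 16.
```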

The four-stage Runge-Kutta scheme for SDEs

  • We recall the form of our generic SDE, this time presented in vector notation,

    \[ \begin{align} \mathrm{d}\pmb{x}:= \pmb{f}(t, \pmb{x}(t)) \mathrm{d}t + \mathbf{S}(t, \pmb{x}(t))\mathrm{d}\pmb{W}_t. \end{align} \]

Four-stage Runge-Kutta scheme for SDEs
For the system of SDEs defined above, the numerical approximation of the sample path by the four-stage Runge-Kutta scheme is defined recursively as \[ \begin{align} \pmb{\kappa}_1 &:= \pmb{f}\left(t_i, \hat{\pmb{x}}(t_i) \right)h +\mathbf{S}(t_i, \hat{\pmb{x}}(t_i)) \pmb{\xi}\sqrt{h}, \\ \pmb{\kappa}_2 &:= \pmb{f}\left(t_i + \frac{h}{2}, \hat{\pmb{x}}(t_i)+ \frac{\pmb{\kappa}_1}{2}\right) h+ \mathbf{S}\left(t_i + \frac{h}{2}, \hat{\pmb{x}}(t_i)+ \frac{\pmb{\kappa}_1}{2}\right) \pmb{\xi}\sqrt{h} , \\ \pmb{\kappa}_3 &:= \pmb{f}\left(t_i + \frac{h}{2},\hat{\pmb{x}}(t_i) + \frac{\pmb{\kappa}_2}{2} \right)h+ \mathbf{S}\left(t_i + \frac{h}{2},\hat{\pmb{x}}(t_i) + \frac{\pmb{\kappa}_2}{2} \right) \pmb{\xi}\sqrt{h} , \\ \pmb{\kappa}_4 &:= \pmb{f}\left(t_i + h, \hat{\pmb{x}}(t_i) + \pmb{\kappa}_3 \right)h + \mathbf{S}\left(t_i + h, \hat{\pmb{x}}(t_i) + \pmb{\kappa}_3 \right) \pmb{\xi}\sqrt{h}, \\ \hat{\pmb{x}}(t_{i+1})& := \hat{\pmb{x}}(t_i) + \frac{1}{6}\left[\pmb{\kappa}_1 + 2\pmb{\kappa}_2 + 2 \pmb{\kappa}_3+ \pmb{\kappa}_4 \right], \end{align} \] where \( \pmb{\xi}\sim N(\pmb{0}, \mathbf{I}) \). The scheme converges both weakly and strongly to the Stratonovich form of the true solution with order \( \mathcal{O}\left(h^{1.0}\right) \).
  • Notice that although this scheme uses four stages like the deterministic scheme, it only attains order \( 1.0 \) convergence generically.

  • This is due to the limitations of Runge-Kutta schemes in approximating the multiple (iterated) stochastic integrals that arise at higher order.
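
  • Below is a minimal sketch of the stochastic scheme above, assuming an illustrative Ornstein-Uhlenbeck test model with additive noise; note that a single draw of \( \pmb{\xi} \) is shared across all four stages of each step.

```python
import numpy as np

# A minimal sketch of the four-stage Runge-Kutta scheme for SDEs above; a
# single standard normal draw xi is shared by all four stages of each step.
# The Ornstein-Uhlenbeck test model with additive noise is an illustrative
# choice.

rng = np.random.default_rng(0)

def sde_rk4_step(f, S, t, x, h):
    """One step of the four-stage Runge-Kutta scheme for an SDE."""
    xi = rng.standard_normal(x.shape)
    sqh = np.sqrt(h)
    k1 = f(t, x) * h + S(t, x) @ xi * sqh
    k2 = f(t + h / 2, x + k1 / 2) * h + S(t + h / 2, x + k1 / 2) @ xi * sqh
    k3 = f(t + h / 2, x + k2 / 2) * h + S(t + h / 2, x + k2 / 2) @ xi * sqh
    k4 = f(t + h, x + k3) * h + S(t + h, x + k3) @ xi * sqh
    return x + (k1 + 2 * k2 + 2 * k3 + k4) / 6

f = lambda t, x: -x                      # drift
S = lambda t, x: 0.5 * np.eye(len(x))    # additive (state-independent) noise
x, t, h = np.array([1.0, 0.0]), 0.0, 0.01
for _ in range(100):
    x = sde_rk4_step(f, S, t, x, h)
    t += h
print(x)  # one realization of the sample path at t = 1
```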

The four-stage Runge-Kutta scheme for SDEs

  • Note, again for additive noise where \( \mathbf{S} \) is a function of time alone, there is no distinction between the Itô and Stratonovich forms of the SDE.

  • In this case, Euler-Maruyama also attains both weak and strong convergence of order \( 1.0 \) like the Runge-Kutta scheme.

  • It may seem then like this offers no advantage over the Euler-Maruyama scheme.

    • However, in practice, when simulating a system driven primarily by the drift term, i.e., the mechanistic process, this scheme provides much greater accuracy than the Euler-Maruyama scheme.
  • Empirically, \( C^{\mathrm{weak} / \mathrm{strong}}_\mathrm{RK} \ll C^{\mathrm{weak} / \mathrm{strong}}_\mathrm{EM} \), even by orders of magnitude.

  • This is due to the fact that the four-stage Runge-Kutta scheme reduces to a fourth-order method in the zero-noise limit (i.e., when the system becomes deterministic).

  • The differences between the Euler-Maruyama and Runge-Kutta solutions diminish when noise dominates the system, i.e., when the shocks are so large that the dynamics are primarily noise-driven.

Discrete maps from a continuous-time model

  • We now have a general means of simulating continuous-time Markov models, particularly for systems defined by a mechanistic model perturbed by additive noise.

  • This is important because, in realistic, high-dimensional and nonlinear systems, we can rarely prove formal convergence results of estimators and optimization routines.

  • Rather, we typically rely on theoretical results based on the discrete, Gauss-Markov model approximation, and we must demonstrate nonlinear convergence and stability results based on numerical test-cases.

  • It warrants understanding, then, how the continuous-time model corresponds to a discrete, Gauss-Markov model.

  • Recall for the deterministic initial value problem, we stated that a discrete model can be generated by the flow map

    \[ \begin{align} \pmb{\Phi}(t_1, \pmb{x}_0) = \pmb{x}(t_1) = \int_{t_0}^{t_1} \pmb{f}(s, \pmb{x}(s))\mathrm{d}s + \pmb{x}_0. \end{align} \]

  • In particular, we recover identically a discrete, matrix-valued map when \( \pmb{f}(t, \pmb{x}) := \mathbf{A}(t) \pmb{x} \) is a linear transformation.

  • Such a solution is derived from what is known as a fundamental matrix solution to the initial value problem.

Fundamental matrix solutions

  • Suppose that we have a linear system of ODEs, for a nonsingular \( \mathbf{A}(t)\in \mathbb{R}^{N_x \times N_x} \),

    \[ \begin{align} \frac{\mathrm{d}}{\mathrm{d}t} \pmb{x} := \mathbf{A}(t) \pmb{x}; \end{align} \]

    • we can define a matrix-valued ODE as

    \[ \begin{align} \frac{\mathrm{d}}{\mathrm{d}t} \mathbf{M} := \mathbf{A}(t) \mathbf{M} . \end{align} \]

  • It is a classical result of ODEs and dynamical systems that this matrix-valued ODE gives the entire solution to the linear, initial value problem.

Fundamental matrix solution
Let \( \mathbf{M}_t \) be defined as the solution to the matrix-valued ODE above, with initial data \( \mathbf{M}_0:= \mathbf{I}_{N_x} \). Then, \( \mathbf{M}_t \) is a fundamental matrix solution for the linear system of ODEs above, satisfying the property that \[ \begin{align} \pmb{x}_t = \mathbf{M}_t \pmb{x}_0. \end{align} \]
  • Note, when \( \mathbf{A} \) does not depend on time, this solution is given identically by the matrix exponential, i.e.,

    \[ \begin{align} \mathbf{M}_t := \exp\{ \mathbf{A} t\}; \end{align} \]

    • however, in general there is not an analytical solution for a time dependent matrix \( \mathbf{A}(t) \).
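
  • As an illustrative check using SciPy, for time-independent \( \mathbf{A} \) the fundamental matrix solution can be evaluated directly with the matrix exponential:

```python
import numpy as np
from scipy.linalg import expm

# An illustrative check: for the rotation generator A below, the fundamental
# matrix solution M_t = exp(A t) rotates the initial condition.

A = np.array([[0.0, 1.0],
              [-1.0, 0.0]])
x0 = np.array([1.0, 0.0])
t = np.pi / 2

M_t = expm(A * t)    # fundamental matrix solution at time t
print(M_t @ x0)      # approximately [0, -1], a quarter rotation of x0
```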

Discrete maps from a continuous-time model

  • Despite the lack of an analytical solution, a fundamental matrix solution can be generated numerically to create a discrete, Gauss-Markov model from a linear system of ODEs, with discrete perturbations of noise.

  • Specifically, a common model for the linear DA problem is to simulate the problem as

    \[ \begin{align} \pmb{x}_k := \left(\int_{t_{k-1}}^{t_k} \mathbf{A}(s) \mathrm{d}s + \mathbf{I}\right) \pmb{x}_{k-1} + \pmb{w}_k. \end{align} \]

  • This defines a discrete, Gauss-Markov model, where the perturbations \( \pmb{w}_k \) are not continuous in time.

  • Note, if the mechanistic model is defined instead by a nonlinear ODE,

    \[ \begin{align} \frac{\mathrm{d}}{\mathrm{d}t}\pmb{x} := \pmb{f}(t, \pmb{x}), & & \pmb{x}(0):= \pmb{x}_0, \end{align} \] a classical approach to extend this analysis is given by the so-called tangent-linear model.

  • Suppose that once again, \( \pmb{x}_1 = \pmb{x}_0 + \pmb{\delta}_{\pmb{x}_1} \) is a perturbation of some point \( \pmb{x}_0 \).

  • If we want to find the evolution of such a perturbation, we can consider the equation

    \[ \begin{align} &\frac{\mathrm{d}}{\mathrm{d}t} \pmb{x}_1 := \frac{\mathrm{d}}{\mathrm{d}t} \pmb{x}_0 + \frac{\mathrm{d}}{\mathrm{d}t} \pmb{\delta}_{\pmb{x}_1} \\ \Leftrightarrow & \frac{\mathrm{d}}{\mathrm{d}t} \pmb{\delta}_{\pmb{x}_1} = \frac{\mathrm{d}}{\mathrm{d}t} \left( \pmb{x}_1 - \pmb{x}_0\right). \end{align} \]

Computing the tangent-linear model

  • From the last slide, we had that

    \[ \begin{align} \frac{\mathrm{d}}{\mathrm{d}t} \pmb{\delta}_{\pmb{x}_1}= \frac{\mathrm{d}}{\mathrm{d}t} \left( \pmb{x}_1 - \pmb{x}_0\right) = \pmb{f}(t, \pmb{x}_1) - \pmb{f}(t,\pmb{x}_0) = \nabla_{\pmb{x}} \pmb{f}(t, \pmb{x}_0)\pmb{\delta}_{\pmb{x}_1} + \mathcal{O}\left(\parallel \pmb{\delta}_{\pmb{x}_1}\parallel^2 \right). \end{align} \]

  • This defines the evolution of the perturbation \( \pmb{\delta}_{\pmb{x}_1} \) in terms of a linear ODE of the Jacobian, up to higher-order terms.

Tangent-linear model
Let \( \pmb{\delta} \) be a perturbation of a trajectory \( \pmb{x}(t) \) for some initial value problem, i.e., it lies in the tangent space of the trajectory: \( \pmb{\delta}\in T_{\pmb{x}(t)} \). The tangent-linear model is defined by \[ \begin{align} \pmb{\delta}_k = \mathbf{M}_k \pmb{\delta}_{k-1}, \end{align} \] where \( \mathbf{M}_k \) is the fundamental matrix solution of the linear equation \[ \begin{align} \frac{\mathrm{d}}{\mathrm{d}t} \pmb{\delta} = \nabla_{\pmb{x}}\pmb{f}(t, \pmb{x}(t)) \pmb{\delta}, \end{align} \] with dependence on the underlying nonlinear solution to \[ \begin{align} \frac{\mathrm{d}}{\mathrm{d}t}\pmb{x} := \pmb{f}(t, \pmb{x}), & & \pmb{x}(0):= \pmb{x}_0. \end{align} \]
  • With the above tangent-linear approximation, one can again define a discrete Gauss-Markov model as \( \pmb{x}_k := \mathbf{M}_k \pmb{x}_{k-1} + \pmb{w}_k \), where this approximates the evolution of the space of perturbations.

  • This approximately represents perturbations of the mean under the nonlinear evolution, with \( \pmb{w}_k \) representing the errors of this approximation, e.g., from truncating higher-order terms.
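
  • Below is a minimal sketch of the tangent-linear model in practice, assuming an illustrative scalar logistic test problem; the nonlinear reference trajectory and the perturbation are integrated side by side, with the Jacobian evaluated along the trajectory.

```python
import numpy as np

# A minimal sketch of the tangent-linear model, assuming an illustrative
# scalar logistic test problem. The perturbation delta is propagated by the
# Jacobian linearization along the nonlinear reference trajectory,
# realizing one step of the discrete model delta_k = M_k delta_{k-1}.

def f(t, x):
    return x * (1.0 - x)                    # logistic growth

def jac(t, x):
    return np.array([[1.0 - 2.0 * x[0]]])   # Jacobian of f at x

h, n = 0.01, 200
x = np.array([0.1])        # nonlinear reference trajectory
delta = np.array([1e-3])   # perturbation in the tangent space
t = 0.0
for _ in range(n):
    # Euler steps for brevity; the Runge-Kutta schemes above are more accurate.
    delta = delta + (jac(t, x) @ delta) * h
    x = x + f(t, x) * h
    t += h

# Compare with the full nonlinear evolution of the perturbed initial condition.
x_pert, t = np.array([0.1 + 1e-3]), 0.0
for _ in range(n):
    x_pert = x_pert + f(t, x_pert) * h
    t += h
print(delta, x_pert - x)   # agree to leading order in the perturbation
```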

Discrete maps from a (stochastic) continuous-time model

  • The classical, tangent-linear approach was widely used to find discrete Gauss-Markov models for many problems arising from nonlinear, deterministic initial value problems.

  • However, the discrete model is more complicated for even linear SDEs, as we instead consider the flow map

    \[ \begin{align} \pmb{\Phi}(\omega,t, \pmb{x}_0) =\pmb{x}_t \end{align} \] to be a randomly generated mapping (diffeomorphism) determined by the outcome \( \omega \).

    • Solutions to the SDE are thus sample paths distributed according to the solution of the Fokker-Planck equation.
    • This type of discrete model motivates the technique of a Monte-Carlo simulation to generate an empirical representation of the forward probability density.
  • The benefit of this approach is, again, that one does not need to explicitly compute the Jacobian.

  • Instead, we can use a nonlinear ensemble of solutions to the ODE or SDE to sample the forward density.

  • That is, we will define an ensemble matrix \( \mathbf{E}_k \) where each column satisfies an ODE or SDE initial value problem with uncertain initial data,

    \[ \begin{align} \mathbf{E}_k &:= \mathcal{M}_k(\mathbf{E}_{k-1}) + \pmb{w}_k, \end{align} \] where \( \pmb{w}_k\equiv \pmb{0} \) if \( \mathcal{M}_k \) is the randomly generated flow map of an SDE.

  • From this ensemble matrix, we can compute empirical statistics such as the sample mean and sample covariance.
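
  • Below is a minimal sketch of such an ensemble forecast, assuming an illustrative Ornstein-Uhlenbeck model advanced with an Euler-Maruyama step for brevity; each column of the ensemble matrix is an independent sample path, and empirical statistics are computed across columns.

```python
import numpy as np

# A minimal sketch of ensemble (Monte-Carlo) forecasting: each column of the
# ensemble matrix E is an independent sample path, here advanced with an
# Euler-Maruyama step for an illustrative Ornstein-Uhlenbeck model.

rng = np.random.default_rng(0)
n_x, n_ens = 2, 500        # state dimension and ensemble size
h, n_steps = 0.01, 100
s = 0.5                    # additive noise amplitude

E = rng.standard_normal((n_x, n_ens))        # uncertain initial data
for _ in range(n_steps):
    W = rng.standard_normal((n_x, n_ens))    # independent shocks per member
    E = E - E * h + s * np.sqrt(h) * W       # drift f(x) = -x, plus noise

mean = E.mean(axis=1)      # sample mean of the forward density
cov = np.cov(E)            # sample covariance; rows are state variables
print(mean)
print(cov)
```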