We have now introduced the context for a discrete Gauss-Markov model, as generated by a sequence of linear transformations and random shocks.
However, in practice, we will often consider a model that evolves in continuous time in between discrete observations of the system.
In order to discuss such a conditional inference problem, we will extend the Gauss-Markov model to continuous time with the notion of a stochastic differential equation.
The mathematics behind solutions to stochastic differential equations is quite involved; instead, we will focus on an intuitive development of the big picture.
We will start by discussing the analogy and the difference between a deterministic and a stochastic integral.
Consider a smooth function
\[ \begin{align} h : [0, T ] \rightarrow \mathbb{R} \end{align} \] for which the derivative \( \frac{\mathrm{d}}{ \mathrm{d} t } h \) is bounded on \( [0, T ] \).
To define the Riemann integral of \( h \), the interval \( [0, T ] \) is partitioned into subintervals
\[ \begin{align} 0 = t_0 < t_1 < \cdots < t_{N-1} < t_N = T . \end{align} \]
The Riemann integral of \( h \) is then given by
\[ \begin{align} \int_{0}^T h(t) \mathrm{d}t := \lim_{N\rightarrow\infty} \sum_{j=0}^{N-1} h(t_j)\left(t_{j+1} - t_j\right) \end{align} \]
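As a quick numerical illustration (a minimal Python sketch, with the arbitrary choice \( h(t) = \cos(t) \)), the left-endpoint Riemann sum converges to the exact value \( \sin(T) \) as the partition is refined:

```python
import numpy as np

# Left-endpoint Riemann sums for h(t) = cos(t) on [0, T];
# the exact integral is sin(T).
T = 2.0
for N in [10, 100, 1000]:
    t = np.linspace(0.0, T, N + 1)         # partition 0 = t_0 < ... < t_N = T
    dt = np.diff(t)                        # subinterval widths t_{j+1} - t_j
    riemann = np.sum(np.cos(t[:-1]) * dt)  # evaluate h at the left endpoints t_j
    print(N, riemann, np.sin(T))
```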
The stochastic integral can be defined in a similar way, and two forms exist, the Itô form and the Stratonovich form.
Both forms of the stochastic integral are used in practice for different applications.
As a point of reference, recall the defining properties of a discrete-time martingale, a process \( X_n \) satisfying
\[ \begin{align} \mathbb{E}\left[\vert X_n\vert \right] < \infty & & \mathbb{E}\left[ X_n \vert X_{n-1:0}\right] = X_{n-1}. \end{align} \]
The Wiener process, defined next, is the canonical continuous-time analogue of such a process.
Wiener process
A continuous-time stochastic process \( W_{t} \) is called a Wiener process if it has the following properties:
- \( W_0 := 0 \);
- \( W \) has independent increments;
- the increments satisfy \( W_{t+s} - W_{t} \sim N(0,s) \) for \( s > 0 \); and
- \( W_t \) is continuous in \( t \).
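The listed properties translate directly into a simulation recipe: on a uniform grid, accumulate independent Gaussian increments with variance equal to the time step. A minimal sketch (the horizon, grid size, and seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)

T, N = 1.0, 1000
dt = T / N
# Independent increments dW ~ N(0, dt), matching the properties above
dW = rng.normal(loc=0.0, scale=np.sqrt(dt), size=N)
# W_0 = 0, and W at later grid points is the running sum of the increments
W = np.concatenate(([0.0], np.cumsum(dW)))
t = np.linspace(0.0, T, N + 1)
```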
In analogy to the deterministic integral, the Itô integral is defined like the Riemann integral, but integrating “against” a Wiener process:
\[ \begin{align} \int_{0}^T h \mathrm{d}W_t:=\lim_{N\rightarrow \infty} \sum_{j=0}^{N-1} h(t_j) \left(W_{t_{j+1}} - W_{t_{j}} \right). \end{align} \]
In the above, note that we evaluate the integrand at the left endpoint of each subinterval, just as in the Riemann integral defined earlier.
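For example, for the deterministic integrand \( h(t) = t \), the Itô integral \( \int_0^T t\, \mathrm{d}W_t \) is a mean-zero Gaussian random variable with variance \( \int_0^T t^2 \mathrm{d}t = T^3/3 \) (a standard fact, not derived here). A minimal Monte Carlo sketch of the left-endpoint sums is consistent with this:

```python
import numpy as np

rng = np.random.default_rng(1)

T, N, n_paths = 1.0, 1000, 5000
dt = T / N
t_left = np.linspace(0.0, T, N + 1)[:-1]              # left endpoints t_j
dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, N))  # Wiener increments per path
ito_sums = dW @ t_left                                # sum_j t_j (W_{t_{j+1}} - W_{t_j})
print(ito_sums.mean(), ito_sums.var())                # ~0 and ~T**3 / 3 = 0.333...
```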
The Stratonovich integral is defined with only a slight variation: instead of the left endpoint, we use the midpoint rule,
\[ \begin{align} \int_0^T h \circ\mathrm{d}W_t := \lim_{N\rightarrow \infty}\sum_{j=0}^{N-1} h\left(\frac{t_{j} + t_{j+1}}{2}\right) \left(W_{t_{j+1}} - W_{t_{j}} \right). \end{align} \]
The two definitions on the last slide are deceptively simple, because we haven't specified in what sense the limit actually converges.
It turns out that, for random variables, there are several ways we can define convergence:
Convergence in probability
Let \( X_n \), \( n = 1, 2, \cdots, \) be a sequence of random variables. We say that \( X_n \) converges in probability to some random variable \( X \) if, for every real number \( \epsilon > 0 \), \[ \begin{align} \lim_{n\rightarrow \infty} \mathcal{P}\left(\vert X_n - X\vert > \epsilon \right) = 0. \end{align} \]
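As a concrete illustration (a minimal sketch with arbitrary parameters), the sample mean of \( n \) fair coin flips converges in probability to \( 1/2 \); the estimated probability of a deviation larger than \( \epsilon \) shrinks as \( n \) grows:

```python
import numpy as np

rng = np.random.default_rng(2)

eps, n_trials = 0.05, 100000
for n in [10, 100, 1000, 10000]:
    # Sample means of n fair coin flips, drawn via the binomial distribution
    means = rng.binomial(n, 0.5, size=n_trials) / n
    print(n, np.mean(np.abs(means - 0.5) > eps))  # estimate of P(|X_n - 1/2| > eps) -> 0
```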
This definition is essentially the same as the notion of “continuity in probability” seen earlier.
Almost sure convergence
A random sequence, \( X_n \), is said to converge almost surely, or with probability 1, to a random variable \( X \) if \[ \begin{align} \lim_{n\rightarrow \infty} \vert X_n - X\vert = 0 \end{align} \] except on a set \( A_0 \) of probability zero, i.e., \( \mathcal{P}(A_0)=0 \).
This explicitly requires that, with probability one, the sequence attains the same limiting realization as \( X \).
These two definitions are useful to introduce now, as they are similar to the notions of “weak” and “strong” convergence that we will use later in the numerical simulation of SDEs.
Finally, the mode of convergence we will use when considering the Itô and Stratonovich integrals defined earlier is mean-square convergence.
Mean-square convergence
Let \( X_k \) be a random sequence such that \( \mathbb{E}\left[X_k^2 \right] < \infty \) and let \( X \) be a random variable such that \( \mathbb{E}\left[X^2\right] <\infty \). The sequence, \( X_k \), is said to converge in the mean-square to \( X \) if \[ \begin{align} \lim_{k\rightarrow \infty}\mathbb{E}\left[\left(X_k - X\right)^2 \right] = 0. \end{align} \]
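Continuing the coin-flip illustration (again a minimal sketch), the sample mean also converges in mean square to \( 1/2 \): the Monte Carlo estimate of \( \mathbb{E}\left[(X_n - 1/2)^2\right] \) matches the exact variance of the sample mean, \( 1/(4n) \), which tends to zero:

```python
import numpy as np

rng = np.random.default_rng(3)

n_trials = 100000
for n in [10, 100, 1000]:
    means = rng.binomial(n, 0.5, size=n_trials) / n
    mse = np.mean((means - 0.5) ** 2)
    print(n, mse, 0.25 / n)  # estimated E[(X_n - 1/2)^2] versus the exact 1/(4n)
```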
This is a stronger notion of convergence than convergence in probability; however, in general, mean-square convergence neither implies nor is implied by almost sure convergence.
Let's consider an example case of the Itô integral for which \( h(t):=W_t \), i.e., the Itô integral of a Wiener process.
For this case, the Itô integral can be evaluated analytically because
\[ \begin{align} &\sum_{j=0}^{N-1} W_{t_j}\left(W_{t_{j+1}} - W_{t_j}\right)\\ =&\sum_{j=0}^{N-1} \frac{1}{2} \left[W_{t_{j+1}}^2 - W_{t_j}^2 -\left(W_{t_{j+1}} - W_{t_{j}} \right)^2 \right]\\ =&\frac{1}{2} \left(W_T^2 - W_0^2 \right) - \frac{1}{2}\sum_{j=0}^{N-1} \left(W_{t_{j+1}} - W_{t_j} \right)^2. \end{align} \]
Assume that the time-discretization is uniform such that \( \mathrm{d}t := t_{j+1} - t_{j} \) for all \( j \).
Then, for \( \mathrm{d}W_j := W_{t_{j+1}} - W_{t_j} \) we have that
\[ \begin{align} \mathbb{E}\left[\left(\mathrm{d}W_j\right)^2 \right] = \mathrm{d}t \end{align} \] by definition of the Wiener process.
In the “mean-square algebra”, we then write the identity \( \left(\mathrm{d}W_j\right)^2 \equiv \mathrm{d}t \), though the formal justification is suppressed here.
From the last slide, we thus obtain
\[ \begin{align} \sum_{j=0}^{N-1} \left(W_{t_{j+1}} - W_{t_j}\right)^2 = \sum_{j=0}^{N-1} \left(\mathrm{d}W_j\right)^2= \sum_{j=0}^{N-1} \mathrm{d}t = T \end{align} \]
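A short sketch makes this mean-square identity concrete: the quadratic variation \( \sum_j \left(\mathrm{d}W_j\right)^2 \) concentrates around \( T \), with its variance shrinking as the partition is refined (the exact variance is \( 2T^2/N \), not derived here):

```python
import numpy as np

rng = np.random.default_rng(4)

T, n_paths = 1.0, 5000
for N in [10, 100, 1000]:
    dt = T / N
    dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, N))
    qv = np.sum(dW**2, axis=1)     # quadratic variation sum_j (dW_j)^2, per path
    print(N, qv.mean(), qv.var())  # mean ~ T, variance ~ 2 * T**2 / N -> 0
```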
Noting that, by definition, \( W_0 \equiv 0 \), we thus obtain
\[ \begin{align} \int_{0}^T W_t \mathrm{d}W_t = \frac{1}{2} \left( W_T^2 - T\right). \end{align} \]
In the above, it is important to remember that this is an equality in the mean-square sense, such that both sides represent random variables and the mean-square difference between the approximating sums and the limit vanishes.
Therefore, we say that the Itô integral on the left-hand side converges in the mean-square sense to one half of the difference \( W_T^2 - T \), where \( W_T \) is a Gaussian random variable with mean zero and variance \( T \).
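A minimal sketch checks this pathwise: left-endpoint sums computed along simulated paths agree with \( \frac{1}{2}\left(W_T^2 - T\right) \) up to a discretization error that vanishes in the mean square as \( N \) grows:

```python
import numpy as np

rng = np.random.default_rng(5)

T, N, n_paths = 1.0, 1000, 5000
dt = T / N
dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, N))
W = np.cumsum(dW, axis=1)                                # W at t_1, ..., t_N
W_left = np.hstack([np.zeros((n_paths, 1)), W[:, :-1]])  # W at t_0, ..., t_{N-1}, with W_0 = 0
ito = np.sum(W_left * dW, axis=1)                        # sum_j W_{t_j} (W_{t_{j+1}} - W_{t_j})
exact = 0.5 * (W[:, -1] ** 2 - T)                        # (1/2)(W_T^2 - T)
print(np.mean((ito - exact) ** 2))                       # mean-square error, small for large N
```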
This convergence of the integral to a random variable shows some of the subtlety of working with SDEs.
Recall that in deterministic calculus, the fundamental theorem of calculus tells us that
\[ \begin{align} f(b) - f(a) = \int_{a}^b \frac{\mathrm{d}}{\mathrm{d}t}f(t) \mathrm{d}t \end{align} \] provided such a derivative exists over the interval.
For a perturbation in time \( \mathrm{d}t \), we will write \( \mathrm{d}W_t := W_{t+\mathrm{d}t} - W_t \), such that we obtain a second-order approximation
\[ \begin{align} f(W_t + \mathrm{d}W_t) - f(W_t) = f'(W_t)\mathrm{d}W_t + \frac{1}{2} f''(W_t)\mathrm{d}W_t^2 + \mathcal{O}\left( \mathrm{d}W_t^3\right). \end{align} \]
If we integrate the above over the interval \( [0,T] \), using the mean-square algebra, we obtain the first Itô lemma as
\[ \begin{align} f(W_T) - f(W_0) = \int_{0}^T f'(W_t)\mathrm{d}W_t + \frac{1}{2}\int_{0}^T f''(W_t) \mathrm{d}t. \end{align} \]
An important difference between deterministic calculus and Itô calculus is thus seen in the above: the second-order term of the Taylor expansion survives in the limit, contributing the additional Riemann integral \( \frac{1}{2}\int_{0}^T f''(W_t) \mathrm{d}t \), which has no analogue in the deterministic fundamental theorem of calculus.
The mathematics of this is, again, quite complicated, but we will often use these equations as identities while suppressing the details.
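As a sanity check (a minimal sketch with the arbitrary choice \( f(x) = \sin(x) \), so \( f'(x) = \cos(x) \) and \( f''(x) = -\sin(x) \)), both integrals in the Itô formula can be approximated by discrete sums along a simulated path:

```python
import numpy as np

rng = np.random.default_rng(6)

# Pathwise check of the Ito formula for f(x) = sin(x):
# sin(W_T) - sin(W_0) ~ sum_j cos(W_{t_j}) dW_j - (1/2) sum_j sin(W_{t_j}) dt
T, N = 1.0, 100000
dt = T / N
dW = rng.normal(0.0, np.sqrt(dt), size=N)
W = np.concatenate(([0.0], np.cumsum(dW)))
lhs = np.sin(W[-1]) - np.sin(W[0])
rhs = np.sum(np.cos(W[:-1]) * dW) - 0.5 * np.sum(np.sin(W[:-1]) * dt)
print(lhs, rhs)  # the two sides agree up to discretization error
```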
Consider the Itô formula from the last slide
\[ \begin{align} f(W_T) - f(W_0) = \int_{0}^T f'(W_t)\mathrm{d}W_t +\frac{1}{2} \int_{0}^T f''(W_t) \mathrm{d}t. \end{align} \]
If we set \( f(t):= t^2 \), then \( f'(t) =2t \) and \( f''(t)=2 \), so that
\[ \begin{align} & W_T^2 - W_0^2 = 2 \int_{0}^T W_t\mathrm{d}W_t + \int_{0}^T\mathrm{d}t\\ \Leftrightarrow & \int_{0}^T W_t \mathrm{d}W_t = \frac{1}{2}\left(W_T^2 - T\right). \end{align} \] as discussed earlier.
This shows how, in part, we can formally manipulate stochastic integrals with Itô's formula, even when we suppress the details.
Additional extensions of the Itô-Taylor expansion provide a number of rules to formally work with SDEs.
The details of these expansions and results can be found in a more formal course on the subject.
There is a direct relation between the Itô and Stratonovich integrals of a smooth function \( f \), which is given by
\[ \begin{align} \int_0^T f(W_t)\circ \mathrm{d}W_t = \int_0^T f(W_t)\mathrm{d}W_t + \frac{1}{2}\int_0^Tf'(W_t)\mathrm{d}t. \end{align} \]
This is to say that the Stratonovich integral is given by the Itô integral plus a correction term in the form of a Riemann integral.
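A sketch verifying this relation for \( f(W_t) = W_t \): midpoint-rule sums and left-endpoint sums computed on the same simulated path differ by approximately \( \frac{1}{2}\int_0^T 1 \, \mathrm{d}t = T/2 \), so the Stratonovich integral is approximately \( \frac{1}{2} W_T^2 \):

```python
import numpy as np

rng = np.random.default_rng(7)

T, N = 1.0, 100000
dt = T / N
# Simulate W on a refined grid that includes the midpoints t_j + dt/2
dW_half = rng.normal(0.0, np.sqrt(dt / 2.0), size=2 * N)
W = np.concatenate(([0.0], np.cumsum(dW_half)))  # W on 2N + 1 grid points
W_left, W_mid, W_right = W[0:-1:2], W[1::2], W[2::2]
dW = W_right - W_left                            # coarse increments over each dt
ito = np.sum(W_left * dW)                        # left-endpoint (Ito) sum
strat = np.sum(W_mid * dW)                       # midpoint (Stratonovich) sum
print(strat - ito, T / 2.0)                      # difference ~ T/2
print(strat, 0.5 * W[-1] ** 2)                   # Stratonovich sum ~ W_T**2 / 2
```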
Using this relationship, we can show that the Stratonovich integral is actually formally more similar to the deterministic integral than the Itô integral.
If we define \( f(t)=g'(t) \), then the Itô-Taylor expansion similarly gives
\[ \begin{align} &g(W_T) - g(W_0) = \int_{0}^T g'(W_t)\mathrm{d}W_t +\frac{1}{2} \int_{0}^T g''(W_t) \mathrm{d}t\\ \Leftrightarrow & g(W_T) - g(W_0) = \int_0^T f(W_t) \mathrm{d}W_t + \frac{1}{2}\int_0^T f'(W_t)\mathrm{d}t\\ \Leftrightarrow &g(W_T) - g(W_0) = \int_0^T g'(W_t)\circ \mathrm{d}W_t \end{align} \] so that this looks like the deterministic fundamental theorem of calculus.
However, Stratonovich calculus is also subtle to work with, as the midpoint rule that defines the integral implicitly relies on future information about the process, unlike the Itô formulation.