We have now introduced the context for a discrete Gauss-Markov model, as generated by a sequence of linear transformations and random shocks.
However, in practice, we will often consider a model that evolves in continuous time in between discrete observations of the system.
In order to discuss such a conditional inference problem, we will extend the Gauss-Markov model to continuous time with the notion of a stochastic differential equation.
The mathematics behind solutions to stochastic differential equations is quite involved; instead, we will focus on an intuitive development of the big picture.
We will start by discussing the analogy and the difference between a deterministic and a stochastic integral.
Consider a smooth function
\[ \begin{align} h : [0, T ] \rightarrow \mathbb{R} \end{align} \] for which the derivative \( \frac{\mathrm{d}}{ \mathrm{d} t } h \) is bounded on \( [0, T ] \).
To define the Riemann integral of \( h \), the interval \( [0, T ] \) is partitioned into subintervals
\[ \begin{align} 0 = t_0 < t_1 < \cdots < t_{N-1} < t_N = T . \end{align} \]
The Riemann integral of \( h \) is then given by
\[ \begin{align} \int_{0}^T h(t) \mathrm{d}t := \lim_{N\rightarrow\infty} \sum_{j=0}^{N-1} h(t_j)\left(t_{j+1} - t_j\right) \end{align} \]
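As a quick numerical illustration (a minimal Python sketch, with the arbitrary choice \( h(t) = \cos(t) \)), the left-endpoint Riemann sum converges to the exact value \( \sin(T) \) as the partition is refined:

```python
import numpy as np

# Left-endpoint Riemann sums for h(t) = cos(t) on [0, T];
# the exact integral is sin(T).
T = 2.0
for N in [10, 100, 1000]:
    t = np.linspace(0.0, T, N + 1)         # partition 0 = t_0 < ... < t_N = T
    dt = np.diff(t)                        # subinterval widths t_{j+1} - t_j
    riemann = np.sum(np.cos(t[:-1]) * dt)  # evaluate h at the left endpoints t_j
    print(N, riemann, np.sin(T))
```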
The stochastic integral can be defined in a similar way, and two forms exist, the Itô form and the Stratonovich form.
Both forms of the stochastic integral are used in practice for different applications.
As a point of reference, recall the defining properties of a discrete-time martingale, a process \( X_n \) satisfying
\[ \begin{align} \mathbb{E}\left[\vert X_n\vert \right] < \infty & & \mathbb{E}\left[ X_n \vert X_{n-1:0}\right] = X_{n-1}. \end{align} \]
The Wiener process, defined next, is the canonical continuous-time analogue of such a process.
Wiener process
A continuous-time stochastic process \( W_{t} \) is called a Wiener process if it has the following properties:
- \( W_0 := 0 \);
- \( W \) has independent increments;
- the increments satisfy \( W_{t+s} - W_{t} \sim N(0,s) \) for \( s > 0 \); and
- \( W_t \) is continuous in \( t \).
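The listed properties translate directly into a simulation recipe: on a uniform grid, accumulate independent Gaussian increments with variance equal to the time step. A minimal sketch (the horizon, grid size, and seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)

T, N = 1.0, 1000
dt = T / N
# Independent increments dW ~ N(0, dt), matching the properties above
dW = rng.normal(loc=0.0, scale=np.sqrt(dt), size=N)
# W_0 = 0, and W at later grid points is the running sum of the increments
W = np.concatenate(([0.0], np.cumsum(dW)))
t = np.linspace(0.0, T, N + 1)
```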
In analogy to the deterministic integral, the Itô integral is defined like the Riemann integral, but integrating “against” a Wiener process:
\[ \begin{align} \int_{0}^T h \mathrm{d}W_t:=\lim_{N\rightarrow \infty} \sum_{j=0}^{N-1} h(t_j) \left(W_{t_{j+1}} - W_{t_{j}} \right). \end{align} \]
In the above, note that we evaluate the integrand at the left endpoint of each subinterval, just as in the Riemann integral defined earlier.
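For example, for the deterministic integrand \( h(t) = t \), the Itô integral \( \int_0^T t\, \mathrm{d}W_t \) is a mean-zero Gaussian random variable with variance \( \int_0^T t^2 \mathrm{d}t = T^3/3 \) (a standard fact, not derived here). A minimal Monte Carlo sketch of the left-endpoint sums is consistent with this:

```python
import numpy as np

rng = np.random.default_rng(1)

T, N, n_paths = 1.0, 1000, 5000
dt = T / N
t_left = np.linspace(0.0, T, N + 1)[:-1]              # left endpoints t_j
dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, N))  # Wiener increments per path
ito_sums = dW @ t_left                                # sum_j t_j (W_{t_{j+1}} - W_{t_j})
print(ito_sums.mean(), ito_sums.var())                # ~0 and ~T**3 / 3 = 0.333...
```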
The Stratonovich integral is defined with only a slight variation: instead of the left endpoint, we use the midpoint rule,
\[ \begin{align} \int_0^T h \circ\mathrm{d}W_t := \lim_{N\rightarrow \infty}\sum_{j=0}^{N-1} h\left(\frac{t_{j} + t_{j+1}}{2}\right) \left(W_{t_{j+1}} - W_{t_{j}} \right). \end{align} \]
The two definitions on the last slide are deceptively simple, because we haven't specified in what sense the limit actually converges.
It turns out that, for random variables, there are several ways we can define convergence:
Convergence in probability
Let \( X_n \), \( n = 1, 2, \cdots, \) be a sequence of random variables. We say that \( X_n \) converges in probability to some random variable \( X \) if, for every real number \( \epsilon > 0 \), \[ \begin{align} \lim_{n\rightarrow \infty} \mathcal{P}\left(\vert X_n - X\vert > \epsilon \right) = 0. \end{align} \]
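As a concrete illustration (a minimal sketch with arbitrary parameters), the sample mean of \( n \) fair coin flips converges in probability to \( 1/2 \); the estimated probability of a deviation larger than \( \epsilon \) shrinks as \( n \) grows:

```python
import numpy as np

rng = np.random.default_rng(2)

eps, n_trials = 0.05, 100000
for n in [10, 100, 1000, 10000]:
    # Sample means of n fair coin flips, drawn via the binomial distribution
    means = rng.binomial(n, 0.5, size=n_trials) / n
    print(n, np.mean(np.abs(means - 0.5) > eps))  # estimate of P(|X_n - 1/2| > eps) -> 0
```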
This definition is essentially the same as the notion of “continuity in probability” seen earlier.
Almost sure convergence
A random sequence, \( X_n \), is said to converge almost surely, or with probability 1, to a random variable \( X \) if \[ \begin{align} \lim_{n\rightarrow \infty} \vert X_n - X\vert = 0 \end{align} \] except on a set \( A_0 \) of probability zero, i.e., \( \mathcal{P}(A_0)=0 \).
This explicitly requires that, with probability one, the sequence attains the same limiting realization as \( X \).
These two definitions are useful to introduce now, as they are similar to the notions of “weak” and “strong” convergence that we will use later in the numerical simulation of SDEs.
Finally, the mode of convergence we will use when considering the Itô and Stratonovich integrals defined earlier is mean-square convergence.
Mean-square convergence
Let \( X_k \) be a random sequence such that \( \mathbb{E}\left[X_k^2 \right] < \infty \) and let \( X \) be a random variable such that \( \mathbb{E}\left[X^2\right] <\infty \). The sequence, \( X_k \), is said to converge in the mean-square to \( X \) if \[ \begin{align} \lim_{k\rightarrow \infty}\mathbb{E}\left[\left(X_k - X\right)^2 \right] = 0. \end{align} \]
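Continuing the coin-flip illustration (again a minimal sketch), the sample mean also converges in mean square to \( 1/2 \): the Monte Carlo estimate of \( \mathbb{E}\left[(X_n - 1/2)^2\right] \) matches the exact variance of the sample mean, \( 1/(4n) \), which tends to zero:

```python
import numpy as np

rng = np.random.default_rng(3)

n_trials = 100000
for n in [10, 100, 1000]:
    means = rng.binomial(n, 0.5, size=n_trials) / n
    mse = np.mean((means - 0.5) ** 2)
    print(n, mse, 0.25 / n)  # estimated E[(X_n - 1/2)^2] versus the exact 1/(4n)
```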
This is a stronger notion of convergence than convergence in probability; however, in general, mean-square convergence neither implies nor is implied by almost sure convergence.
Let's consider an example case of the Itô integral for which \( h(t):=W_t \), i.e., the Itô integral of a Wiener process.
For this case, the Itô integral can be evaluated analytically because
\[ \begin{align} &\sum_{j=0}^{N-1} W_{t_j}\left(W_{t_{j+1}} - W_{t_j}\right)\\ =&\sum_{j=0}^{N-1} \frac{1}{2} \left[W_{t_{j+1}}^2 - W_{t_j}^2 -\left(W_{t_{j+1}} - W_{t_{j}} \right)^2 \right]\\ =&\frac{1}{2} \left(W_T^2 - W_0^2 \right) - \frac{1}{2}\sum_{j=0}^{N-1} \left(W_{t_{j+1}} - W_{t_j} \right)^2. \end{align} \]
Assume that the time-discretization is uniform such that \( \mathrm{d}t := t_{j+1} - t_{j} \) for all \( j \).
Then, for \( \mathrm{d}W_j := W_{t_{j+1}} - W_{t_j} \) we have that
\[ \begin{align} \mathbb{E}\left[\left(\mathrm{d}W_j\right)^2 \right] = \mathrm{d}t \end{align} \] by definition of the Wiener process.
In the “mean-square algebra”, we then write the identity \( \left(\mathrm{d}W_j\right)^2 \equiv \mathrm{d}t \), though the formal justification is suppressed here.
From the last slide, we thus obtain
\[ \begin{align} \sum_{j=0}^{N-1} \left(W_{t_{j+1}} - W_{t_j}\right)^2 = \sum_{j=0}^{N-1} \left(\mathrm{d}W_j\right)^2= \sum_{j=0}^{N-1} \mathrm{d}t = T \end{align} \]
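A short sketch makes this mean-square identity concrete: the quadratic variation \( \sum_j \left(\mathrm{d}W_j\right)^2 \) concentrates around \( T \), with its variance shrinking as the partition is refined (the exact variance is \( 2T^2/N \), not derived here):

```python
import numpy as np

rng = np.random.default_rng(4)

T, n_paths = 1.0, 5000
for N in [10, 100, 1000]:
    dt = T / N
    dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, N))
    qv = np.sum(dW**2, axis=1)     # quadratic variation sum_j (dW_j)^2, per path
    print(N, qv.mean(), qv.var())  # mean ~ T, variance ~ 2 * T**2 / N -> 0
```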
Noting that, by definition, \( W_0 \equiv 0 \), we thus obtain
\[ \begin{align} \int_{0}^T W_t \mathrm{d}W_t = \frac{1}{2} \left( W_T^2 - T\right). \end{align} \]
In the above, it is important to remember that this is an equality in the mean-square sense, such that both sides represent random variables and the mean-square difference between the approximating sums and the limit vanishes.
Therefore, we say that the Itô integral on the left-hand side converges in the mean-square sense to one half of the difference \( W_T^2 - T \), where \( W_T \) is a Gaussian random variable with mean zero and variance \( T \).
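A minimal sketch checks this pathwise: left-endpoint sums computed along simulated paths agree with \( \frac{1}{2}\left(W_T^2 - T\right) \) up to a discretization error that vanishes in the mean square as \( N \) grows:

```python
import numpy as np

rng = np.random.default_rng(5)

T, N, n_paths = 1.0, 1000, 5000
dt = T / N
dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, N))
W = np.cumsum(dW, axis=1)                                # W at t_1, ..., t_N
W_left = np.hstack([np.zeros((n_paths, 1)), W[:, :-1]])  # W at t_0, ..., t_{N-1}, with W_0 = 0
ito = np.sum(W_left * dW, axis=1)                        # sum_j W_{t_j} (W_{t_{j+1}} - W_{t_j})
exact = 0.5 * (W[:, -1] ** 2 - T)                        # (1/2)(W_T^2 - T)
print(np.mean((ito - exact) ** 2))                       # mean-square error, small for large N
```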
This convergence of the integral to a random variable shows some of the subtlety of working with SDEs.
Recall that in deterministic calculus, the fundamental theorem of calculus tells us that
\[ \begin{align} f(b) - f(a) = \int_{a}^b \frac{\mathrm{d}}{\mathrm{d}t}f(t) \mathrm{d}t \end{align} \] provided such a derivative exists over the interval.
For a perturbation in time \( \mathrm{d}t \), we will write \( \mathrm{d}W_t := W_{t+\mathrm{d}t} - W_t \), such that we obtain a second-order approximation
\[ \begin{align} f(W_t + \mathrm{d}W_t) - f(W_t) = f'(W_t)\mathrm{d}W_t + \frac{1}{2} f''(W_t)\mathrm{d}W_t^2 + \mathcal{O}\left( \mathrm{d}W_t^3\right). \end{align} \]
If we integrate the above over the interval \( [0,T] \), using the mean-square algebra, we obtain the first Itô lemma as
\[ \begin{align} f(W_T) - f(W_0) = \int_{0}^T f'(W_t)\mathrm{d}W_t + \frac{1}{2}\int_{0}^T f''(W_t) \mathrm{d}t. \end{align} \]
An important difference between deterministic calculus and Itô calculus is thus seen in the above: the second-order term of the Taylor expansion survives in the limit, contributing the additional Riemann integral \( \frac{1}{2}\int_{0}^T f''(W_t) \mathrm{d}t \), which has no analogue in the deterministic fundamental theorem of calculus.
The mathematics of this is, again, quite complicated, but we will often use these equations as identities while suppressing the details.
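As a sanity check (a minimal sketch with the arbitrary choice \( f(x) = \sin(x) \), so \( f'(x) = \cos(x) \) and \( f''(x) = -\sin(x) \)), both integrals in the Itô formula can be approximated by discrete sums along a simulated path:

```python
import numpy as np

rng = np.random.default_rng(6)

# Pathwise check of the Ito formula for f(x) = sin(x):
# sin(W_T) - sin(W_0) ~ sum_j cos(W_{t_j}) dW_j - (1/2) sum_j sin(W_{t_j}) dt
T, N = 1.0, 100000
dt = T / N
dW = rng.normal(0.0, np.sqrt(dt), size=N)
W = np.concatenate(([0.0], np.cumsum(dW)))
lhs = np.sin(W[-1]) - np.sin(W[0])
rhs = np.sum(np.cos(W[:-1]) * dW) - 0.5 * np.sum(np.sin(W[:-1]) * dt)
print(lhs, rhs)  # the two sides agree up to discretization error
```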
Consider the Itô formula from the last slide
\[ \begin{align} f(W_T) - f(W_0) = \int_{0}^T f'(W_t)\mathrm{d}W_t +\frac{1}{2} \int_{0}^T f''(W_t) \mathrm{d}t. \end{align} \]
If we set \( f(t):= t^2 \), then \( f'(t) =2t \) and \( f''(t)=2 \), so that
\[ \begin{align} & W_T^2 - W_0^2 = 2 \int_{0}^T W_t\mathrm{d}W_t + \int_{0}^T\mathrm{d}t\\ \Leftrightarrow & \int_{0}^T W_t \mathrm{d}W_t = \frac{1}{2}\left(W_T^2 - T\right). \end{align} \] as discussed earlier.
This shows how, in part, we can formally manipulate stochastic integrals with Itô's formula, even when we suppress the details.
Additional extensions of the Itô-Taylor expansion provide a number of rules to formally work with SDEs.
The details of these expansions and results can be found in a more formal course on the subject.
There is a direct relation between the Itô and Stratonovich integrals of a smooth function \( f \), which is given by
\[ \begin{align} \int_0^T f(W_t)\circ \mathrm{d}W_t = \int_0^T f(W_t)\mathrm{d}W_t + \frac{1}{2}\int_0^Tf'(W_t)\mathrm{d}t. \end{align} \]
This is to say that the Stratonovich integral is given by the Itô integral plus a correction term in the form of a Riemann integral.
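A sketch verifying this relation for \( f(W_t) = W_t \): midpoint-rule sums and left-endpoint sums computed on the same simulated path differ by approximately \( \frac{1}{2}\int_0^T 1 \, \mathrm{d}t = T/2 \), so the Stratonovich integral is approximately \( \frac{1}{2} W_T^2 \):

```python
import numpy as np

rng = np.random.default_rng(7)

T, N = 1.0, 100000
dt = T / N
# Simulate W on a refined grid that includes the midpoints t_j + dt/2
dW_half = rng.normal(0.0, np.sqrt(dt / 2.0), size=2 * N)
W = np.concatenate(([0.0], np.cumsum(dW_half)))  # W on 2N + 1 grid points
W_left, W_mid, W_right = W[0:-1:2], W[1::2], W[2::2]
dW = W_right - W_left                            # coarse increments over each dt
ito = np.sum(W_left * dW)                        # left-endpoint (Ito) sum
strat = np.sum(W_mid * dW)                       # midpoint (Stratonovich) sum
print(strat - ito, T / 2.0)                      # difference ~ T/2
print(strat, 0.5 * W[-1] ** 2)                   # Stratonovich sum ~ W_T**2 / 2
```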
Using this relationship, we can show that the Stratonovich integral is actually formally more similar to the deterministic integral than the Itô integral.
If we define \( f(t)=g'(t) \), then the Itô-Taylor expansion similarly gives
\[ \begin{align} &g(W_T) - g(W_0) = \int_{0}^T g'(W_t)\mathrm{d}W_t +\frac{1}{2} \int_{0}^T g''(W_t) \mathrm{d}t\\ \Leftrightarrow & g(W_T) - g(W_0) = \int_0^T f(W_t) \mathrm{d}W_t + \frac{1}{2}\int_0^T f'(W_t)\mathrm{d}t\\ \Leftrightarrow &g(W_T) - g(W_0) = \int_0^T g'(W_t)\circ \mathrm{d}W_t \end{align} \] so that this looks like the deterministic fundamental theorem of calculus.
However, Stratonovich calculus is also subtle to work with, as the midpoint rule that defines the integral implicitly relies on future information about the process, unlike the Itô formulation.