

The following topics will be covered in this lecture:

- The stochastic integral
- Modes of convergence
- The Itô-Taylor expansion
- Itô versus Stratonovich calculus

We have now introduced the context for a **discrete Gauss-Markov model**, as **generated by a sequence of linear transformations and random shocks**.

However, in practice, we will often consider a **model that evolves in continuous time** in between **discrete observations of the system**.

In order to discuss such a conditional inference problem, we will extend the Gauss-Markov model to continuous time with the notion of a **stochastic differential equation**.

The mathematics behind solutions to stochastic differential equations is quite complex, and instead we will focus on an intuitive development of the big picture.

We will start by discussing the **analogy and the difference between a deterministic and a stochastic integral**.

Consider a smooth function

\[ \begin{align} h : [0, T ] \rightarrow \mathbb{R} \end{align} \] for which the derivative \( \frac{\mathrm{d}}{ \mathrm{d} t } h \) is bounded on \( [0, T ] \).

To define the **Riemann integral** of \( h \), the interval \( [0, T ] \) is partitioned into subintervals \[ \begin{align} 0 = t_0 < t_1 < \cdots < t_{N-1} < t_N = T . \end{align} \]

The **Riemann integral** of \( h \) is then given by \[ \begin{align} \int_{0}^T h(t) \mathrm{d}t := \lim_{N\rightarrow\infty} \sum_{j=0}^{N-1} h(t_j)\left(t_{j+1} - t_j\right), \end{align} \] where the width of the largest subinterval shrinks to zero as \( N\rightarrow \infty \).
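As a minimal sketch of the left-endpoint sum above, assuming a uniform partition and using NumPy, the function below (the name `left_riemann_sum` is chosen here for illustration) approximates the integral of \( h(t) = \cos(t) \) on \( [0, 1] \), for which the exact value is \( \sin(1) \):

```python
import numpy as np

def left_riemann_sum(h, T, N):
    """Left-endpoint Riemann sum of h over [0, T] with N uniform subintervals."""
    t = np.linspace(0.0, T, N + 1)   # partition 0 = t_0 < ... < t_N = T
    dt = np.diff(t)                  # widths t_{j+1} - t_j
    return np.sum(h(t[:-1]) * dt)    # sum_j h(t_j) (t_{j+1} - t_j)

# Illustrative test function: h(t) = cos(t), with exact integral sin(1) on [0, 1].
for N in (10, 100, 1000):
    approx = left_riemann_sum(np.cos, 1.0, N)
    print(N, approx, abs(approx - np.sin(1.0)))
```

The printed error should shrink roughly in proportion to \( 1/N \), which is the deterministic behavior that the stochastic integral is compared against.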

The stochastic integral can be defined in a similar way, and two forms exist, the **Itô form** and the **Stratonovich form**.

Both forms of the stochastic integral are used in practice for different applications.

- Notably, **Itô** is typically preferred in applications such as finance, due to the fact that it preserves the **martingale property**:

\[ \begin{align} \mathbb{E}\left[\vert X_n\vert \right] < \infty & & \mathbb{E}\left[ X_n |X_{n-1:0}\right] = X_{n-1}; \end{align} \]

- on the other hand, the **Stratonovich** formulation is often the preferred choice in physical models, due to its connection to **coarse-grained simulation of a higher-resolution physical process**.

We will start by recalling the definition of the Wiener process as follows.

Wiener process

A continuous-time stochastic process \( W_{t} \) is called a **Wiener process** if it has the following properties:

- \( W_0:= 0 \),
- \( W \) has independent increments;
- The increments \( W_{t+s} - W_{t} \sim N(0,s) \); and
- \( W_t \) is continuous in \( t \).
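The defining properties suggest a simple way to sample approximate paths on a grid: start at zero and accumulate independent Gaussian increments with variance equal to the step size. The following is a minimal sketch under that assumption; the function name `simulate_wiener`, the grid size and the number of paths are illustrative choices.

```python
import numpy as np

def simulate_wiener(T, N, n_paths, rng):
    """Sample n_paths Wiener paths on a uniform grid with N steps over [0, T]."""
    dt = T / N
    # Independent increments W_{t+dt} - W_t ~ N(0, dt)
    dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, N))
    W = np.zeros((n_paths, N + 1))       # first column enforces W_0 = 0
    W[:, 1:] = np.cumsum(dW, axis=1)     # partial sums of the increments
    return W

rng = np.random.default_rng(0)
W = simulate_wiener(T=1.0, N=200, n_paths=5000, rng=rng)
# Sample mean and variance of W_T, which should be close to 0 and T = 1.
print(W[:, -1].mean(), W[:, -1].var())
```

Only the grid values are sampled here; continuity in \( t \) is a property of the underlying process and is not something the discrete simulation can verify.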

In analogy to the deterministic integral, the **Itô integral** is given like the **Riemann integral**, but **integrating “against” a Wiener process** \[ \begin{align} \int_{0}^T h(t) \mathrm{d}W_t:=\lim_{N\rightarrow \infty} \sum_{j=0}^{N-1} h(t_j) \left(W_{t_{j+1}} - W_{t_{j}} \right). \end{align} \]

In the above, note that we use the **left-endpoint** of each subinterval of the partition, just like in the **Riemann integral**.

The **Stratonovich integral** is defined only as a slight variation where, instead of the left-endpoint, we use the midpoint rule: \[ \begin{align} \int_0^T h(t) \circ\mathrm{d}W_t := \lim_{N\rightarrow \infty}\sum_{j=0}^{N-1} h\left(\frac{t_{j+1} + t_{j}}{2}\right) \left(W_{t_{j+1}} - W_{t_{j}} \right). \end{align} \]

In the last slide, the two definitions are **deceptively simple**, because we haven't specified in **what way the limit and convergence are actually defined**.

It turns out that, for random variables, there are several ways we can consider convergence:

Convergence in probability

Let \( X_n \), \( n = 1, 2, \cdots , \) be a sequence of random variables. We say that \( X_n \) **converges in probability** to some random variable \( X \) if, for every real number \( \epsilon> 0 \), \[ \begin{align} \lim_{n\rightarrow \infty} \mathcal{P}\left(\vert X_n - X\vert > \epsilon \right) = 0. \end{align} \]

This definition is essentially the same as the “continuity in probability” seen earlier;

- i.e., the probability of observing a discrepancy larger than any fixed \( \epsilon \) between \( X_n \) and the random variable \( X \) becomes increasingly small, limiting to zero.
- In particular, if the CDF of \( X_n \) is given by \( P_n \), and \( X\sim P \), then the above implies that \( P_n\rightarrow P \) at every point where \( P \) is continuous.

A stronger notion of convergence, that is similar to deterministic convergence, is the following:

Almost sure convergence

A random sequence, \( X_n \), is said to **converge almost surely**, or with probability 1, to a random variable \( X \) if \[ \begin{align} \lim_{n\rightarrow \infty} \vert X_n - X\vert = 0 \end{align} \] except on a set \( A_0 \) of probability zero, i.e., \( \mathcal{P}(A_0)=0 \).

This explicitly requires that, with probability one, the realizations of the sequence converge to the realization of \( X \).

- This is qualitatively different from saying that the probability of seeing a discrepancy shrinks in the limit.

These two definitions are useful to introduce now, as they are similar to the notions of “weak” and “strong” convergence that we will use later in the numerical simulation of SDEs.

Finally, the mode of convergence we will use when considering the Itô and Stratonovich integrals defined earlier is mean-square convergence.

Mean-square convergence

Let \( X_k \) be a random sequence such that \( \mathbb{E}\left[X_k^2 \right] < \infty \) and let \( X \) be a random variable such that \( \mathbb{E}\left[X^2\right] <\infty \). The sequence, \( X_k \), is said to **converge in the mean-square** to \( X \) if \[ \begin{align} \lim_{k\rightarrow \infty}\mathbb{E}\left[\left(X_k - X\right)^2 \right] = 0. \end{align} \]

This is actually a **stronger notion of convergence than convergence in probability**, but it is **not comparable to almost sure convergence** in general.

- That is, mean-square convergence implies convergence in probability, but it does not imply almost sure convergence, nor does almost sure convergence imply mean-square convergence.

Let's consider an example case of the Itô integral for which \( h(t):=W_t \), i.e., the Itô integral of a Wiener process.

For this case, the Itô integral can be evaluated analytically because

\[ \begin{align} &\sum_{j=0}^{N-1} W_{t_j}\left(W_{t_{j+1}} - W_{t_j}\right)\\ =&\sum_{j=0}^{N-1} \frac{1}{2} \left[W_{t_{j+1}}^2 - W_{t_j}^2 -\left(W_{t_{j+1}} - W_{t_{j}} \right)^2 \right]\\ =&\frac{1}{2} \left(W_T^2 - W_0^2 \right) - \frac{1}{2}\sum_{j=0}^{N-1} \left(W_{t_{j+1}} - W_{t_j} \right)^2 \end{align} \]

Assume that the time-discretization is uniform, such that \( \mathrm{d}t := t_{j+1} - t_{j} = T/N \) for all \( j \).

Then, for \( \mathrm{d}W_j := W_{t_{j+1}} - W_{t_j} \) we have that

\[ \begin{align} \mathbb{E}\left[\left(\mathrm{d}W_j\right)^2 \right] = \mathrm{d}t \end{align} \] by definition of the Wiener process.

In the **“mean-square algebra”**, we then write identically \( \left(\mathrm{d}W_j\right)^2 \equiv \mathrm{d}t \), though the formal mathematics is suppressed here.

From the last slide, we thus obtain

\[ \begin{align} \sum_{j=0}^{N-1} \left(W_{t_{j+1}} - W_{t_j}\right)^2 = \sum_{j=0}^{N-1} \left(\mathrm{d}W_j\right)^2= \sum_{j=0}^{N-1} \mathrm{d}t = T \end{align} \]
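One way to see why this identification holds in the mean-square sense is the following short calculation. It is a sketch for the uniform partition with \( N \mathrm{d}t = T \), using only the independence of the increments and the Gaussian fourth moment \( \mathbb{E}\left[\left(\mathrm{d}W_j\right)^4\right] = 3\mathrm{d}t^2 \):

\[ \begin{align} \mathbb{E}\left[\left(\sum_{j=0}^{N-1} \left(\mathrm{d}W_j\right)^2 - T\right)^2\right] = \sum_{j=0}^{N-1} \mathrm{Var}\left[\left(\mathrm{d}W_j\right)^2\right] = \sum_{j=0}^{N-1} \left(3\mathrm{d}t^2 - \mathrm{d}t^2\right) = 2N\mathrm{d}t^2 = \frac{2T^2}{N} \rightarrow 0 \end{align} \]

as \( N\rightarrow\infty \). The sum of squared increments is thus a random variable for each finite \( N \), and it is only in the mean-square limit that it agrees with the constant \( T \).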

Noting that, by definition, \( W_0 \equiv 0 \), we thus obtain

\[ \begin{align} \int_{0}^T W_t \mathrm{d}W_t = \frac{1}{2} \left( W_T^2 - T\right). \end{align} \]

In the above, it is important to remember that this is an equality in the mean-square sense, such that both sides represent random variables for which

- on the **left-hand-side**, the **limit over the partition gives a sequence** such that;
- the **expected value of the square difference** with the **right-hand-side** **converges to zero**.

Therefore, we say that the Itô integral on the left-hand side converges in the mean-square sense to one half of the difference between the square of a Gaussian random variable, with mean zero and variance \( T \), and the constant \( T \).

This convergence of the integral to a random variable shows some of the subtlety of working with SDEs.
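The mean-square statement can be illustrated numerically. The sketch below is a Monte Carlo check under assumed settings (uniform grid, \( T = 1 \), 2000 sample paths, and the hypothetical helper name `ito_sum_W_dW`): it forms the left-endpoint sums for the integrand \( W_t \) and compares them, path by path, with \( \frac{1}{2}\left(W_T^2 - T\right) \).

```python
import numpy as np

def ito_sum_W_dW(T, N, n_paths, rng):
    """Left-endpoint sums sum_j W_{t_j} (W_{t_{j+1}} - W_{t_j}) and the claimed limit."""
    dt = T / N
    dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, N))
    # Paths with W_0 = 0 in the first column
    W = np.concatenate([np.zeros((n_paths, 1)), np.cumsum(dW, axis=1)], axis=1)
    ito = np.sum(W[:, :-1] * dW, axis=1)      # integrand evaluated at the left endpoint
    limit = 0.5 * (W[:, -1] ** 2 - T)         # (W_T^2 - T) / 2 on the same paths
    return ito, limit

rng = np.random.default_rng(1)
for N in (10, 100, 1000):
    ito, limit = ito_sum_W_dW(T=1.0, N=N, n_paths=2000, rng=rng)
    mse = np.mean((ito - limit) ** 2)         # sample estimate of E[(sum - limit)^2]
    print(N, mse)
```

For each finite \( N \) the two sides differ on every path, but the estimated mean-square error should decay roughly like \( 1/N \) as the partition is refined.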

Recall that in deterministic calculus, the fundamental theorem of calculus tells us that

\[ \begin{align} f(b) - f(a) = \int_{a}^b \frac{\mathrm{d}}{\mathrm{d}t}f(t) \mathrm{d}t \end{align} \] provided such a derivative exists over the interval.

For a perturbation in time \( \mathrm{d}t \), we will write \( \mathrm{d}W_t := W_{t+\mathrm{d}t} - W_t \), such that, for a smooth function \( f \), a Taylor expansion gives the second-order approximation

\[ \begin{align} f(W_t + \mathrm{d}W_t) - f(W_t) = f'(W_t)\mathrm{d}W_t + \frac{1}{2} f''(W_t)\mathrm{d}W_t^2 + \mathcal{O}\left( \mathrm{d}W_t^3\right). \end{align} \]

If we integrate the above over the interval \( [0,T] \), using the mean-square algebra \( \mathrm{d}W_t^2 \equiv \mathrm{d}t \), we obtain the first Itô lemma as

\[ \begin{align} f(W_T) - f(W_0) = \int_{0}^T f'(W_t)\mathrm{d}W_t + \frac{1}{2}\int_{0}^T f''(W_t) \mathrm{d}t. \end{align} \]

An important difference between **deterministic calculus** and **Itô calculus** is thus given in the above;

- the difference of the function \( f \) evaluated at the two times of the Wiener process is given by the **Itô stochastic integral** of \( f'(W_t) \) over the interval; and
- added to this is an additional **Riemann integral in the second derivative** of \( f \).

The mathematics of this is, again, quite complicated, but we will often use these equations as identities while suppressing the details.

Consider the Itô formula from the last slide

\[ \begin{align} f(W_T) - f(W_0) = \int_{0}^T f'(W_t)\mathrm{d}W_t +\frac{1}{2} \int_{0}^T f''(W_t) \mathrm{d}t. \end{align} \]

If we set \( f(x):= x^2 \), then \( f'(x) =2x \) and \( f''(x)=2 \), so that

\[ \begin{align} & W_T^2 - W_0^2 = 2 \int_{0}^T W_t\mathrm{d}W_t + \int_{0}^T\mathrm{d}t\\ \Leftrightarrow & \int_{0}^T W_t \mathrm{d}W_t = \frac{1}{2}\left(W_T^2 - T\right), \end{align} \] as discussed earlier.

This shows how, in part, we can formally manipulate stochastic integrals with Itô's formula, even when we suppress the details.
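The same formula can be checked numerically for other smooth functions. As an illustrative sketch, the choice \( f(x) = x^3 \) (an assumption made here, not an example from the lecture) gives \( f'(x) = 3x^2 \) and \( f''(x) = 6x \), so Itô's formula predicts \( W_T^3 = 3\int_0^T W_t^2 \mathrm{d}W_t + 3\int_0^T W_t \mathrm{d}t \). The code approximates both integrals with left-endpoint sums on a uniform grid.

```python
import numpy as np

rng = np.random.default_rng(2)
T, N, n_paths = 1.0, 1000, 1000           # illustrative settings
dt = T / N
dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, N))
W = np.concatenate([np.zeros((n_paths, 1)), np.cumsum(dW, axis=1)], axis=1)

# Left-hand side: f(W_T) - f(W_0) with f(x) = x^3
lhs = W[:, -1] ** 3 - W[:, 0] ** 3
# Ito integral of f'(W_t) = 3 W_t^2, using left-endpoint sums
ito_term = np.sum(3.0 * W[:, :-1] ** 2 * dW, axis=1)
# One half of the Riemann integral of f''(W_t) = 6 W_t
riemann_term = 0.5 * np.sum(6.0 * W[:, :-1] * dt, axis=1)

residual = lhs - (ito_term + riemann_term)
mse = np.mean(residual ** 2)              # should shrink as N grows
print(mse)
```

Dropping `riemann_term` from the residual leaves an error that does not vanish under refinement, which is the numerical signature of the extra second-derivative term.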

Additional extensions of the Itô-Taylor expansion provide a number of rules to formally work with SDEs.

The details of these expansions and results can be found in a more formal course on the subject.

There is a direct relation between the **Itô** and **Stratonovich** integrals of a smooth function \( f \), which is given by \[ \begin{align} \int_0^T f(W_t)\circ \mathrm{d}W_t = \int_0^T f(W_t)\mathrm{d}W_t + \frac{1}{2}\int_0^Tf'(W_t)\mathrm{d}t. \end{align} \]

This is to say that the **Stratonovich** integral is given by the **Itô** integral, plus a **Riemann** integral term on the right.

Using this relationship, we can show that the Stratonovich integral is actually formally more similar to the deterministic integral than the Itô integral.

If we define \( f(t)=g'(t) \), then the Itô-Taylor expansion similarly gives

\[ \begin{align} &g(W_T) - g(W_0) = \int_{0}^T g'(W_t)\mathrm{d}W_t +\frac{1}{2} \int_{0}^T g''(W_t) \mathrm{d}t\\ \Leftrightarrow & g(W_T) - g(W_0) = \int_0^T f(W_t) \mathrm{d}W_t + \frac{1}{2}\int_0^T f'(W_t)\mathrm{d}t\\ \Leftrightarrow &g(W_T) - g(W_0) = \int_0^T g'(W_t)\circ \mathrm{d}W_t \end{align} \] so that this looks like the deterministic fundamental theorem of calculus.

However, **Stratonovich calculus** is also subtle to work with, as the **midpoint rule** that defines the integral **implicitly relies on future information** for the value of the function, unlike the **Itô** formulation.
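The two conventions can be compared numerically for the integrand \( W_t \). The sketch below assumes a uniform partition and simulates each path on a grid twice as fine as the partition, so that the value of \( W \) at the midpoint time of each subinterval is available; the variable names and sample sizes are illustrative. For \( f(x) = x \), the relation between the two integrals predicts a difference of \( \frac{1}{2}\int_0^T \mathrm{d}t = T/2 \), and the Stratonovich sums should approach \( \frac{1}{2}W_T^2 \).

```python
import numpy as np

rng = np.random.default_rng(3)
T, N, n_paths = 1.0, 1000, 1000
dt_fine = T / (2 * N)                       # fine grid: two steps per subinterval
dW_fine = rng.normal(0.0, np.sqrt(dt_fine), size=(n_paths, 2 * N))
W_fine = np.concatenate(
    [np.zeros((n_paths, 1)), np.cumsum(dW_fine, axis=1)], axis=1
)

W_left = W_fine[:, :-1:2]    # W at t_j, j = 0, ..., N-1
W_mid = W_fine[:, 1::2]      # W at the midpoint time (t_j + t_{j+1}) / 2
W_right = W_fine[:, 2::2]    # W at t_{j+1}
dW = W_right - W_left        # increments over the coarse partition
W_T = W_fine[:, -1]

ito = np.sum(W_left * dW, axis=1)     # left-endpoint rule
strat = np.sum(W_mid * dW, axis=1)    # midpoint rule uses W after time t_j

print(np.mean(strat - ito))                    # should be close to T / 2
print(np.mean((strat - 0.5 * W_T ** 2) ** 2))  # should be close to zero
```

Note that `W_mid` is only known once the path has advanced past \( t_j \), which is how the dependence on future information appears in the discrete sums.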