Linear algebra is a fundamental concept for applying and understanding statistical methods in more than one variable.
More specifically, in data assimilation, algorithm design is strongly shaped by numerical stability and scalability.
We will not belabor the details and proofs of these results, as these can be found in other classes / books devoted to the subject.
Instead, standard numerical tools (particularly numpy) will be utilized. For this reason, these lectures will survey a variety of results from an applied perspective, providing intuition for how and why these tools are used.
We will start by introducing the basic characteristics of vectors / matrices, their operations and their implementation in Numpy.
Along the way, we will introduce some essential language and concepts about vector spaces, inner product spaces, linear transformations and important tools in applied matrix algebra.
To accommodate the flexibility of the Python programming environment, conventions around method namespaces and scope have been adopted.
We import numpy as a new object to call methods from:

import numpy as np

The tools we use from numpy will now be called from numpy as an object, with the form of the call looking like np.method().
Numpy has a method known as “array” for creating vector objects:
my_vector = np.array([1,2,3])
my_vector
array([1, 2, 3])
np.shape(my_vector)
(3,)
The array class in numpy handles both vector and matrix objects:

my_array = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
my_array
array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
np.shape(my_array)
(3, 3)
More generally, numpy arrays function as mathematical multi-linear arrays in arbitrary dimensions:

my_3D_array = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
my_3D_array
array([[[1, 2],
[3, 4]],
[[5, 6],
[7, 8]]])
np.shape(my_3D_array)
(2, 2, 2)
Mathematically, we will define the vector notations \( \pmb{x} \in \mathbb{R}^{N_x} \), matrix notations \( \mathbf{A} \in \mathbb{R}^{N_x \times N_x} \), and matrix-slice notations \( \mathbf{A}^j \in \mathbb{R}^{N_x} \) as
\[ \begin{align} \pmb{x} := \begin{pmatrix} x_1 \\ \vdots \\ x_{N_x} \end{pmatrix} & & \mathbf{A} := \begin{pmatrix} a_{1,1} & \cdots & a_{1, N_x} \\ \vdots & \ddots & \vdots \\ a_{N_x,1} & \cdots & a_{N_x, N_x} \end{pmatrix} & & \mathbf{A}^j := \begin{pmatrix} a_{1,j} \\ \vdots \\ a_{N_x, j} \end{pmatrix} \end{align} \]
Elements of the matrix \( \mathbf{A} \) may further be referred to by index in row and column as
\[ \mathbf{A}_{i,j} = \mathbf{A}\left[i,j\right] = a_{i,j} \]
In numpy, we can make a reference to sub-arrays analogously with the : slice notation:
my_array[0:2,0:3]
array([[1, 2, 3],
[4, 5, 6]])
my_array[:,0]
array([1, 4, 7])
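Keep in mind that numpy indices begin at zero. As a small illustration (using the my_array defined above), the entry \( a_{2,3} \) and the column \( \mathbf{A}^2 \) are referenced as:

my_array[1, 2]   # the entry a_{2,3}, using zero-based indices
6
my_array[:, 1]   # the second column, corresponding to A^2
array([2, 5, 8])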
Because arrays are understood as mathematical objects, they have inherent methods for mathematical computation.
Suppose we have two vectors
\[ \begin{align} \pmb{x} = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}\in\mathbb{R}^{3 \times 1} & & \pmb{y} = \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix}\in \mathbb{R}^{3\times 1} \end{align} \]
We can perform basic mathematical operations on these element-wise as follows
\[ \begin{align} \pmb{x} + \pmb{y} = \begin{pmatrix} x_1 + y_1 \\ x_2 + y_2 \\ x_3 + y_3 \end{pmatrix} & & \pmb{x}\circ\pmb{y} = \begin{pmatrix} x_1 * y_1 \\ x_2 * y_2 \\ x_3 * y_3 \end{pmatrix} \end{align} \]
Both of these operations generalize to vectors of arbitrary length.
x = np.array([1, 2, 3])
y = np.array([4, 5, 6])
x+y
array([5, 7, 9])
x*y
array([ 4, 10, 18])
The simple element-wise multiplication and addition of vectors can be performed on any arrays of matching dimensions, as above.
This type of multiplication is known as the Schur product of arrays.
The Schur product of arrays
Let \( \mathbf{A},\mathbf{B} \in \mathbb{R}^{N \times M} \) be arrays of arbitrary dimension with \( N,M \geq 1 \). The Schur product is defined \[ \begin{align} \mathbf{A}\circ \mathbf{B}:= \begin{pmatrix}a_{1,1} * b_{1,1} & \cdots & a_{1,M}* b_{1,M} \\ \vdots & \ddots & \vdots\\ a_{N,1}* b_{N,1} & \cdots & a_{N,M} * b_{N,M} \end{pmatrix} \end{align} \]
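In numpy, the same * operator implements the Schur product for arrays of matching shape in any dimension. As a small sketch with two illustrative \( 2\times 2 \) arrays (defined here only for demonstration):

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
A * B   # element-wise (Schur) product
array([[ 5, 12],
       [21, 32]])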
The array-valued Schur product is not the most widely-used array product; more common is the inner product (and the general matrix product), introduced next.
Notice that the two vectors \( \pmb{x} \) and \( \pmb{y} \) introduced previously were defined as column vectors
\[ \begin{align} \pmb{x} = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}\in\mathbb{R}^{3 \times 1} & & \pmb{y} = \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix}\in \mathbb{R}^{3\times 1} \end{align} \]
The transpose of \( \pmb{x} \) is defined as the row vector,
\[ \begin{align} \pmb{x}^\top = \begin{pmatrix} x_1 & x_2 & x_3 \end{pmatrix} \in \mathbb{R}^{1 \times 3} \end{align} \]
The standard, Euclidean vector inner product is defined for the vectors \( \pmb{x} \) and \( \pmb{y} \) as follows
\[ \begin{align} \pmb{x}^\top \pmb{y} = x_1 * y_1 + x_2 * y_2 + x_3 * y_3 \end{align} \]
That is, we multiply each entry of \( \pmb{x}^\top \) by the corresponding entry of \( \pmb{y} \) and take the sum of these products.
This generalizes to vectors of arbitrary length \( N_x \) as,
\[ \begin{align} \pmb{x}^\top \pmb{y} = \sum_{i=1}^{N_x} x_i * y_i \end{align} \]
The Euclidean inner product
For two vectors \( \pmb{x},\pmb{y} \in\mathbb{R}^{N_x} \), the Euclidean inner product is given as \[ \begin{align} \langle \pmb{x}, \pmb{y}\rangle := \pmb{x}^\top \pmb{y} = \sum_{i=1}^{N_x} \pmb{x}_i *\pmb{y}_i \end{align} \]
Euclidean norm
Let \( \pmb{x}\in\mathbb{R}^{N_x} \). The Euclidean norm of \( \pmb{x} \) is defined \[ \begin{align} \parallel \pmb{x}\parallel := \sqrt{ \sum_{i=1}^{N_x} x_i^2} \equiv \sqrt{\pmb{x}^\top\pmb{x}} \end{align} \]
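As a brief illustration with the vector x defined above, the norm can be computed either directly from the definition or with numpy's built-in routine:

np.sqrt(x @ x)   # norm computed from the definition, sqrt(x^T x)
3.7416573867739413
np.linalg.norm(x)   # numpy's built-in Euclidean norm
3.7416573867739413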
It is important to note that there are other distances that can be defined on \( \mathbb{R}^{N_x} \) besides the Euclidean distance.
Note, it can be shown that the Euclidean inner product satisfies
\[ \begin{align} \pmb{x}^\top\pmb{y} = \parallel \pmb{x} \parallel * \parallel \pmb{y} \parallel \cos\left(\theta\right), \end{align} \]
where \( \theta \) is the angle between the vectors \( \pmb{x} \) and \( \pmb{y} \).
Following our previous example, we will demonstrate the array transpose function and the dot product:
Recall that x had the following dimensions:
np.shape(x)
(3,)
Comparing x and its transpose, we see:

x
array([1, 2, 3])
np.transpose(x)
array([1, 2, 3])
This is due to the fact that numpy does not distinguish between row and column vectors. The transpose() function extends, however, to two-dimensional arrays in the usual fashion.

x.dot(y)
32
np.sum(x*y)
32
np.inner(x,y)
32
x @ y
32
The @ notation refers to general matrix multiplication, which we will discuss shortly.

Notice that the equation
\[ \begin{align} \pmb{x}^\top\pmb{y} = \parallel \pmb{x} \parallel * \parallel \pmb{y} \parallel \cos\left(\theta\right), \end{align} \]
generalizes the idea of a perpendicular angle between lines; this motivates the following definition.
Orthogonal vectors
We say that two vectors \( \pmb{x},\pmb{y} \) are orthogonal if and only if \[ \begin{align} \pmb{x}^\top \pmb{y} = 0 \end{align} \]
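For example, with two vectors chosen here only for illustration, orthogonality can be verified by checking that the inner product is zero:

u = np.array([1, 1, 0])
v = np.array([1, -1, 2])
u @ v   # inner product is zero, so u and v are orthogonal
0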
Notice that with scalar / vector multiplication defined as
\[ \tilde{\pmb{x}}:= \alpha * \pmb{x}:= \begin{pmatrix} \alpha * x_1 \\ \vdots \\ \alpha * x_{N_x}\end{pmatrix} \]
then, for any \( \alpha \neq 0 \),
\[ \begin{align} \tilde{\pmb{x}}^\top \pmb{y} = 0 & & \Leftrightarrow & & \pmb{x}^\top \pmb{y} = 0 \end{align} \]
This brings us to the important notions of linear combinations and subspaces.
Linear combination
Let \( n\geq 1 \) be an arbitrary integer, \( \alpha_i\in\mathbb{R} \) and \( \pmb{x}_i\in\mathbb{R}^{N_x} \) for each \( i=1,\cdots,n \). Then \[ \begin{align} \pmb{x} := \sum_{i=1}^n \alpha_i \pmb{x}_i \end{align} \] is a linear combination of the vectors \( \{\pmb{x}_i\}_{i=1}^{n} \).
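A linear combination is computed in numpy with the scalar / vector operations introduced earlier; a small sketch with illustrative vectors and coefficients:

x1 = np.array([1, 0, 0])
x2 = np.array([0, 1, 1])
2 * x1 + 3 * x2   # the linear combination with coefficients 2 and 3
array([2, 3, 3])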
Subspace
The collection of vectors \( V\subset \mathbb{R}^{N_x} \) is denoted a subspace if and only if, for any arbitrary collection of vectors \( \pmb{x}_i\in V \) and scalars \( \alpha_i\in\mathbb{R} \), their linear combination \[ \sum_{i=1}^n \alpha_i \pmb{x}_i = \pmb{x} \in V. \]
With the above linear combinations in mind, we will use the following notation
\[ \begin{align} \mathrm{span}\{\pmb{x}_i\}_{i=1}^n := \left\{\pmb{x}\in\mathbb{R}^{N_x} : \exists\text{ } \alpha_i \text{ for which } \pmb{x}= \sum_{i=1}^{n}\alpha_i \pmb{x}_i\right\} . \end{align} \]
It can be readily seen then that the span of any collection of vectors is a subspace by construction.
Linear independence / dependence
Let \( \pmb{x}\in\mathbb{R}^{N_x} \) and \( \pmb{x}_i\in\mathbb{R}^{N_x} \) for \( i=1,\cdots,n \). The vector \( \pmb{x} \) is linearly independent (respectively dependent) with the collection \( \{\pmb{x}_i\}_{i=1}^{n} \) if and only if \( \pmb{x}\notin \mathrm{span}\{\pmb{x}_i\}_{i=1}^{n} \) (respectively \( \pmb{x}\in \mathrm{span}\{\pmb{x}_i\}_{i=1}^{n} \)).
It is clear then that, e.g., \( \pmb{x}_1 \) is trivially linearly dependent with the collection \( \{\pmb{x}_i\}_{i=1}^n \), since \( \pmb{x}_1 \in \mathrm{span}\{\pmb{x}_i\}_{i=1}^n \).
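One practical way to examine linear dependence numerically (an illustrative approach, not part of the formal development here) is to compare the rank of the matrix whose columns are the vectors in question with the number of columns:

X = np.array([[1, 0, 1], [0, 1, 1], [0, 0, 0]])   # columns are x1, x2 and x1 + x2
np.linalg.matrix_rank(X)   # rank 2 < 3 columns, so the columns are linearly dependent
2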
A related idea is whether, for some vector \( \pmb{x}\in \mathrm{span}\{\pmb{x}_i\}_{i=1}^n \), the choice of the scalar coefficients \( \alpha_i \) defining \( \pmb{x}=\sum_{i=1}^n \alpha_i \pmb{x}_i \) is unique.
Bases
Let \( V\subset \mathbb{R}^{N_x} \) be a subspace. A collection \( \{\pmb{x}_i\}_{i=1}^{n} \) is said to be a basis for \( V \) if \( V = \mathrm{span}\{\pmb{x}_i\}_{i=1}^n \) and if \[ \begin{align} \pmb{0} = \sum_{i=1}^n \alpha_i \pmb{x}_i \end{align} \] holds if and only if \( \alpha_i=0 \) \( \forall i \).
In particular, a choice of a basis for \( V \) gives a unique coordinatization of any vector \( \pmb{x}\in V \).
\[ \begin{align} \pmb{x}=\sum_{i=1}^n \alpha_i \pmb{x}_i = \sum_{i=1}^n \beta_i \pmb{x}_i & & \Leftrightarrow & &\pmb{0} = \sum_{i=1}^n \left(\alpha_i - \beta_i \right) \pmb{x}_i, \end{align} \] so that, by the linear independence of the basis vectors, \( \beta_i = \alpha_i \) for all \( i \).
Orthogonal (Orthonormal) bases
Let \( \{\pmb{x}_i\}_{i=1}^n \) define a basis for \( V \subset \mathbb{R}^{N_x} \). The basis is said to be orthogonal if and only if each pair of basis vectors is orthogonal. A basis is said to be orthonormal if, moreover, each basis vector has norm equal to one. In particular, for an orthonormal basis, if \[ \begin{align} \pmb{x} = \sum_{i=1}^n \alpha_i \pmb{x}_i \end{align} \] then \( \alpha_i = \pmb{x}_i^\top \pmb{x} \).
The above property shows that we can recover the “projection” coefficient \( \alpha_i \) of \( \pmb{x} \) into \( V \) using the inner product of the vector \( \pmb{x} \) with the basis vector \( \pmb{x}_i \).
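As a small sketch, using the standard basis vectors of \( \mathbb{R}^2 \) as a simple orthonormal basis (vectors defined here only for illustration), the coefficients of a vector are recovered by inner products:

e1 = np.array([1.0, 0.0])
e2 = np.array([0.0, 1.0])
w = 2.0 * e1 + 3.0 * e2   # a vector with known coefficients in this basis
e1 @ w   # recovers the first coefficient
2.0
e2 @ w   # recovers the second coefficient
3.0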
Orthogonal subspaces
The subspaces \( W,V\subset \mathbb{R}^{N_x} \) are orthogonal if and only if for every \( \pmb{y}\in W \) and \( \pmb{x}\in V \), \[ \begin{align} \pmb{y}^\top \pmb{x} = 0. \end{align} \] Orthogonal subspaces will be denoted with \( W \perp V \).
With these constructions in mind, we can now introduce two of the most fundamental tools of inner product spaces: