A review of inner product spaces and matrix algebra

Outline

  • The following topics will be covered in this lecture:
    • An introduction to arrays in Python
    • Basic vector operations
    • Orthogonality
    • Subspaces
    • Orthogonal projection lemma
    • Gram-Schmidt and QR decomposition
    • Matrix / vector multiplication
    • Matrix / matrix multiplication
    • Special classes of matrices

A review of inner product spaces

  • Linear algebra is fundamental to applying and understanding statistical methods in more than one variable.

    • It is the basis for formulating multivariate distributions and random vectors, as well as for their analysis.
  • More specifically, in data assimilation, algorithm design is strongly shaped by numerical stability and scalability;

    • for this reason, an understanding of vector subspaces, projections and matrix factorizations is critical for performing dimensional / computational reductions.
  • We will not belabor the details and proofs of these results, as these can be found in other classes / books devoted to the subject.

    • Likewise, it will be assumed that in computation, optimized numerical linear algebra libraries like LAPACK and OpenBLAS (or their wrappers like numpy) will be utilized.
  • For this reason, these lectures will survey a variety of results from an applied perspective, providing intuition to how and why these tools are used.

  • We will start by introducing the basic characteristics of vectors / matrices, their operations and their implementation in Numpy.

  • Along the way, we will introduce some essential language and concepts about vector spaces, inner product spaces, linear transformations and important tools in applied matrix algebra.

Pythonic programming

  • Python uses several standard scientific libraries for numerical computing, data processing and visualization.
  • At the core, there is a Python interpreter that takes human-readable source code and turns it into instructions the machine executes.
  • This is the basic Python functionality, but there are extensive specialized libraries.
  • The most important of these for scientific computing are the following:
    1. Numpy – designed for large array manipulation with vectorized operations;
    2. Scipy – a library of numerical routines for the scientific computing ecosystem;
    3. Pandas – data structures and analysis tools inspired by the R DataFrame;
    4. Scikit-learn – a general regression and machine learning library;
    5. Matplotlib – a Matlab-inspired, object-oriented plotting and visualization library.

Numpy arrays

  • To accommodate the flexibility of the Python programming environment, conventions around method namespaces and scope have been adopted.

    • The convention is to utilize import statements to call methods of the library.
    • For example, we will import the library numpy as a new object to call methods from
import numpy as np
  • The tools we use from numpy will now be called from the np object, with calls taking the form np.method()

  • Numpy has a method known as “array” that constructs vectors and matrices:

my_vector = np.array([1,2,3])
my_vector
array([1, 2, 3])
  • Notice that we can identify properties of an array, such as its dimensions, as follows:
np.shape(my_vector)
(3,)

Numpy arrays continued

  • Arrays are the object class in numpy that handles both vector and matrix objects:
my_array = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
my_array
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])
np.shape(my_array)
(3, 3)

Numpy arrays continued

  • Note that numpy arrays also generalize mathematical matrices to multi-dimensional arrays (tensors) in arbitrary dimensions:
my_3D_array = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
my_3D_array
array([[[1, 2],
        [3, 4]],

       [[5, 6],
        [7, 8]]])
np.shape(my_3D_array)
(2, 2, 2)

Array notations

  • Mathematically, we will define the vector notations \( \pmb{x} \in \mathbb{R}^{N_x} \), matrix notations \( \mathbf{A} \in \mathbb{R}^{N_x \times N_x} \), and matrix-slice notations \( \mathbf{A}^j \in \mathbb{R}^{N_x} \) as

    \[ \begin{align} \pmb{x} := \begin{pmatrix} x_1 \\ \vdots \\ x_{N_x} \end{pmatrix} & & \mathbf{A} := \begin{pmatrix} a_{1,1} & \cdots & a_{1, N_x} \\ \vdots & \ddots & \vdots \\ a_{N_x,1} & \cdots & a_{N_x, N_x} \end{pmatrix} & & \mathbf{A}^j := \begin{pmatrix} a_{1,j} \\ \vdots \\ a_{N_x, j} \end{pmatrix} \end{align} \]

  • Elements of the matrix \( \mathbf{A} \) may further be referred to by index in row and column as

    \[ \mathbf{A}_{i,j} = \mathbf{A}\left[i,j\right] = a_{i,j} \]

  • In numpy, we can make a reference to sub-arrays analogously with the : slice notation:

my_array[0:2,0:3]
array([[1, 2, 3],
       [4, 5, 6]])
my_array[:,0]
array([1, 4, 7])
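  • Note that numpy indexing is zero-based, so the mathematical entry \( a_{i,j} \) corresponds to my_array[i-1, j-1]; for example, \( a_{2,3}=6 \) is retrieved as:
my_array[1, 2]
6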

Array operations

  • Because arrays are understood as mathematical objects, they have inherent methods for mathematical computation.

  • Suppose we have two vectors

    \[ \begin{align} \pmb{x} = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}\in\mathbb{R}^{3 \times 1} & & \pmb{y} = \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix}\in \mathbb{R}^{3\times 1} \end{align} \]

  • We can perform basic mathematical operations on these element-wise as follows

    \[ \begin{align} \pmb{x} + \pmb{y} = \begin{pmatrix} x_1 + y_1 \\ x_2 + y_2 \\ x_3 + y_3 \end{pmatrix} & & \pmb{x}\circ\pmb{y} = \begin{pmatrix} x_1 * y_1 \\ x_2 * y_2 \\ x_3 * y_3 \end{pmatrix} \end{align} \]

  • Both of these operations generalize to vectors of arbitrary length.

Numpy arrays continued

  • In Python the syntax for such operations is given by
x = np.array([1, 2, 3])
y = np.array([4, 5, 6])
x+y
array([5, 7, 9])
x*y
array([ 4, 10, 18])
  • The simple, element-wise multiplication and addition of vectors can be performed on any arrays of matching dimension as above.

  • This type of multiplication is known as the Schur (or Hadamard) product of arrays.

The Schur product of arrays
Let \( \mathbf{A},\mathbf{B} \in \mathbb{R}^{N \times M} \) be arrays of arbitrary dimension with \( N,M \geq 1 \). The Schur product is defined \[ \begin{align} \mathbf{A}\circ \mathbf{B}:= \begin{pmatrix}a_{1,1} * b_{1,1} & \cdots & a_{1,M}* b_{1,M} \\ \vdots & \ddots & \vdots\\ a_{N,1}* b_{N,1} & \cdots & a_{N,M} * b_{N,M} \end{pmatrix} \end{align} \]
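  • As a quick illustration (a minimal sketch with two hypothetical 2×2 arrays), the Schur product is again computed in numpy with the * operator:
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
A * B
array([[ 5, 12],
       [21, 32]])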

Euclidean inner product

  • The array-valued Schur product is not the most widely-used array product;

    • rather, the scalar-valued inner product and its extension to general matrix multiplication will be more common.
  • Notice that the two previously defined vectors \( \pmb{x} \) and \( \pmb{y} \) were defined as column vectors

    \[ \begin{align} \pmb{x} = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}\in\mathbb{R}^{3 \times 1} & & \pmb{y} = \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix}\in \mathbb{R}^{3\times 1} \end{align} \]

  • The transpose of \( \pmb{x} \) is defined as the row vector,

    \[ \begin{align} \pmb{x}^\top = \begin{pmatrix} x_1 & x_2 & x_3 \end{pmatrix} \in \mathbb{R}^{1 \times 3} \end{align} \]

  • The standard, Euclidean vector inner product is defined for the vectors \( \pmb{x} \) and \( \pmb{y} \) as follows

    \[ \begin{align} \pmb{x}^\top \pmb{y} = x_1 * y_1 + x_2 * y_2 + x_3 * y_3 \end{align} \]

  • That is, we multiply each row entry of \( \pmb{x}^\top \) by the corresponding column entry of \( \pmb{y} \) and take the sum of these products.

  • This generalizes to vectors of arbitrary length \( N_x \) as,

    \[ \begin{align} \pmb{x}^\top \pmb{y} = \sum_{i=1}^{N_x} x_i * y_i \end{align} \]

The Euclidean norm and inner product

The Euclidean inner product
For two vectors \( \pmb{x},\pmb{y} \in\mathbb{R}^{N_x} \), the Euclidean inner product is given as \[ \begin{align} \langle \pmb{x}, \pmb{y}\rangle := \pmb{x}^\top \pmb{y} = \sum_{i=1}^{N_x} x_i * y_i \end{align} \]
  • The Euclidean norm that follows arises by formally extending the Euclidean distance formula to arbitrary dimensions.
Euclidean norm
Let \( \pmb{x}\in\mathbb{R}^{N_x} \). The Euclidean norm of \( \pmb{x} \) is defined \[ \begin{align} \parallel \pmb{x}\parallel := \sqrt{ \sum_{i=1}^{N_x} x_i^2} \equiv \sqrt{\pmb{x}^\top\pmb{x}} \end{align} \]
  • It is important to note that there are other distances that can be defined on \( \mathbb{R}^{N_x} \) besides the Euclidean distance;

    • in particular, the Euclidean distance represents a “flat” distance in all directions without any preference or penalty.
  • Note, it can be shown that the Euclidean inner product satisfies

    \[ \begin{align} \pmb{x}^\top\pmb{y} = \parallel \pmb{x} \parallel * \parallel \pmb{y} \parallel \cos\left(\theta\right), \end{align} \] where,

    1. \( \parallel \pmb{x}\parallel \) refers to the Euclidean length of the vector, defined as \( \parallel \pmb{x}\parallel =\sqrt{\pmb{x}^\top\pmb{x}} \); and
    2. \( \theta \) is the angle formed by the two vectors \( \pmb{x} \) and \( \pmb{y} \) at the origin \( \boldsymbol{0} \).
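  • As a quick numerical check of this identity (a minimal sketch using the vectors x and y defined earlier):
# recover the angle between x and y from the inner product identity
theta = np.arccos(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))
# theta is about 12.9 degrees
np.round(theta, 4)
0.2257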

The Euclidean norm and inner product

  • Following our previous example, we will demonstrate the array transpose function and the dot product:

  • Recall that x had the following dimensions

np.shape(x)
(3,)
  • If we compare x and its transpose, we see
x
array([1, 2, 3])
np.transpose(x)
array([1, 2, 3])
  • This is because one-dimensional numpy arrays do not distinguish between row and column vectors.

    • The transpose() function extends, however, to two-dimensional arrays in the usual fashion.

The Euclidean norm and inner product

  • We can therefore compute the “dot” or Euclidean inner product several different ways:
x.dot(y)
32
np.sum(x*y)
32
np.inner(x,y)
32
x @ y
32
  • The @ notation refers to general matrix multiplication, which we will discuss shortly.

Orthogonality

  • Notice that the equation

    \[ \begin{align} \pmb{x}^\top\pmb{y} = \parallel \pmb{x} \parallel * \parallel \pmb{y} \parallel \cos\left(\theta\right), \end{align} \]

    generalizes the idea of a perpendicular angle between lines;

    • the above product is zero for nonzero vectors if and only if \( \theta = \frac{\pi}{2} + k *\pi \) for some \( k\in \mathbb{Z} \).
Orthogonal vectors
We say that two vectors \( \pmb{x},\pmb{y} \) are orthogonal if and only if \[ \begin{align} \pmb{x}^\top \pmb{y} = 0 \end{align} \]
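  • For example (a minimal sketch with two hypothetical vectors), we can verify orthogonality numerically:
u = np.array([1, -1, 0])
v = np.array([1, 1, 2])
u @ v
0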
  • Notice that with scalar / vector multiplication defined as

    \[ \tilde{\pmb{x}}:= \alpha * \pmb{x}:= \begin{pmatrix} \alpha * x_1 \\ \vdots \\ \alpha * x_{N_x}\end{pmatrix} \]

    then

    \[ \begin{align} \tilde{\pmb{x}}^\top \pmb{y} = 0 & & \Leftrightarrow & & \pmb{x}^\top \pmb{y} = 0 \end{align} \] for any \( \alpha \neq 0 \)

  • This brings us to the important notions of linear combinations and subspaces.

Linear combinations and subspaces

  • The scalar multiples of \( \pmb{x} \) give a simple example of linear combinations of vectors.
Linear combination
Let \( n\geq 1 \) be an arbitrary integer, \( \alpha_i\in\mathbb{R} \) and \( \pmb{x}_i\in\mathbb{R}^{N_x} \) for each \( i=1,\cdots,n \). Then \[ \begin{align} \pmb{x} := \sum_{i=1}^n \alpha_i \pmb{x}_i \end{align} \] is a linear combination of the vectors \( \{\pmb{x}_i\}_{i=1}^{n} \).
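  • For example (a minimal sketch), a linear combination is formed in numpy with scalar multiplication and addition:
2 * np.array([1, 0, 0]) + 3 * np.array([0, 1, 0])
array([2, 3, 0])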
  • A subspace can then be defined from linear combinations of vectors as follows.
Subspace
The collection of vectors \( V\subset \mathbb{R}^{N_x} \) is called a subspace if and only if, for any collection of vectors \( \pmb{x}_i\in V \) and scalars \( \alpha_i\in\mathbb{R} \), their linear combination satisfies \[ \sum_{i=1}^n \alpha_i \pmb{x}_i = \pmb{x} \in V. \]
  • With the above linear combinations in mind, we will use the following notation

    \[ \begin{align} \mathrm{span}\{\pmb{x}_i\}_{i=1}^n := \left\{\pmb{x}\in\mathbb{R}^{N_x} : \exists\text{ } \alpha_i \text{ for which } \pmb{x}= \sum_{i=1}^{n}\alpha_i \pmb{x}_i\right\} . \end{align} \]

  • It can be readily seen then that the span of any collection of vectors is a subspace by construction.

Linear independence and bases

  • Related notions are linear independence, linear dependence, and bases.
Linear independence / dependence
Let \( \pmb{x}\in\mathbb{R}^{N_x} \) and \( \pmb{x}_i\in\mathbb{R}^{N_x} \) for \( i=1,\cdots,n \). The vector \( \pmb{x} \) is linearly independent (respectively dependent) with the collection \( \{\pmb{x}_i\}_{i=1}^{n} \) if and only if \( \pmb{x}\notin \mathrm{span}\{\pmb{x}_i\}_{i=1}^{n} \) (respectively \( \pmb{x}\in \mathrm{span}\{\pmb{x}_i\}_{i=1}^{n} \)).
  • It is clear then that, e.g., \( \pmb{x}_1 \) is trivially linearly dependent with the collection \( \{\pmb{x}_i\}_{i=1}^n \), as \( \pmb{x}_1\in\mathrm{span}\{\pmb{x}_i\}_{i=1}^n \).

  • A related idea is whether, for some vector \( \pmb{x}\in \mathrm{span}\{\pmb{x}_i\}_{i=1}^n \), the choice of the scalar coefficients \( \alpha_i \) defining \( \pmb{x}=\sum_{i=1}^n \alpha_i \pmb{x}_i \) is unique.

Bases
Let \( V\subset \mathbb{R}^{N_x} \) be a subspace. A collection \( \{\pmb{x}_i\}_{i=1}^{n} \) is said to be a basis for \( V \) if \( V = \mathrm{span}\{\pmb{x}_i\}_{i=1}^n \) and if \[ \begin{align} \pmb{0} = \sum_{i=1}^n \alpha_i \pmb{x}_i \end{align} \] holds if and only if \( \alpha_i=0 \) \( \forall i \).
  • In particular, a choice of a basis for \( V \) gives a unique coordinatization of any vector \( \pmb{x}\in V \).

    • If we suppose there existed two coordinatizations for a vector \( \pmb{x} \) in the basis \( \{\pmb{x}_i\}_{i=1}^n \),

    \[ \begin{align} \pmb{x}=\sum_{i=1}^n \alpha_i \pmb{x}_i = \sum_{i=1}^n \beta_i \pmb{x}_i & & \Leftrightarrow & &\pmb{0} = \sum_{i=1}^n \left(\alpha_i - \beta_i \right) \pmb{x}_i \end{align} \] so that the linear independence of the basis forces \( \beta_i = \alpha_i \) for all \( i \).
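  • Numerically, linear independence of a collection can be checked by stacking the vectors as columns and computing the matrix rank; a minimal sketch using np.linalg.matrix_rank:
# two vectors in R^3 stored as the columns of A
A = np.array([[1.0, 1.0],
              [0.0, 1.0],
              [0.0, 0.0]])
# the rank equals the number of columns, so the columns are linearly independent
np.linalg.matrix_rank(A)
2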

Orthogonal bases and subspaces

  • When we define a choice of inner product, such as the Euclidean inner product, a special class of basis is often useful for theoretical / computational purposes.
Orthogonal (Orthonormal) bases
Let \( \{\pmb{x}_i\}_{i=1}^n \) define a basis for \( V \subset \mathbb{R}^{N_x} \). The basis is said to be orthogonal if and only if each pair of basis vectors is orthogonal. A basis is said to be orthonormal if, moreover, each basis vector has norm equal to one. In particular, for an orthonormal basis, if \[ \begin{align} \pmb{x} = \sum_{i=1}^n \alpha_i \pmb{x}_i \end{align} \] then \( \alpha_i = \pmb{x}_i^\top \pmb{x} \).
  • The above property shows that we can recover the “projection” coefficient \( \alpha_i \) of \( \pmb{x} \) into \( V \) using the inner product of the vector \( \pmb{x} \) with the basis vector \( \pmb{x}_i \).

    • This is a critical property, which we will generalize after we define orthogonal subspaces.
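  • As a quick numerical check of this property (a minimal sketch with a hypothetical orthonormal basis of \( \mathbb{R}^2 \)):
# an orthonormal basis for R^2, rotated 45 degrees from the standard basis
e1 = np.array([1.0, 1.0]) / np.sqrt(2)
e2 = np.array([1.0, -1.0]) / np.sqrt(2)
w = np.array([3.0, 1.0])
# recover the coefficients by inner products, then reconstruct w
a1, a2 = e1 @ w, e2 @ w
a1 * e1 + a2 * e2
array([3., 1.])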
Orthogonal subspaces
The subspaces \( W,V\subset \mathbb{R}^{N_x} \) are orthogonal if and only if for every \( \pmb{y}\in W \) and \( \pmb{x}\in V \), \[ \begin{align} \pmb{y}^\top \pmb{x} = 0. \end{align} \] Orthogonal subspaces will be denoted with \( W \perp V \).
  • With these constructions in mind, we can now introduce two of the most fundamental tools of inner product spaces:

    • orthogonal projections; and
    • the Gram-Schmidt process.

Orthogonal projections

  • When we think of orthogonal projections, we can think about the way the shadow of an object is projected onto two dimensions by the sun.
    • In particular, the orthogonal projection corresponds to high noon, with the sun directly overhead.
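  • As a minimal sketch of this intuition, the high-noon shadow of a point corresponds to zeroing its vertical component, which is an orthogonal projection onto the horizontal plane:
# orthogonal projection of a hypothetical point onto the x-y plane
P = np.diag([1.0, 1.0, 0.0])
p = np.array([1.0, 2.0, 3.0])
P @ p
array([1., 2., 0.])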