
- The following topics will be covered in this lecture:
- Basic vector properties
- Vector inner products
- Basic matrix properties
- Matrix multiplication and invariants

Matrix algebra is a fundamental concept for applying and understanding statistical methods in more than one variable.

- This is at the basis of **formulating multivariate distributions and random vectors, as well as their analysis**.

We will start by recalling a few basic ideas about vectors and their properties, such as the **inner product and the norm**. Then we will introduce the basic characteristics of matrices, their operations, and their implementation in R.

Thereafter, other operations, such as the **inverse and the solution to linear equations**, will be introduced. Finally, we will introduce some basic properties of **norms and the matrix spectrum**, with an emphasis on certain classes of matrices.

- We have had a lot of practice using vectors generally, such as with the colon (sequence) operator

```
1:10
```

```
[1] 1 2 3 4 5 6 7 8 9 10
```

We should introduce some basic mathematical properties of vectors and their analysis.

Suppose we have two vectors

\[ \begin{align} \mathbf{a} = \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix}\in\mathbb{R}^{3 \times 1} & & \mathbf{b} = \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix}\in \mathbb{R}^{3\times 1} \end{align} \]

We can perform basic mathematical operations on these element-wise as follows

\[ \begin{align} \mathbf{a} + \mathbf{b} = \begin{pmatrix} a_1 + b_1 \\ a_2 + b_2 \\ a_3 + b_3 \end{pmatrix} & & \mathbf{a}*\mathbf{b} = \begin{pmatrix} a_1 * b_1 \\ a_2 * b_2 \\ a_3 * b_3 \end{pmatrix} \end{align} \]

Both of these operations generalize to vectors of arbitrary length.

- However, the **above multiplication rule is rarely used in practice**, as it is not as meaningful as the scalar-valued product considered next.

Notice that the two previously defined vectors \( \mathbf{a} \) and \( \mathbf{b} \) were defined as column vectors

\[ \begin{align} \mathbf{a} = \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix}\in\mathbb{R}^{3 \times 1} & & \mathbf{b} = \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix}\in \mathbb{R}^{3\times 1} \end{align} \]

The **transpose of \( \mathbf{a} \)** is **defined as the row vector**,

\[ \begin{align} \mathbf{a}^\mathrm{T} = \begin{pmatrix} a_1 & a_2 & a_3 \end{pmatrix} \in \mathbb{R}^{1 \times 3} \end{align} \]
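As a quick illustration (a minimal sketch), R's `t()` function computes the transpose; applied to a plain vector, it returns a \( 1 \times 3 \) row matrix:

```
a <- 1:3

# t() transposes its argument; a plain vector is treated as a column,
# so the result is a 1x3 row matrix
t(a)
```

```
     [,1] [,2] [,3]
[1,]    1    2    3
```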

The **standard vector inner product** is defined for the vectors \( \mathbf{a} \) and \( \mathbf{b} \) as follows,

\[ \begin{align} \mathbf{a}^\mathrm{T} \mathbf{b} = a_1 * b_1 + a_2 * b_2 + a_3 * b_3 \end{align} \]

That is, we **take each row element from \( \mathbf{a}^\mathrm{T} \)**, **multiply it by the corresponding column element of \( \mathbf{b} \)**, and **take the sum of these products**. This generalizes to vectors of arbitrary length \( n \) as,

\[ \begin{align} \mathbf{a}^\mathrm{T} \mathbf{b} = \sum_{i=1}^n a_i * b_i \end{align} \]

This rule also defines the general form of matrix multiplication that we will consider shortly.

Let's consider a quick example of the inner product of two vectors.

- We will write this manually in the form of the equation we derived earlier, first in terms of the element-wise product

```
a <- 1:3
b <- 4:6
a * b
```

```
[1] 4 10 18
```

- Taking the sum, we obtain the inner product

```
sum(a*b)
```

```
[1] 32
```
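- Equivalently, the matrix product operator `%*%` computes the inner product directly, returning the result as a \( 1 \times 1 \) matrix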

```
t(a)%*%b
```

```
[,1]
[1,] 32
```

Mathematically, the standard inner product can be described as follows,

\[ \begin{align} \mathbf{a}^\mathrm{T}\mathbf{b} = \parallel \mathbf{a} \parallel * \parallel \mathbf{b} \parallel \cos\left(\theta\right), \end{align} \]

where,

\( \parallel \mathbf{a}\parallel \) refers to the Euclidean length of the vector, defined as \( \parallel \mathbf{a}\parallel^2 =\mathbf{a}^\mathrm{T}\mathbf{a} \); and

\( \theta \) is the angle formed by the two vectors \( \mathbf{a} \) and \( \mathbf{b} \) at the origin \( \boldsymbol{0} \).
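As a minimal sketch of this identity, reusing the vectors `a` and `b` from the earlier example, we might compute the norms and recover the angle in R as follows:

```
# Euclidean norms via the inner product, ||a||^2 = a^T a
norm_a <- sqrt(sum(a * a))
norm_b <- sqrt(sum(b * b))

# recover the angle between a and b from the inner product identity
theta <- acos(sum(a * b) / (norm_a * norm_b))
theta  # approximately 0.226 radians
```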

There are other ways to define the length of a vector that do not use the inner product as above, but we will be more interested in these ideas in the case of matrices.

The above standard inner product is also a special case of **general matrix multiplication**.

- We have developed a basic use of matrices in R already, often encoding data into a matrix or a dataframe format,

```
require("faraway")
gala_mat <- as.matrix(gala)
gala_mat
```

```
Species Endemics Area Elevation Nearest Scruz Adjacent
Baltra 58 23 25.09 346 0.6 0.6 1.84
Bartolome 31 21 1.24 109 0.6 26.3 572.33
Caldwell 3 3 0.21 114 2.8 58.7 0.78
Champion 25 9 0.10 46 1.9 47.4 0.18
Coamano 2 1 0.05 77 1.9 1.9 903.82
Daphne.Major 18 11 0.34 119 8.0 8.0 1.84
Daphne.Minor 24 0 0.08 93 6.0 12.0 0.34
Darwin 10 7 2.33 168 34.1 290.2 2.85
Eden 8 4 0.03 71 0.4 0.4 17.95
Enderby 2 2 0.18 112 2.6 50.2 0.10
Espanola 97 26 58.27 198 1.1 88.3 0.57
Fernandina 93 35 634.49 1494 4.3 95.3 4669.32
Gardner1 58 17 0.57 49 1.1 93.1 58.27
Gardner2 5 4 0.78 227 4.6 62.2 0.21
Genovesa 40 19 17.35 76 47.4 92.2 129.49
Isabela 347 89 4669.32 1707 0.7 28.1 634.49
Marchena 51 23 129.49 343 29.1 85.9 59.56
Onslow 2 2 0.01 25 3.3 45.9 0.10
Pinta 104 37 59.56 777 29.1 119.6 129.49
Pinzon 108 33 17.95 458 10.7 10.7 0.03
Las.Plazas 12 9 0.23 94 0.5 0.6 25.09
Rabida 70 30 4.89 367 4.4 24.4 572.33
SanCristobal 280 65 551.62 716 45.2 66.6 0.57
SanSalvador 237 81 572.33 906 0.2 19.8 4.89
SantaCruz 444 95 903.82 864 0.6 0.0 0.52
SantaFe 62 28 24.08 259 16.5 16.5 0.52
SantaMaria 285 73 170.92 640 2.6 49.2 0.10
Seymour 44 16 1.84 147 0.6 9.6 25.09
Tortuga 16 8 1.24 186 6.8 50.9 17.95
Wolf 21 12 2.85 253 34.1 254.7 2.33
```

There are several special matrices that are frequently encountered in practical and theoretical work.

Diagonal matrices are special matrices where all off-diagonal elements are equal to 0;

- i.e., the matrix \( \mathbf{A}\in\mathbb{R}^{n\times p} \) is a diagonal matrix if \( a_{ij} = 0 \) for all \( i\neq j \).

The function `diag()` extracts the main diagonal of a matrix in R,

```
dim(gala_mat)
```

```
[1] 30 7
```

```
diag(gala_mat)
```

```
[1] 58.00 21.00 0.21 46.00 1.90 8.00 0.34
```

```
for (i in 1:7) {print(gala_mat[i,i])}
```

```
[1] 58
[1] 21
[1] 0.21
[1] 46
[1] 1.9
[1] 8
[1] 0.34
```

- We can also use the `diag()` function to produce a diagonal matrix,

```
diag(3)
```

```
[,1] [,2] [,3]
[1,] 1 0 0
[2,] 0 1 0
[3,] 0 0 1
```

```
diag(2,3)
```

```
[,1] [,2] [,3]
[1,] 2 0 0
[2,] 0 2 0
[3,] 0 0 2
```

```
diag(1:3)
```

```
[,1] [,2] [,3]
[1,] 1 0 0
[2,] 0 2 0
[3,] 0 0 3
```

Diagonal matrices have the benefit that their arithmetic behaves like regular scalar algebra.

- Particularly, operations can be considered element-wise on the diagonal

```
diag(1:3) - diag(2,3)
```

```
[,1] [,2] [,3]
[1,] -1 0 0
[2,] 0 0 0
[3,] 0 0 1
```

```
diag(1:3) * diag(2,3)
```

```
[,1] [,2] [,3]
[1,] 2 0 0
[2,] 0 4 0
[3,] 0 0 6
```

```
diag(1:3) %*% diag(2,3)
```

```
[,1] [,2] [,3]
[1,] 2 0 0
[2,] 0 4 0
[3,] 0 0 6
```

Note that in the R language `*` corresponds to element-wise multiplication.

Define the two matrices,

\[ \begin{align} A = \begin{pmatrix} a_1 & a_2 & a_3 \\ a_4 & a_5 & a_6 \\ a_7 & a_8 & a_9 \end{pmatrix} & & B = \begin{pmatrix} b_1 & b_2 & b_3 \\ b_4 & b_5 & b_6 \\ b_7 & b_8 & b_9 \end{pmatrix} \end{align} \]

Their element-wise product is then,

\[ \begin{align} A * B = \begin{pmatrix} a_1 * b_1 & a_2 * b_2 & a_3 * b_3 \\ a_4 * b_4 & a_5 * b_5 & a_6 * b_6 \\ a_7 * b_7 & a_8 * b_8 & a_9 * b_9 \end{pmatrix} \end{align} \]
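As a minimal sketch in R, with two small hypothetical matrices:

```
# element-wise product of two 3x3 matrices with `*`
# (this is not the matrix product %*%)
A <- matrix(1:9, nrow=3, byrow=TRUE)
B <- matrix(10:18, nrow=3, byrow=TRUE)
A * B
```

```
     [,1] [,2] [,3]
[1,]   10   22   36
[2,]   52   70   90
[3,]  112  136  162
```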

Note that this is not the same as the **matrix product defined by `%*%`** in general.

The above element-wise product is like the element-wise vector product;

- this will only be occasionally considered, unlike the matrix product, defined in terms of the vector inner product, which underpins all linear algebra.

As a simple case that leads to general matrix multiplication, let us consider **matrix-vector multiplication**. Let's suppose that

\[ \begin{align} \mathbf{N} \in \mathbb{R}^{N \times p} & & \mathbf{x} \in \mathbb{R}^{p \times 1} \end{align} \]

We will suppose that we can write the following

\[ \mathbf{N} = \begin{pmatrix} \mathbf{n}_1^\mathrm{T} \\ \vdots \\ \mathbf{n}_N^\mathrm{T} \end{pmatrix} \] where each \( \mathbf{n}_i^\mathrm{T} \in \mathbb{R}^{1 \times p} \) is a row of the matrix \( \mathbf{N} \).

Then, recall the vector inner product, \[ \mathbf{n}_i^\mathrm{T} \mathbf{x} = \sum_{j=1}^p n_{i,j} * x_{j} \]

The product of the matrix and the vector is given as, \[ \mathbf{N}\mathbf{x} = \begin{pmatrix} \mathbf{n}^\mathrm{T}_1 \mathbf{x} \\ \vdots \\ \mathbf{n}_N^\mathrm{T} \mathbf{x} \end{pmatrix} \]

This type of multiplication is commonly known as **row-versus-column multiplication**. Particularly, for a simple matrix and vector pair,

\[ \begin{align} \mathbf{A} = \begin{pmatrix} 1 & 2 \\ 3 & 4\end{pmatrix} & & \mathbf{x} = \begin{pmatrix} 1 \\ 1 \end{pmatrix} \end{align} \]

we recover the product

\[ \begin{align} \mathbf{A}\mathbf{x} = \begin{pmatrix} 1 * 1 + 2 * 1 \\ 3 * 1 + 4 * 1 \end{pmatrix} = \begin{pmatrix} 3 \\ 7 \end{pmatrix} \end{align} \]
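We can verify this small computation directly in R (a minimal sketch; note that `matrix()` fills its entries column by column by default):

```
# define the 2x2 matrix A and the vector of ones x
A <- matrix(c(1, 3, 2, 4), nrow=2, ncol=2)
x <- c(1, 1)

# row-versus-column multiplication
A %*% x
```

```
     [,1]
[1,]    3
[2,]    7
```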

- For two general matrices, \[ \begin{align} \mathbf{N} \in \mathbb{R}^{N \times p} & & \mathbf{M} \in \mathbb{R}^{p \times M} \end{align} \]
- Let us write these matrices in terms of sub-vectors as, \[ \begin{align} \mathbf{N} = \begin{pmatrix} \mathbf{n}_1^\mathrm{T} \\ \vdots \\ \mathbf{n}_N^\mathrm{T} \end{pmatrix} & & \mathbf{M} = \begin{pmatrix} \mathbf{m}_1 & \cdots & \mathbf{m}_M\end{pmatrix} \end{align} \] where
- \( \mathbf{n}_i^\mathrm{T} \in \mathbb{R}^{1 \times p} \) is the \( i \)-th row vector in the matrix \( \mathbf{N} \in \mathbb{R}^{N\times p} \); and
- \( \mathbf{m}_i \in \mathbb{R}^{p\times 1} \) is the \( i \)-th column vector in the matrix \( \mathbf{M}\in \mathbb{R}^{p \times M} \)
- We have their product defined as, \[ \begin{align} \mathbf{N} \mathbf{M} = \begin{pmatrix} \mathbf{n}_1^\mathrm{T} \mathbf{m}_1 & \mathbf{n}_1^\mathrm{T} \mathbf{m}_2 & \cdots & \mathbf{n}_1^\mathrm{T} \mathbf{m}_M \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{n}_N^\mathrm{T} \mathbf{m}_1 & \mathbf{n}_N^\mathrm{T} \mathbf{m}_2 & \cdots & \mathbf{n}_N^\mathrm{T} \mathbf{m}_M \end{pmatrix} \in \mathbb{R}^{N\times M} \end{align} \]
- Notice that for the product to make sense, **the dimensionality has to match between the \( p \) columns in \( \mathbf{N} \) and the \( p \) rows in \( \mathbf{M} \)**.
- This **inner dimension \( p \) is eliminated in the product** of the matrices to form the final dimensionality of \( N \times M \), as in the sketch below.
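As a minimal sketch of the dimension matching, consider hypothetical matrices with \( N = 2 \), \( p = 3 \), and \( M = 2 \):

```
# a 2x3 matrix times a 3x2 matrix: the inner dimension p = 3 must match
# and is eliminated, leaving a 2x2 result
N <- matrix(1:6, nrow=2, ncol=3)
M <- matrix(1:6, nrow=3, ncol=2)
dim(N %*% M)
```

```
[1] 2 2
```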

Notice that given our previous definitions, this means that **matrix multiplication and element-wise multiplication are equivalent for diagonal matrices**.

- Particularly, **all the operations reduce to elements on the diagonal**, as all off-diagonal terms vanish.

**Not every matrix can be reduced to a diagonal matrix**, but **many can be reduced to almost-diagonal or other useful forms** under various coordinate transformations.

Matrices can be considered a **coordinate representation of a general, linear transformation**;

- the **choice of coordinates will affect how the linear transformation is represented as a matrix**.

A property of the linear transformation that does not depend on the choice of coordinates is called an **invariant of the matrix under coordinate transformation**.

The most basic invariant we can consider is the **trace of a matrix**. For an arbitrary matrix of size \( n\times p \) with entries \( \mathbf{A}_{ij} = a_{ij} \), we define this as,

\[ \mathrm{tr}\left(\mathbf{A}\right) = \sum_{i=1}^{\mathrm{min}(n,p)} a_{ii} \]

i.e., **the sum of all diagonal elements**.

**Q:** suppose we randomly generate a matrix as follows:

```
set.seed(0)
my_matrix <- matrix(rnorm(16), nrow=4, ncol=4)
my_matrix
```

```
[,1] [,2] [,3] [,4]
[1,] 1.2629543 0.4146414 -0.005767173 -1.1476570
[2,] -0.3262334 -1.5399500 2.404653389 -0.2894616
[3,] 1.3297993 -0.9285670 0.763593461 -0.2992151
[4,] 1.2724293 -0.2947204 -0.799009249 -0.4115108
```

- How can we find the trace of this matrix in R?

**A:** we can use the `diag` and `sum` functions to obtain

```
sum(diag(my_matrix))
```

```
[1] 0.07508687
```

The trace is relatively easy to compute, but its meaning can take a while to appreciate.

- We will see one particularly useful form of this in the Frobenius norm of a matrix later on.

**Determinants**, on the other hand, are **more difficult to compute, but give a very basic tool for understanding matrices**. Only in the special case of a \( 2\times 2 \) matrix is the determinant easy to compute,

\[ \begin{align} \mathbf{A} = \begin{pmatrix} a_1 & a_2 \\ a_3 & a_4 \end{pmatrix} & & \mathrm{det}\left(\mathbf{A}\right) = a_1*a_4 - a_2*a_3 \end{align} \]
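As a quick sketch, we can check this formula against R's built-in `det()` function:

```
A <- matrix(c(1, 3, 2, 4), nrow=2, ncol=2)

# the formula a_1 * a_4 - a_2 * a_3, computed by hand...
A[1,1] * A[2,2] - A[1,2] * A[2,1]

# ...agrees with the built-in determinant
det(A)
```

```
[1] -2
[1] -2
```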

We will suppress the general calculation of determinants and instead focus on one of the most useful properties for matrix analysis.

- We will briefly discuss why this is the case later as we introduce eigenvalues and matrix spectra.

The determinant in practice is often used to check the invertibility of a matrix \( \mathbf{A} \).

If \( \mathbf{A} \) is a square matrix, i.e., \( \mathbf{A} \in \mathbb{R}^{n\times n} \), then its **inverse is defined** by,

\[ \mathbf{A}^{-1} \mathbf{A} = \mathbf{A} \mathbf{A}^{-1} = \mathbf{I}_n \]

**if it actually even exists**. The following useful property can be used to determine if a matrix is invertible.

The matrix \( \mathbf{A} \) is invertible if and only if \( \mathrm{det}\left(\mathbf{A}\right) \neq 0 \); i.e., the inverse \( \mathbf{A}^{-1} \) exists exactly when \( \mathrm{det}\left(\mathbf{A}\right) \neq 0 \), and when \( \mathrm{det}\left(\mathbf{A}\right) = 0 \) there is no such inverse.

- We note that the **determinant and the inverse of a matrix \( \mathbf{A} \) can only be computed** when \( \mathbf{A} \) is **square in its dimensions**.

**Q:** suppose we randomly generate a matrix as follows:

```
set.seed(0)
my_matrix <- matrix(rnorm(16), nrow=4, ncol=4)
my_matrix
```

```
[,1] [,2] [,3] [,4]
[1,] 1.2629543 0.4146414 -0.005767173 -1.1476570
[2,] -0.3262334 -1.5399500 2.404653389 -0.2894616
[3,] 1.3297993 -0.9285670 0.763593461 -0.2992151
[4,] 1.2724293 -0.2947204 -0.799009249 -0.4115108
```

- Then suppose we compute the determinant as,

```
det(my_matrix)
```

```
[1] -2.429628
```

Can we say the inverse exists?

**A:** the determinant is non-zero, so an inverse exists.
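As a follow-up sketch, we can compute this inverse with R's `solve()` function and verify the defining property of the inverse:

```
# solve() with a single matrix argument returns the inverse
my_inverse <- solve(my_matrix)

# the product recovers the 4x4 identity, up to floating-point rounding
round(my_inverse %*% my_matrix, 10)
```

```
     [,1] [,2] [,3] [,4]
[1,]    1    0    0    0
[2,]    0    1    0    0
[3,]    0    0    1    0
[4,]    0    0    0    1
```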

- Now suppose that we define the following matrix,

```
A <- matrix(1:16, nrow=4, ncol=4)
A
```

```
[,1] [,2] [,3] [,4]
[1,] 1 5 9 13
[2,] 2 6 10 14
[3,] 3 7 11 15
[4,] 4 8 12 16
```

```
det(A)
```

```
[1] 0
```

- This matrix clearly does not have an inverse, but the reason why may not be obvious.

- Consider,

```
(A[,2] - A[,1])
```

```
[1] 4 4 4 4
```

```
(A[,3] - A[,4])
```

```
[1] -4 -4 -4 -4
```

```
(A[,2] - A[,1]) + (A[,3] - A[,4])
```

```
[1] 0 0 0 0
```

This says that there is a direct linear dependence between the columns of the matrix: the zero vector can be written as a non-trivial linear combination of the columns.

The maximal number of linearly independent columns (the rank of the matrix) can be computed as follows:

```
qr(A)$rank
```

```
[1] 2
```

- Then consider,

```
qr(my_matrix)$rank
```

```
[1] 4
```

Notice that the size of both `my_matrix` and `A` is \( 4\times 4 \), so we can say that `my_matrix` has a full set of linearly independent columns.

The **determinant function thus detects when there is a linear dependence** between the columns, and **gives a zero when there is a dependence**.

**Only square matrices with linearly independent columns (non-zero determinants) have inverses**.
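Consistent with this, attempting to invert the singular matrix `A` halts with an error in R; as a minimal sketch, we can capture the error message with `tryCatch()`:

```
# solve() raises an error for a singular (non-invertible) matrix;
# tryCatch() captures the error message instead of stopping the session
tryCatch(solve(A), error = function(e) conditionMessage(e))
```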