# Activity 09/16/2020 ## STAT 757 -- Section 1001
Instructor: Colin Grudzien
## Instructions: We will work through the following series of activities as a group and hold small group work and discussions in Zoom Breakout Rooms. Follow the instructions in each sub-section when the instructor assigns you to a breakout room. ## Activities: ### Activity 1: basic properties of matrices #### Question 1: In this problem, we will only assume the following: $$n, p \geq 2$$ and let $\mathbf{X}\in \mathbb{R}^{n \times p}$. Define $\mathbf{X}^\dagger = \left(\mathbf{X}^\mathrm{T} \mathbf{X}\right)^{-1} \mathbf{X}^\mathrm{T}$. Carefully compute the value of the product $\mathbf{X}^\dagger \mathbf{X}.$ #### Question 2: We discussed that a projection operator $\Pi$ has the property in general that it is idempotent, i.e., $\Pi^2 = \Pi$. In the special case of the hat matrix, $$\mathbf{H}= \mathbf{X}\left(\mathbf{X}^\mathrm{T} \mathbf{X}\right)^{-1} \mathbf{X}^\mathrm{T},$$ show that $H^2=H$. Then use this fact to show that for the complementary orthogonal projection $$\left(\mathbf{I}-\mathbf{H}\right)^2 = \left(\mathbf{I} - \mathbf{H}\right)$$. ### Debrief: We will discuss the result of activity 1 as a class. ### Activity 2: #### Question 1: What is the geometric meaning of the statements \begin{align} \sum_{i=1}^n \hat{\epsilon}_i X_i &= 0 \\ \sum_{i=1}^n \hat{\epsilon}_i\hat{Y}_i &= 0? \end{align} How does this relate to the column span of the design matrix? #### Question 2: Suppose we are given $n=2$ cases of the data $\left\{\left(X_i,Y_i\right)\right\}_{i=1}^2$. Construct the design matrix for the standard simple regression problem. Do you anticipate issues with performing regression analysis in this data set? Explain why or why not. ### Debrief: We will discuss the results of activity 1 as a class. ### Activity 3: #### Question 1: Use the standard model in matrix form and the definition of the least squares estimated parameters, \begin{align} \mathbf{Y} = \mathbf{X} \boldsymbol{\beta} + \boldsymbol{\epsilon} & & \hat{\boldsymbol{\beta}} \triangleq \left(\mathbf{X}^\mathrm{T} \mathbf{X} \right)^{-1} \mathbf{X}^\mathrm{T}\mathbf{Y}, \end{align} to prove that $\hat{\boldsymbol{\beta}}$ is an unbiased estimate of $\boldsymbol{\beta}$. Furthermore, use the definition of the covariance of a random vector, \begin{align} cov(\mathbf{Y})= \mathbb{E}\left\{\left(\mathbf{Y} - \mathbb{E}\left[\mathbf{Y}\right]\right)\left(\mathbf{Y} - \mathbb{E}\left[\mathbf{Y}\right]\right)^\mathrm{T} \right\}, \end{align} to derive the covariance of this estimate.