# Activity 09/14/2020
## STAT 757 -- Section 1001

Instructor: Colin Grudzien

## Instructions:
We will work through the following series of activities as a group and hold small group work and discussions in Zoom Breakout Rooms. Follow the instructions in each sub-section when the instructor assigns you to a breakout room.
## Activities:
### Activity 1: warmup on ideas from lecture
#### Question 1:
Suppose that $X$ is the number of past years of work experience and $Y$ is weekly average income in dollars for full-time, regular workers in the United States in 2019 in a simple regression model. Suppose that
in addition, $X=0$ is within the scope of the model. Is $\beta_0=0$ a reasonable assumption for the form of the signal? Justify why or why not with the implication for the regression model and, in particular, for the mean response.
#### Question 2:
Recall that in the viewing assignment we had shown,
$$\begin{align}
TSS&= ESS + RSS + 2\sum_{i=1}^n \left(\hat{Y}_i - \overline{Y}\right)\left(Y_i - \hat{Y}_i \right).
\end{align}$$
Recall the following two properties, which we discussed in earlier exercises:
- The sum of the residuals is equal to zero $\sum_{i=1}^n \hat{\epsilon}_i = 0$; and
- The sum of the residuals, weighted by the corresponding fitted value, is equal to zero $\sum_{i=1}^n \hat{Y}_i \hat{\epsilon}_i=0$.

Use the above two properties to prove that the sum of the cross terms at the top is equal to zero.
### Debrief from activity 1:
We will discuss the results of activity 1 as a class.
### Activity 2: degrees of freedom and sums of squares
#### Question 1:
Use the definition of the **unbiased, sample-based variance estimate** to recover the **regression estimate** for $\sigma^2$.
Recall, the denominator must divide by the degrees of freedom remaining after the *estimation of the mean*. Explain what each term corresponds to in the equation and where degrees of freedom are lost.
#### Question 2:
Suppose our estimated parameter value $\hat{\beta}_1=0$, but that the other assumptions of the standard model hold.
Recall the definitions,
$$\begin{align}
RSS = \sum_{i=1}^n \left(Y_i - \hat{Y}_i\right)^2 & &TSS = \sum_{i=1}^n \left(Y_i - \overline{Y}\right)^2
\end{align}$$
What is the implication for $R^2$?
#### Question 3:
Suppose that the following form of the regression equation is satisfied
$$\begin{align}
Y_i = \beta_0 + \epsilon_i.
\end{align}$$
Under the above condition, can you compute the expected value of the mean square
$$\begin{align}
\mathbb{E}\left[ \frac{TSS}{n-1} \right]?
\end{align}$$
### Debrief:
We will discuss the results of activity 2 as a class
### Activity 3: take home problem
#### Question 1:
Recall again the definition of the $TSS$,
$$TSS \triangleq \sum_{i=1}^n \left(Y_i - \overline{Y}\right)^2.$$ We will now consider in full generality, what is the value of the expectation of the mean square
$$\begin{align}
\mathbb{E}\left[\frac{TSS}{n-1}\right]?
\end{align}$$ Consider using the standard definition of the regression equation as a substitution for $\overline{Y}$ and $Y_i$.