# Activity 09/14/2020 ## STAT 757 -- Section 1001
Instructor: Colin Grudzien
## Instructions: We will work through the following series of activities as a group and hold small group work and discussions in Zoom Breakout Rooms. Follow the instructions in each sub-section when the instructor assigns you to a breakout room. ## Activities: ### Activity 1: warmup on ideas from lecture #### Question 1: Suppose that $X$ is the number of past years of work experience and $Y$ is weekly average income in dollars for full-time, regular workers in the United States in 2019 in a simple regression model. Suppose that in addition, $X=0$ is within the scope of the model. Is $\beta_0=0$ a reasonable assumption for the form of the signal? Justify why or why not with the implication for the regression model and, in particular, for the mean response. #### Question 2: Recall that in the viewing assignment we had shown, \begin{align} TSS&= ESS + RSS + 2\sum_{i=1}^n \left(\hat{Y}_i - \overline{Y}\right)\left(Y_i - \hat{Y}_i \right). \end{align} Recall the following two properties, which we discussed in earlier exercises:
1. The sum of the residuals is equal to zero $\sum_{i=1}^n \hat{\epsilon}_i = 0$; and
2. The sum of the residuals, weighted by the corresponding fitted value, is equal to zero $\sum_{i=1}^n \hat{Y}_i \hat{\epsilon}_i=0$.
Use the above two properties to prove that the sum of the cross terms at the top is equal to zero. ### Debrief from activity 1: We will discuss the results of activity 1 as a class. ### Activity 2: degrees of freedom and sums of squares #### Question 1: Use the definition of the unbiased, sample-based variance estimate to recover the regression estimate for $\sigma^2$. Recall, the denominator must divide by the degrees of freedom remaining after the *estimation of the mean*. Explain what each term corresponds to in the equation and where degrees of freedom are lost. #### Question 2: Suppose our estimated parameter value $\hat{\beta}_1=0$, but that the other assumptions of the standard model hold. Recall the definitions, \begin{align} RSS = \sum_{i=1}^n \left(Y_i - \hat{Y}_i\right)^2 & &TSS = \sum_{i=1}^n \left(Y_i - \overline{Y}\right)^2 \end{align} What is the implication for $R^2$? #### Question 3: Suppose that the following form of the regression equation is satisfied \begin{align} Y_i = \beta_0 + \epsilon_i. \end{align} Under the above condition, can you compute the expected value of the mean square \begin{align} \mathbb{E}\left[ \frac{TSS}{n-1} \right]? \end{align} ### Debrief: We will discuss the results of activity 2 as a class ### Activity 3: take home problem #### Question 1: Recall again the definition of the $TSS$, $$TSS \triangleq \sum_{i=1}^n \left(Y_i - \overline{Y}\right)^2.$$ We will now consider in full generality, what is the value of the expectation of the mean square \begin{align} \mathbb{E}\left[\frac{TSS}{n-1}\right]? \end{align} Consider using the standard definition of the regression equation as a substitution for $\overline{Y}$ and $Y_i$.