Hypothesis testing concluded

05/03/2021



Outline

  • The following topics will be covered in this lecture:
    • General hypothesis testing of the mean

General hypothesis testing of the mean

  • When we studied confidence intervals, we distinguished between the cases where the population variance was known and where it was unknown.

  • In particular, when the variance was unknown, we had to use the sample estimate of the standard error,

    \[ \frac{s}{\sqrt{n}} \] in place of the true standard error in our calculations.

  • Under the assumption that the random sample \( X_i \) is normally distributed, we found that

    \[ T = \frac{\overline{X} - \mu}{\frac{s}{\sqrt{n}}} \] is t distributed with \( n-1 \) degrees of freedom.

  • Using the associated \( t_\frac{\alpha}{2} \) critical value, we calculated the \( (1-\alpha)\times 100\% \) confidence interval as

    \[ \left( \overline{x} - \hat{\sigma}_\overline{X} t_\frac{\alpha}{2} , \overline{x} + \hat{\sigma}_\overline{X} t_\frac{\alpha}{2}\right). \]

  • Following the same development as in the case where \( \sigma^2 \) is known, we can replace the \( z_\frac{\alpha}{2} \) and \( z_\alpha \) critical values with t critical values to produce general hypothesis tests.
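
  • As a minimal sketch of the interval above, the t critical value can be obtained with qt() and the result compared against t.test(). The sample x below is hypothetical and is used only for illustration.

```r
# Hypothetical sample, used only for illustration (not from the lecture)
x <- c(9.8, 10.4, 10.1, 9.6, 10.3, 10.0, 9.9, 10.5)

n     <- length(x)
xbar  <- mean(x)
se    <- sd(x) / sqrt(n)   # estimated standard error s / sqrt(n)
alpha <- 0.05

# Upper-tail critical value t_{alpha/2} with n - 1 degrees of freedom
t_crit <- qt(1 - alpha / 2, df = n - 1)

# Confidence interval computed by hand
ci_manual <- c(xbar - se * t_crit, xbar + se * t_crit)

# The same interval as reported by t.test()
ci_builtin <- as.numeric(t.test(x, conf.level = 1 - alpha)$conf.int)

print(ci_manual)
print(ci_builtin)
```

  • The two printed intervals should agree, since t.test() uses the same t critical value internally.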

General hypothesis testing of the mean

  • Specifically, if we have a formal hypothesis test

    \[ \begin{align} H_0:\mu = \tilde{\mu} & & H_1: \mu \neq \tilde{\mu}; \end{align} \] and if the variance \( \sigma^2 \) is also unknown;

  • then assuming the null, we write the acceptance region as

    \[ \left( \tilde{\mu} - \hat{\sigma}_\overline{X} t_\frac{\alpha}{2} , \tilde{\mu} + \hat{\sigma}_\overline{X} t_\frac{\alpha}{2}\right). \]

  • If the sample mean \( \overline{X} \) lies outside of the acceptance region, i.e., in the critical region,

    \[ \left(-\infty, \tilde{\mu} - \hat{\sigma}_\overline{X} t_\frac{\alpha}{2}\right) \cup \left( \tilde{\mu} + \hat{\sigma}_\overline{X} t_\frac{\alpha}{2}, \infty\right), \]

    • we reject the null hypothesis with \( \alpha \times 100\% \) significance.
  • Alternatively, if the sample mean lies within the acceptance region, we fail to reject the null hypothesis with \( \alpha\times 100\% \) significance.
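
  • The decision rule above can be sketched in R as follows. The sample x and the null value mu_tilde are hypothetical, chosen only to illustrate the calculation.

```r
# Hypothetical sample and null value, used only for illustration
x        <- c(9.8, 10.4, 10.1, 9.6, 10.3, 10.0, 9.9, 10.5)
mu_tilde <- 10.5
alpha    <- 0.05

n      <- length(x)
xbar   <- mean(x)
se     <- sd(x) / sqrt(n)                 # estimated standard error
t_crit <- qt(1 - alpha / 2, df = n - 1)   # t_{alpha/2} critical value

# Acceptance region, centered at the null value
acceptance <- c(mu_tilde - se * t_crit, mu_tilde + se * t_crit)

# Reject the null when the sample mean falls in the critical region
reject_null <- xbar < acceptance[1] || xbar > acceptance[2]

# Equivalent rule stated on the standardized scale
reject_null_standardized <- abs((xbar - mu_tilde) / se) > t_crit

print(acceptance)
print(reject_null)
```

  • Checking whether the sample mean lies outside the acceptance region is the same as checking whether the standardized distance from the null value exceeds the critical value.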

General hypothesis testing of the mean

  • We similarly will compute the P value in the case where \( \sigma^2 \) is unknown.

  • Our test statistic will be the value,

    \[ \begin{align} t_0 = \frac{\overline{X} - \tilde{\mu}}{\frac{s}{\sqrt{n}}} \end{align} \]

  • Using the t distribution with \( n-1 \) degrees of freedom as our probability model, we calculate the probability of observing a value at least as extreme as \( t_0 \).

  • If the P value falls below \( \alpha \), we can reject the null hypothesis at \( \alpha\times 100\% \) significance.

  • This is how the t.test() function computes both the confidence interval and the hypothesis test simultaneously.

  • In particular, it will assume a null hypothesis of \( H_0: \mu = 0 \) by default.

    • This default value is changed with the mu argument, and the alternative hypothesis is set with the alternative argument.
  • We will now begin to consider how to use this function more generally.
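
  • As a sketch of the P value calculation described above, the test statistic and the two-sided P value can be computed by hand with pt() and compared with t.test(). The sample x and null value mu_tilde are hypothetical.

```r
# Hypothetical sample and null value, used only for illustration
x        <- c(9.8, 10.4, 10.1, 9.6, 10.3, 10.0, 9.9, 10.5)
mu_tilde <- 10

n  <- length(x)
t0 <- (mean(x) - mu_tilde) / (sd(x) / sqrt(n))   # test statistic

# Two-sided P value: probability of a value at least as extreme as t0
p_manual <- 2 * pt(abs(t0), df = n - 1, lower.tail = FALSE)

# The same quantities as reported by t.test()
res <- t.test(x, mu = mu_tilde)
print(c(t0 = t0, t_builtin = unname(res$statistic)))
print(c(p_manual = p_manual, p_builtin = res$p.value))

# Without the mu argument, the null value defaults to 0
default_null <- unname(t.test(x)$null.value)
print(default_null)
```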

General hypothesis testing of the mean – examples

  • The sodium content of twenty 300-gram boxes of organic cornflakes was determined.

  • The data (in milligrams) are as follows:

sodium_sample <- c(131.15, 130.69, 130.91, 129.54, 129.64, 128.77, 130.72, 128.33, 128.24, 129.65, 130.14, 129.29, 128.71, 129.00, 129.39, 130.42, 129.53, 130.12, 129.78, 130.92)
  • Let's suppose we want to test the hypothesis,

    \[ \begin{align} H_0: \mu = 130 & & H_1:\mu \neq 130; \end{align} \]

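  • If we call t.test() with only the data, it tests the default null hypothesis \( H_0: \mu = 0 \), which is not the hypothesis we want; notice the output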

t.test(sodium_sample)

    One Sample t-test

data:  sodium_sample
t = 662.06, df = 19, p-value < 2.2e-16
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
 129.3368 130.1572
sample estimates:
mean of x 
  129.747 

General hypothesis testing of the mean – examples

  • Rather, to set the correct null and alternative hypothesis, we write,
t.test(sodium_sample, mu=130, alternative="two.sided")

    One Sample t-test

data:  sodium_sample
t = -1.291, df = 19, p-value = 0.2122
alternative hypothesis: true mean is not equal to 130
95 percent confidence interval:
 129.3368 130.1572
sample estimates:
mean of x 
  129.747 
  • Notice that the above includes the test statistic \( t_0 = -1.291 \).

    • This also lists the number of degrees of freedom \( df= n-1 = 19 \) for the t distribution.
  • Most importantly, this lists the P value, \( \approx 0.2122 \).

  • If we take \( \alpha=0.05 \), a common convention, then \( P> \alpha \), so we fail to reject the null hypothesis of \( \mu = 130 \).
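
  • Rather than reading these values off the printed output, they can also be extracted from the object that t.test() returns. The following is a small sketch; the variable names res, t0, df, p and decision are chosen here for illustration.

```r
sodium_sample <- c(131.15, 130.69, 130.91, 129.54, 129.64, 128.77, 130.72,
                   128.33, 128.24, 129.65, 130.14, 129.29, 128.71, 129.00,
                   129.39, 130.42, 129.53, 130.12, 129.78, 130.92)

res <- t.test(sodium_sample, mu = 130, alternative = "two.sided")

t0 <- unname(res$statistic)   # test statistic, approximately -1.291
df <- unname(res$parameter)   # degrees of freedom, n - 1 = 19
p  <- res$p.value             # P value, approximately 0.2122

# Compare the P value with the chosen significance level
alpha    <- 0.05
decision <- if (p < alpha) "reject H0" else "fail to reject H0"
print(decision)
```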

General hypothesis testing of the mean – examples

  • Suppose we wanted to perform a hypothesis test to make sure the mean sodium is not too high;

    • if we wanted to evaluate the one-sided hypothesis test

    \[ \begin{align} H_0: \mu \leq 130 & & H_1:\mu >130, \end{align} \]

  • we would write in R

t.test(sodium_sample, mu=130, alternative="greater")

    One Sample t-test

data:  sodium_sample
t = -1.291, df = 19, p-value = 0.8939
alternative hypothesis: true mean is greater than 130
95 percent confidence interval:
 129.4081      Inf
sample estimates:
mean of x 
  129.747 
  • Here, we once again fail to reject the null hypothesis at \( \alpha\times 100\% = 5\% \) significance, as \( P\approx 0.89 \).
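
  • As a sketch of where the one-sided P value comes from, it is the upper-tail probability of the same test statistic. The opposite one-sided test (alternative="less") is included only to illustrate that the two one-sided P values sum to one.

```r
sodium_sample <- c(131.15, 130.69, 130.91, 129.54, 129.64, 128.77, 130.72,
                   128.33, 128.24, 129.65, 130.14, 129.29, 128.71, 129.00,
                   129.39, 130.42, 129.53, 130.12, 129.78, 130.92)

res_greater <- t.test(sodium_sample, mu = 130, alternative = "greater")
t0 <- unname(res_greater$statistic)
df <- unname(res_greater$parameter)

# Upper-tail probability P(T >= t0) under the null
p_greater_manual <- pt(t0, df = df, lower.tail = FALSE)

# Lower-tail test on the same data
p_less <- t.test(sodium_sample, mu = 130, alternative = "less")$p.value

print(c(manual = p_greater_manual, builtin = res_greater$p.value))
print(p_greater_manual + p_less)
```

  • Because the sample mean lies below 130, the upper-tail probability is large, which is why this test gives no evidence that the mean sodium content is too high.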

General hypothesis testing of the mean – examples

  • Computing the power of a t test is about the same as if we were computing the power with a normal distribution, replacing the appropriate z / t critical values and probabilities as necessary.

  • However, computing sample size necessary for a hypothesis test to reach a certain power is very complicated, and is much more easily done with technology.

  • Moreover, there is a built-in feature in R that will compute either the power of a test, or the needed sample size to attain a power, with the t test.

  • The power.t.test() function takes the following arguments

power.t.test(n, delta, sd, sig.level, power, alternative, type="one.sample")
  • where
    • n is the sample size
    • delta is the difference between the assumed, but untrue, null hypothesis and the unknown, but assumed true, alternative hypothesis;
    • sd is the sample standard deviation;
    • sig.level is the value of \( \alpha \);
    • power is the power of the test;
    • alternative is the alternative hypothesis, either "two.sided" or "one.sided"; and
    • we need to specify the type="one.sample" as above.

General hypothesis testing of the mean – examples

  • When we call power.t.test(),
power.t.test(n, delta, sd, sig.level, power, alternative, type="one.sample")
  • we will actually leave out one of power or n as an argument.

  • The argument that is left out, power or n, will be computed from the other arguments.

  • We will continue our example with the sodium sample, now evaluating the power of our earlier tests

t.test(sodium_sample, mu=130, alternative="two.sided")

    One Sample t-test

data:  sodium_sample
t = -1.291, df = 19, p-value = 0.2122
alternative hypothesis: true mean is not equal to 130
95 percent confidence interval:
 129.3368 130.1572
sample estimates:
mean of x 
  129.747 

General hypothesis testing of the mean – examples

  • In our sodium sample example, we had
s <- sd(sodium_sample)
n <- length(sodium_sample)
mu_null <- 130.0
  • Suppose we have a specific value for the alternative hypothesis in mind, i.e.,
mu_alternative <- 130.5
  • and we wish to determine the power of the test to reject the false null hypothesis.

  • We will leave the power argument blank in the function, but we need to calculate delta.

  • delta is given as the absolute difference between our false null hypothesis, and the true alternative, i.e.,

delta <- abs(mu_null - mu_alternative)
delta
[1] 0.5

General hypothesis testing of the mean – examples

  • To calculate the power of the hypothesis test,

    \[ \begin{align} H_0 : \mu = 130 & & H_1:\mu \neq 130 \end{align} \]

  • where we assume the true alternative hypothesis is \( H_1: \mu=130.5 \),

  • with a significance level of \( \alpha=0.05 \),

  • we can compute this at once with power.t.test() as:

power.t.test(n=n, delta=delta, sd=s, sig.level=0.05, power=NULL, type="one.sample")

     One-sample t test power calculation 

              n = 20
          delta = 0.5
             sd = 0.8764288
      sig.level = 0.05
          power = 0.6775708
    alternative = two.sided
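
  • As a sketch of what happens under the hood, this power can be reproduced with the non-central t distribution. This assumes the default setting strict=FALSE, in which power.t.test() counts only rejections in the tail on the side of the true alternative.

```r
sodium_sample <- c(131.15, 130.69, 130.91, 129.54, 129.64, 128.77, 130.72,
                   128.33, 128.24, 129.65, 130.14, 129.29, 128.71, 129.00,
                   129.39, 130.42, 129.53, 130.12, 129.78, 130.92)

s     <- sd(sodium_sample)
n     <- length(sodium_sample)
delta <- 0.5
alpha <- 0.05

df     <- n - 1
ncp    <- sqrt(n) * delta / s                    # non-centrality parameter
t_crit <- qt(alpha / 2, df, lower.tail = FALSE)  # two-sided critical value

# Probability that the test statistic exceeds the critical value
# when the true mean differs from the null by delta
power_manual <- pt(t_crit, df, ncp = ncp, lower.tail = FALSE)

power_builtin <- power.t.test(n = n, delta = delta, sd = s,
                              sig.level = alpha, type = "one.sample")$power

print(c(manual = power_manual, builtin = power_builtin))
```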

General hypothesis testing of the mean – examples

  • Suppose we want to calculate power of the same type of hypothesis test, but with a different, one-sided alternative hypothesis.

    • e.g.,

    \[ \begin{align} H_0:\mu \leq 130 & & H_1 :\mu > 130. \end{align} \]

  • We specify this in the function as,

power.t.test(n=n, delta=delta, sd=s, alternative="one.sided" , sig.level=0.05, power=NULL, type="one.sample")

     One-sample t test power calculation 

              n = 20
          delta = 0.5
             sd = 0.8764288
      sig.level = 0.05
          power = 0.7921742
    alternative = one.sided
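
  • To compare the two alternatives side by side, one might tabulate the power over several values of delta. The grid deltas and the helper power_for() below are hypothetical and introduced only for this sketch.

```r
sodium_sample <- c(131.15, 130.69, 130.91, 129.54, 129.64, 128.77, 130.72,
                   128.33, 128.24, 129.65, 130.14, 129.29, 128.71, 129.00,
                   129.39, 130.42, 129.53, 130.12, 129.78, 130.92)

s <- sd(sodium_sample)
n <- length(sodium_sample)

# Hypothetical grid of differences between the null and the true mean
deltas <- c(0.1, 0.25, 0.5, 0.75)

# Helper (name chosen for illustration): power for one delta and alternative
power_for <- function(delta, alternative) {
  power.t.test(n = n, delta = delta, sd = s, sig.level = 0.05,
               alternative = alternative, type = "one.sample")$power
}

power_table <- data.frame(
  delta     = deltas,
  two_sided = sapply(deltas, power_for, alternative = "two.sided"),
  one_sided = sapply(deltas, power_for, alternative = "one.sided")
)
print(power_table)
```

  • The row for delta = 0.5 should reproduce the two power values computed earlier, and the power grows as the true mean moves farther from the null.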

General hypothesis testing of the mean – examples

  • On the other hand, suppose we need to find the sample size necessary to meet a certain power with one of the earlier hypothesis tests.

  • E.g., we might want to reject the null when the true mean sodium content is actually 130.1 milligrams, with the power of the test equal to 0.75.

  • To do so, we now need to omit the sample size argument n and supply the power argument power.

  • The needed arguments are assigned below:

s <- sd(sodium_sample)
mu_null <- 130.0
mu_alternative <- 130.1
delta <- abs(mu_null - mu_alternative)
pow <- 0.75

General hypothesis testing of the mean – examples

  • We determine the appropriate sample size via
power.t.test(n=NULL, delta=delta, sd=s, power=pow, type="one.sample")

     One-sample t test power calculation 

              n = 535.0307
          delta = 0.1
             sd = 0.8764288
      sig.level = 0.05
          power = 0.75
    alternative = two.sided
  • for the two-sided test, or for the one-sided test we use
power.t.test(n=NULL, delta=delta, sd=s, alternative="one.sided", power=pow, type="one.sample")

     One-sample t test power calculation 

              n = 414.5589
          delta = 0.1
             sd = 0.8764288
      sig.level = 0.05
          power = 0.75
    alternative = one.sided
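
  • Since a sample size must be a whole number, the reported n is rounded up in practice. The following sketch extracts n from the returned object, rounds it up with ceiling(), and checks the power attained at that sample size; the variable names are chosen here for illustration.

```r
sodium_sample <- c(131.15, 130.69, 130.91, 129.54, 129.64, 128.77, 130.72,
                   128.33, 128.24, 129.65, 130.14, 129.29, 128.71, 129.00,
                   129.39, 130.42, 129.53, 130.12, 129.78, 130.92)

s     <- sd(sodium_sample)
delta <- abs(130.0 - 130.1)
pow   <- 0.75

# Required sample sizes, rounded up to whole observations
n_two_sided <- ceiling(power.t.test(n = NULL, delta = delta, sd = s,
                                    power = pow, type = "one.sample")$n)
n_one_sided <- ceiling(power.t.test(n = NULL, delta = delta, sd = s,
                                    alternative = "one.sided",
                                    power = pow, type = "one.sample")$n)
print(c(two_sided = n_two_sided, one_sided = n_one_sided))

# Power actually attained with the rounded-up two-sided sample size
attained <- power.t.test(n = n_two_sided, delta = delta, sd = s,
                         type = "one.sample")$power
print(attained)
```

  • Rounding up rather than to the nearest integer ensures the attained power is at least the requested 0.75.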

Summary

  • This covers all the standard tools for general hypothesis testing and confidence intervals for the mean;

    • the last homework exercises and the final project questions can be solved with these tools.
  • Other hypothesis tests, such as a test on the standard deviation,

\[ \begin{align} H_0 :\sigma = \tilde{\sigma} & & H_1: \sigma \neq \tilde{\sigma} \end{align} \] can be performed very similarly in R, using built-in functions.

  • For further hypothesis testing in the future, we recommend consulting the appropriate R documentation for how to use these functions.

  • The interpretations of the P value, level of significance, and power of the test all carry over;

    • the specific distribution / test statistic, however, will change in general.
  • In any case, the test statistic, P value and power are all computed “under-the-hood” and it is more important that you can comfortably interpret these outputs.

  • The objective for this course was to give all students a working toolbox for standard statistical analysis and inference.

  • Specifically, we learned the underlying theory of statistical inference, and how to make such an analysis in the modern environment of R Markdown.

  • Everyone is encouraged to continue building their skills in this framework;

    • for a more detailed class on statistical computation, consider STAT 445 in the fall.