FAIR USE ACT DISCLAIMER: This site is for educational purposes only. This website may contain copyrighted material, the use of which has not been specifically authorized by the copyright holders. The material is made available on this website to advance teaching, and copyright-protected materials are used only to the extent necessary to make this class function in a distance-learning environment. Under section 107 of the Copyright Act of 1976, allowance is made for “fair use” for purposes such as criticism, comment, news reporting, teaching, scholarship, education, and research.
Recall again the sampling distribution for the sample mean;
we can identify with some probability how accurate our estimate for the true mean is.
The standard error is defined as the standard deviation of this sampling distribution of the statistic, i.e., \( \sigma_{\hat{\theta}} \).
The standard error of the sample mean thus measures the accuracy of our estimate of the mean.
We will recall how to construct confidence intervals for the mean of a normal distribution.
Courtesy of Härdle, W.K. et al. Basic Elements of Computational Statistics. Springer International Publishing, 2017.
Using the results from the last slide, we can say that for \( Z= \sqrt{n}\frac{\overline{X}_n -\mu}{\sigma} \),
\[ \begin{align} & P\left(-z_{1-\frac{\alpha}{2}}\leq Z \leq z_{1-\frac{\alpha}{2}}\right) = 1 - \alpha \\ \Leftrightarrow & P\left(-z_{1-\frac{\alpha}{2}}\leq \sqrt{n}\frac{\overline{X}_n -\mu}{\sigma} \leq z_{1-\frac{\alpha}{2}}\right) = 1 - \alpha \end{align} \]
We can rewrite the above interval as follows:
\[ \begin{align} \left(-z_{1-\frac{\alpha}{2}} \frac{\sigma}{\sqrt{n}}\leq \overline{X}_n -\mu \leq z_{1-\frac{\alpha}{2}}\frac{\sigma}{\sqrt{n}}\right) &= \left(-\overline{X}_n -z_{1-\frac{\alpha}{2}} \frac{\sigma}{\sqrt{n}}\leq -\mu \leq -\overline{X}_n+ z_{1-\frac{\alpha}{2}}\frac{\sigma}{\sqrt{n}}\right) \\ &= \left(\overline{X}_n -z_{1-\frac{\alpha}{2}} \frac{\sigma}{\sqrt{n}}\leq \mu \leq \overline{X}_n+ z_{1-\frac{\alpha}{2}}\frac{\sigma}{\sqrt{n}}\right) \end{align} \]
From the above statement, we can read that
Upon replication of a sample of size \( n \), the random interval \[ \left(\overline{X}_n -z_{1-\frac{\alpha}{2}} \frac{\sigma}{\sqrt{n}},\; \overline{X}_n+ z_{1-\frac{\alpha}{2}}\frac{\sigma}{\sqrt{n}}\right) \] has a probability of \( 1-\alpha \) of covering \( \mu \). In particular, for a given observed sample mean \( \overline{x}_n \), constructing a confidence interval as above will keep \( \overline{x}_n \) within a radius of \( z_{1-\frac{\alpha}{2}}\frac{\sigma}{\sqrt{n}} \) of the true value \( \mu \) \( (1-\alpha)\times 100\% \) of the time over infinite replications.
Courtesy of Montgomery & Runger, Applied Statistics and Probability for Engineers, 7th edition
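The covering statement above is easy to verify empirically. The following is a minimal sketch in R, where the true mean, the known standard deviation, and the sample size are all illustrative assumptions, not values from the slides:

```r
# Coverage check for the z-based interval with known sigma
# (mu, sigma, n, and alpha are illustrative assumptions)
set.seed(1)
mu <- 10; sigma <- 2; n <- 25; alpha <- 0.05

z <- qnorm(1 - alpha / 2)            # z_{1 - alpha/2} critical value
radius <- z * sigma / sqrt(n)        # half-width of the random interval

# Fraction of replicated intervals that cover the true mean mu
covers <- replicate(10000, {
  xbar <- mean(rnorm(n, mean = mu, sd = sigma))
  (xbar - radius <= mu) && (mu <= xbar + radius)
})
mean(covers)                         # close to 1 - alpha = 0.95
```

Over many replications, the proportion of intervals containing \( \mu \) settles near \( 1-\alpha \), as the derivation predicts.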
The issue with this approach to confidence intervals is that the true population value of \( \sigma \) is almost never known in any practical application.
For this reason, we turn again to Student's t-distribution.
Recall that we showed that for the sample mean of the normal random variables \( \overline{X}_n \); and
the sample standard deviation of the normal random variables \( S \),
\[ \frac{\overline{X}_n - \mu}{\frac{S}{\sqrt{n}}} \sim t_{n-1}. \]
Therefore, in practice we can construct the same type of random interval, but using the \( t_{\frac{\alpha}{2}} \) critical value of the \( t_{n-1} \) distribution, \[ \left(\overline{X}_n -t_{\frac{\alpha}{2}} \frac{S}{\sqrt{n}}\leq \mu \leq \overline{X}_n+ t_{\frac{\alpha}{2}}\frac{S}{\sqrt{n}}\right). \]
The above derivation is at the basis of practical confidence intervals for the population mean.
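To see this formula in action, we can build the t interval by hand and compare it with R's built-in t.test(). This is a sketch on simulated, purely illustrative data:

```r
# Hand-built t interval versus t.test()'s reported confidence interval
# (the data here are simulated and purely illustrative)
set.seed(2)
x <- rnorm(15, mean = 5, sd = 1)
n <- length(x); alpha <- 0.05

tcrit <- qt(1 - alpha / 2, df = n - 1)            # t_{alpha/2} critical value
ci_manual <- mean(x) + c(-1, 1) * tcrit * sd(x) / sqrt(n)
ci_builtin <- t.test(x, conf.level = 1 - alpha)$conf.int

ci_manual
as.numeric(ci_builtin)   # the two intervals agree
```

The agreement confirms that t.test() implements exactly the interval derived above.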
A related, dual concept is the hypothesis test for the mean.
Statistical Hypothesis
A statistical hypothesis is a statement about the parameters of one or more populations.
In hypothesis testing, the null and alternative hypotheses have special meanings philosophically and in the mathematics.
We cannot generally “prove” a hypothesis to be true;
Instead, we can only determine if a hypothesis seems unlikely enough to reject;
To begin such a test formally, we need to first make some assumption about the true parameter.
The null hypothesis \( H_0 \) will always take the form of an equality, or an inclusive inequality.
\[ \begin{align} H_0: & \theta \text{ is } (= / \leq / \geq) \text{ some proposed value}. \end{align} \]
The contradictory / competing hypothesis is the alternative hypothesis, written
\[ \begin{align} H_1: & \theta \text{ is } (\neq / > / <) \text{ some proposed value} \end{align} \]
Once we have formed a null and alternative hypothesis:
\[ \begin{align} H_0: & \theta \text{ is } (= / \leq / \geq) \text{ some proposed value}\\ H_1: & \theta \text{ is } (\neq / > / <) \text{ some proposed value} \end{align} \]
we use the sample data to consider how likely or unlikely it was to observe such data with the proposed parameter.
If the observed data are sufficiently unlikely under the null hypothesis, we reject the null hypothesis in favor of the alternative hypothesis.
However, if the evidence (the sample) doesn't contradict the null hypothesis, we tentatively keep this assumption.
Type I Error
Rejecting the null hypothesis \( H_0 \) when it is true is defined as a type I error.
Type II Error
Failing to reject the null hypothesis \( H_0 \) when it is false is defined as a type II error.
Courtesy of Montgomery & Runger, Applied Statistics and Probability for Engineers, 7th edition
Probability of Type I Error
\[ \alpha = P(\text{type I error}) = P(\text{reject }H_0\text{ when }H_0\text{ is true}) \]
Probability of Type II Error
\[ \beta = P(\text{type II error}) = P(\text{failing to reject }H_0\text{ when }H_0\text{ is false}). \] The complementary probability, \( 1- \beta \), is called the power of the hypothesis test.
To calculate \( \beta \), we must have a specific alternative hypothesis;
This is because the unknown, true alternative value of \( \mu \) determines the sampling distribution of \( \overline{X} \).
From this sampling distribution, we compute the appropriate probability for failing to reject our null hypothesis, given the true distribution with respect to the true alternative.
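This computation can be carried out directly. The sketch below computes \( \beta \) for a two-sided z-test (used instead of the t test for simplicity), with all numbers chosen as illustrative assumptions:

```r
# Computing beta for a two-sided z-test under an assumed specific alternative
# (a z-test is used for simplicity; all values are illustrative assumptions)
mu0 <- 130; mu1 <- 130.5               # null value and assumed true mean
sigma <- 0.88; n <- 20; alpha <- 0.05
se <- sigma / sqrt(n)
zcrit <- qnorm(1 - alpha / 2)

# P(Xbar lands in the acceptance region when the true mean is mu1)
beta <- pnorm(mu0 + zcrit * se, mean = mu1, sd = se) -
  pnorm(mu0 - zcrit * se, mean = mu1, sd = se)
beta
1 - beta                               # the power of the test
```

Note how \( \beta \) is computed as a probability under the alternative's sampling distribution, with the acceptance region still centered at the null value.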
Student’s t test can be used in R through the function t.test(), which will also report the dual confidence interval.
Specifically, if we have a formal hypothesis test
\[ \begin{align} H_0:\mu = \tilde{\mu} & & H_1: \mu \neq \tilde{\mu}; \end{align} \] and if the variance \( \sigma^2 \) is also unknown;
then assuming the null, we write the acceptance region as
\[ \left( \tilde{\mu} - \hat{\sigma}_\overline{X} t_\frac{\alpha}{2} , \tilde{\mu} + \hat{\sigma}_\overline{X} t_\frac{\alpha}{2}\right). \]
If the sample mean \( \overline{X} \) lies outside of the acceptance region, i.e., in the critical region,
\[ \left(-\infty, \tilde{\mu} - \hat{\sigma}_\overline{X} t_\frac{\alpha}{2}\right) \cup \left( \tilde{\mu} + \hat{\sigma}_\overline{X} t_\frac{\alpha}{2}, \infty\right), \]
we reject the null hypothesis at significance level \( \alpha \).
Alternatively, if the sample mean lies within the acceptance region, we fail to reject the null hypothesis at significance level \( \alpha \).
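This acceptance-region check is exactly equivalent to comparing the P value with \( \alpha \), which we can confirm on illustrative simulated data:

```r
# Acceptance-region check by hand, confirmed against t.test()'s P value
# (illustrative simulated data; mu_tilde is the hypothesized null mean)
set.seed(3)
x <- rnorm(12, mean = 99.5, sd = 2)
mu_tilde <- 100; alpha <- 0.05

se_hat <- sd(x) / sqrt(length(x))               # sigma-hat of the sample mean
tcrit <- qt(1 - alpha / 2, df = length(x) - 1)  # t_{alpha/2} critical value
accept <- c(mu_tilde - se_hat * tcrit, mu_tilde + se_hat * tcrit)

in_region <- accept[1] <= mean(x) && mean(x) <= accept[2]
in_region == (t.test(x, mu = mu_tilde)$p.value >= alpha)   # TRUE
```

Failing to reject whenever \( \overline{X} \) falls in the acceptance region is the same decision rule as failing to reject whenever \( P \geq \alpha \).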
The sodium content of twenty 300-gram boxes of organic cornflakes was determined.
The data (in milligrams) are as follows:
sodium_sample <- c(131.15, 130.69, 130.91, 129.54, 129.64, 128.77, 130.72, 128.33, 128.24, 129.65, 130.14, 129.29, 128.71, 129.00, 129.39, 130.42, 129.53, 130.12, 129.78, 130.92)
Let's suppose we want to test the hypothesis,
\[ \begin{align} H_0: \mu = 130 & & H_1:\mu \neq 130; \end{align} \]
If we use t.test() directly with its default arguments, notice in the output that it tests the null hypothesis \( \mu = 0 \):
t.test(sodium_sample)
One Sample t-test
data: sodium_sample
t = 662.06, df = 19, p-value < 2.2e-16
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
129.3368 130.1572
sample estimates:
mean of x
129.747
t.test(sodium_sample, mu=130, alternative="two.sided")
One Sample t-test
data: sodium_sample
t = -1.291, df = 19, p-value = 0.2122
alternative hypothesis: true mean is not equal to 130
95 percent confidence interval:
129.3368 130.1572
sample estimates:
mean of x
129.747
Notice that the above includes the test statistic \( t_0 = -1.291 \).
Most importantly, this lists the P value, \( \approx 0.2122 \).
If we take \( \alpha=0.05 \), a common convention, then since \( P> \alpha \), we fail to reject the null hypothesis that \( \mu = 130 \).
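The reported test statistic and P value can be reproduced directly from the definition of the t statistic:

```r
# Recomputing the test statistic and two-sided P value by hand
sodium_sample <- c(131.15, 130.69, 130.91, 129.54, 129.64, 128.77, 130.72,
                   128.33, 128.24, 129.65, 130.14, 129.29, 128.71, 129.00,
                   129.39, 130.42, 129.53, 130.12, 129.78, 130.92)
n  <- length(sodium_sample)
t0 <- (mean(sodium_sample) - 130) / (sd(sodium_sample) / sqrt(n))
p  <- 2 * pt(-abs(t0), df = n - 1)   # two tails of the t_{n-1} distribution
round(t0, 3)   # -1.291
round(p, 4)    # 0.2122
```

These match the t.test() output above, since the two-sided P value is twice the tail probability beyond \( |t_0| \).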
Suppose we wanted to perform a hypothesis test to make sure the mean sodium is not too high;
\[ \begin{align} H_0: \mu \leq 130 & & H_1:\mu >130, \end{align} \]
we would write in R
t.test(sodium_sample, mu=130, alternative="greater")
One Sample t-test
data: sodium_sample
t = -1.291, df = 19, p-value = 0.8939
alternative hypothesis: true mean is greater than 130
95 percent confidence interval:
129.4081 Inf
sample estimates:
mean of x
129.747
Computing the power of a t test or the sample size necessary for a hypothesis test to reach a certain power is complicated to perform analytically, and is more practically done with technology.
There is a built-in feature in R that will compute either the power of a test, or the needed sample size to attain a power, with the t test.
The function power.t.test() takes the following arguments:
power.t.test(n, delta, sd, sig.level, power, alternative, type="one.sample")
n is the sample size;
delta is the difference between the assumed, but untrue, null hypothesis and the unknown, but assumed true, alternative hypothesis;
sd is the sample standard deviation;
sig.level is the value of \( \alpha \);
power is the power of the test;
alternative is the alternative hypothesis; and
type="one.sample" as above.
When calling power.t.test(n, delta, sd, sig.level, power, alternative, type="one.sample"), we will actually leave out one of power or n as an argument.
The argument that is left out, power or n, will be computed from the other arguments.
We will continue our example with the sodium sample, now evaluating the power of our earlier tests
t.test(sodium_sample, mu=130, alternative="two.sided")
One Sample t-test
data: sodium_sample
t = -1.291, df = 19, p-value = 0.2122
alternative hypothesis: true mean is not equal to 130
95 percent confidence interval:
129.3368 130.1572
sample estimates:
mean of x
129.747
s <- sd(sodium_sample)
n <- length(sodium_sample)
mu_null <- 130.0
mu_alternative <- 130.5
Here we assume the true mean is actually \( \mu = 130.5 \), so that the null hypothesis \( \mu = 130 \) is false, and we wish to determine the power of the test to reject it.
We will leave the power argument as NULL in the function call, but we first need to calculate delta.
delta is given as the absolute difference between our false null hypothesis and the true alternative, i.e.,
delta <- abs(mu_null - mu_alternative)
delta
[1] 0.5
To calculate the power of the hypothesis test,
\[ \begin{align} H_0 : \mu = 130 & & H_1:\mu \neq 130 \end{align} \]
where we assume the true alternative hypothesis is \( H_1: \mu=130.5 \),
with a significance level of \( \alpha=0.05 \),
we can compute this at once with power.t.test() as follows:
power.t.test(n=n, delta=delta, sd=s, sig.level=0.05, power=NULL, type="one.sample")
One-sample t test power calculation
n = 20
delta = 0.5
sd = 0.8764288
sig.level = 0.05
power = 0.6775708
alternative = two.sided
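As a sanity check, this analytic power can be approximated by Monte Carlo simulation. The sketch below assumes the data really are normal with mean 130.5 and the sample standard deviation from above:

```r
# Monte Carlo check of the two-sided power computed by power.t.test()
# (assumes truly normal data with the stated mean and standard deviation)
set.seed(42)
rejects <- replicate(10000, {
  x <- rnorm(20, mean = 130.5, sd = 0.8764288)
  t.test(x, mu = 130)$p.value < 0.05
})
mean(rejects)   # close to the analytic power of about 0.678
```

The simulated rejection rate agrees with power.t.test()'s noncentral-t calculation to within Monte Carlo error.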
Suppose we want to calculate the power of the same type of hypothesis test, but with a different, one-sided alternative hypothesis:
\[ \begin{align} H_0:\mu \leq 130 & & H_1 :\mu > 130. \end{align} \]
We specify this in the function as,
power.t.test(n=n, delta=delta, sd=s, alternative="one.sided" , sig.level=0.05, power=NULL, type="one.sample")
One-sample t test power calculation
n = 20
delta = 0.5
sd = 0.8764288
sig.level = 0.05
power = 0.7921742
alternative = one.sided
On the other hand, suppose we need to find the sample size necessary to meet a certain power with one of the earlier hypothesis tests.
For example, we might want to reject the null hypothesis when the true mean sodium content is actually 130.1 milligrams, with the power of the test equal to 0.75.
To do so, we now omit the sample size argument n (setting it to NULL) and supply the power argument power.
The needed arguments are assigned below:
s <- sd(sodium_sample)
mu_null <- 130.0
mu_alternative <- 130.1
delta <- abs(mu_null - mu_alternative)
pow <- 0.75
power.t.test(n=NULL, delta=delta, sd=s, power=pow, type="one.sample")
One-sample t test power calculation
n = 535.0307
delta = 0.1
sd = 0.8764288
sig.level = 0.05
power = 0.75
alternative = two.sided
Since the sample size must be a whole number of boxes, we would round up to \( n = 536 \) to guarantee the desired power.
power.t.test(n=NULL, delta=delta, sd=s, alternative="one.sided", power=pow, type="one.sample")
One-sample t test power calculation
n = 414.5589
delta = 0.1
sd = 0.8764288
sig.level = 0.05
power = 0.75
alternative = one.sided
Rounding up again, the one-sided test requires only \( n = 415 \) boxes to attain the same power.