04/14/2021
The following topics will be covered in this lecture:
In practice, we almost never know the true population standard deviation \( \sigma \) and we must use the sample standard deviation \( s \) as a point estimate.
Our standard error estimate is \( \hat{\sigma}_\overline{X}= \frac{s}{\sqrt{n}} \), and this will be utilized for a more general construction of confidence intervals.
If we have a large sample size, with \( n>40 \), we can use this estimate of the standard error effectively within the confidence interval as follows.
Large-Sample Confidence Interval on the Mean
When \( n \) is large, the quantity \[ \frac{\overline{X} - \mu}{\frac{s}{\sqrt{n}}} \] has an approximate standard normal distribution. Consequently, \[ \overline{x} - z_\frac{\alpha}{2} \frac{s}{\sqrt{n}} \leq \mu \leq \overline{x} + z_\frac{\alpha}{2} \frac{s}{\sqrt{n}} \] is a large-sample confidence interval for \( \mu \), with confidence level of approximately \( (1-\alpha)\times 100\% \).
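As an illustration, the large-sample interval above can be computed in R along these lines (a minimal sketch; the data vector `x` is simulated from a skewed population purely for illustration):

```r
# Large-sample (n > 40) confidence interval using s as an estimate of sigma.
# The sample below is simulated from a skewed population for illustration.
set.seed(1)
x <- rexp(100, rate = 0.5)
n <- length(x)
x_bar <- mean(x)
se <- sd(x) / sqrt(n)            # estimated standard error s / sqrt(n)
z_alpha_over_2 <- qnorm(0.975)   # critical value for a 95% interval
ci <- c(x_bar - z_alpha_over_2 * se, x_bar + z_alpha_over_2 * se)
ci
```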
This is another application of the central limit theorem: for large \( n \), the form of the underlying population distribution does not matter.
However, when the sample is small and \( \sigma^2 \) is unknown, we must make an assumption about the form of the underlying distribution to obtain a valid CI procedure.
A reasonable assumption in many cases is that the underlying distribution is normal.
Many populations encountered in practice are well approximated by the normal distribution, so this assumption will lead to confidence interval procedures of wide applicability.
In fact, moderate departure from normality will have little effect on validity.
When the assumption is unreasonable, an alternative is to use nonparametric statistical procedures that are valid for any underlying distribution.
Suppose that the population of interest has a normal distribution with unknown mean \( \mu \) and unknown variance \( \sigma^2 \).
Assume that a random sample of size \( n \), say, \( X_1, X_2 , \cdots , X_n \), is available, and let \( \overline{X} \) and \( S^2 \) be the sample mean and variance, respectively.
We wish to construct a two-sided CI on \( \mu \). Recall that if the variance \( \sigma^2 \) were known, \[ Z = \frac{\overline{X} - \mu}{\frac{\sigma}{\sqrt{n}}} \] would have a standard normal distribution.
When \( \sigma \) is unknown, we replace it with the sample standard deviation \( S \), and the random variable \( Z \) becomes \[ T = \frac{\overline{X} - \mu}{\frac{S}{\sqrt{n}}}. \]
For the random variable \[ T = \frac{\overline{X} - \mu}{\frac{S}{\sqrt{n}}}, \] the natural questions are: what is its distribution, and how does it differ from that of \( Z \)?
If \( n \) is large, the distribution differs very little from the standard normal by the central limit theorem.
However, \( n \) is usually small in most engineering problems, and in this situation, a different distribution must be employed to construct the CI.
The pdf of the t-distribution is
\[ \begin{align} f(t; n) = \frac{\Gamma\left(\frac{n+1}{2}\right)}{\sqrt{\pi n}\, \Gamma\left(\frac{n}{2}\right)\left(1 + \frac{t^2}{n}\right)^{\frac{n+1}{2}}} \end{align} \] where the Gamma function \( \Gamma \) is a “special function” and \( n \) is the number of degrees of freedom.
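As a sanity check, the pdf formula above can be evaluated directly with R's `gamma()` function and compared against the built-in density `dt()`:

```r
# Evaluate the t pdf formula manually and compare with dt() for n = 5 df.
n <- 5
t_vals <- c(-2, 0, 1.5)
f_manual <- gamma((n + 1) / 2) /
  (sqrt(pi * n) * gamma(n / 2) * (1 + t_vals^2 / n)^((n + 1) / 2))
all.equal(f_manual, dt(t_vals, df = n))  # TRUE: the formula matches dt()
```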
We plot the density below for \( n=1 \), \( n=2 \), and \( n=5 \) degrees of freedom, with the standard normal density plotted for reference.
par(cex = 2.0, mar = c(5, 4, 4, 2) + 0.3)
t = seq(-5, 5, length = 300)
colors = c("black", "red", "green")
df = c(1, 2, 5) # degrees of freedom (df) for the t-distribution
plot(t, dnorm(t, 0, 1), xlab = "t", ylab = "pdf", type = "l", lwd = 2, col="blue")
for (i in 1:3) { lines(t, dt(t, df[i]), col = colors[i]) }
The degrees of freedom determine the shape of the Student's t distribution.
For \( n > 2 \) degrees of freedom, the mean and variance of Student’s t-distribution are
\[ \begin{align} \mu_T= 0 & & \sigma_T^2 = \frac{n}{n-2} \end{align} \]
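These moments can be checked numerically, for example by integrating \( t^2 f(t;n) \) with R's `integrate()` (a quick sketch for \( n = 5 \)):

```r
# Numerically verify Var(T) = n / (n - 2) for n = 5 degrees of freedom.
# Since the mean is 0, the variance equals the second moment E[T^2].
n <- 5
second_moment <- integrate(function(t) t^2 * dt(t, df = n), -Inf, Inf)$value
c(numerical = second_moment, formula = n / (n - 2))
```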
As \( n\rightarrow \infty \), the Student's t distribution approaches, and eventually converges to, the standard normal.
par(cex = 2.0, mar = c(5, 4, 4, 2) + 0.3)
t = seq(-5, 5, length = 300)
colors = c("black", "red", "green")
df = c(10, 100, 1000) # degrees of freedom (df) for the t-distribution
plot(t, dnorm(t, 0, 1), xlab = "t", ylab = "pdf", type = "l", lwd = 2, col="blue")
for (i in 1:3) { lines(t, dt(t, df[i]), col = colors[i]) }
The quantiles of a t-distributed random variable \( T \) are denoted by \( t_p \), and, due to symmetry, \( t_p = -t_{1-p} \).
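This symmetry is easy to confirm with the quantile function `qt()`:

```r
# t_p = -t_{1-p}: the 0.975 quantile equals minus the 0.025 quantile.
df <- 10
qt(0.975, df)
all.equal(qt(0.975, df), -qt(0.025, df))  # TRUE by symmetry
```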
In R the generic functions for the t distribution are the following:
dt(x, df)
is the probability density function of the t distribution with df degrees of freedom.
pt(q, df)
is the cumulative distribution function of the t distribution with df degrees of freedom.
rt(n, df)
randomly generates a sample of size n from the t distribution with df degrees of freedom.
qt(p, df)
is the quantile function of the t distribution with df degrees of freedom.
With these generic functions for the t distribution, we can compute the Student's t confidence interval almost identically to how we computed the normal confidence interval.
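A quick round trip through these generics (with an arbitrary choice of 19 degrees of freedom) shows how they fit together:

```r
# pt() and qt() are inverses; dt() gives the density; rt() draws samples.
df <- 19
q95 <- qt(0.95, df)     # 95th percentile
pt(q95, df)             # recovers 0.95
dt(0, df)               # density at the center of the distribution
set.seed(42)
draws <- rt(5, df)      # a random sample of size 5
draws
```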
Confidence Interval on the Mean, Variance Unknown
If \( \overline{x} \) and \( s \) are the mean and standard deviation of a random sample of size \( n \) from a normal distribution with unknown variance \( \sigma^2 \), then a \( (1-\alpha)\times 100\% \) confidence interval on \( \mu \) is given by \[ \begin{align} &\overline{x} - \hat{\sigma}_\overline{X} t_\frac{\alpha}{2} \leq \mu \leq \overline{x} + \hat{\sigma}_\overline{X} t_\frac{\alpha}{2}\\ \Leftrightarrow&\overline{x} - \frac{s}{\sqrt{n}} t_\frac{\alpha}{2} \leq \mu \leq \overline{x} + \frac{s}{\sqrt{n}}t_\frac{\alpha}{2} \end{align} \] where \( t_\frac{\alpha}{2} \) is the upper \( \frac{\alpha}{2} \) critical point of the t distribution with \( n - 1 \) degrees of freedom.
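The formula above translates directly into a small helper function (a hypothetical `t_ci()`, written here only as a sketch; R's built-in `t.test()` is what one would use in practice):

```r
# Manual t confidence interval for the mean; x is any numeric sample.
t_ci <- function(x, conf = 0.95) {
  n <- length(x)
  alpha <- 1 - conf
  se <- sd(x) / sqrt(n)                   # estimated standard error
  t_crit <- qt(1 - alpha / 2, df = n - 1) # upper alpha/2 critical point
  c(lower = mean(x) - t_crit * se, upper = mean(x) + t_crit * se)
}
t_ci(c(5.1, 4.9, 5.3, 5.0, 4.8))          # example with a made-up sample
```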
In practice, this is how we will more typically compute a confidence interval on the mean.
Because this is the most common way to compute such a confidence interval in practice, there are actually built-in functions in the R language to handle this.
Computing the confidence interval “manually” with the quantile function is mostly pedagogical, but we will demonstrate how this is done with a few more examples.
Shortly, we will learn how to compute confidence intervals and hypothesis tests with the t distribution using the t.test() function in R.
For now, in order to compute the t confidence interval manually, we need to find the appropriate critical value for the equation
\[ \overline{x} - \frac{s}{\sqrt{n}} t_\frac{\alpha}{2} \leq \mu \leq \overline{x} + \frac{s}{\sqrt{n}}t_\frac{\alpha}{2} \]
We can find this critical point in the same way as for the normal, using R, as follows.
Suppose we have a sample size of \( n=20 \); this gives \( n-1=19 \) degrees of freedom, i.e.,
t_alpha_over_2 <- qt(0.975, df=19)
t_alpha_over_2
[1] 2.093024
is the critical point for the \( 95\% \) two-sided confidence interval.
t_alpha_over_2 <- qt(0.995, df=19)
t_alpha_over_2
[1] 2.860935
Similarly, this is the critical point for the \( 99\% \) two-sided confidence interval.
An article in the Journal of Materials Engineering describes the results of tensile adhesion tests on alloy specimens; the load at failure for each specimen is recorded in the sample below.
alloy_load_failures <- c(19.8, 10.1, 14.9, 7.5, 15.4, 15.4, 15.4, 18.5, 7.9, 12.7, 11.9, 11.4, 11.4, 14.1, 17.6, 16.7, 15.8, 19.5, 8.8, 13.6, 11.9, 11.4)
n <- length(alloy_load_failures)
n
[1] 22
x_bar <- mean(alloy_load_failures)
x_bar
[1] 13.71364
s <- sd(alloy_load_failures)
s
[1] 3.553576
se <- s / sqrt(n)
se
[1] 0.7576249
Notice that if we want to compute the \( 95\% \) confidence interval of the mean, we cannot accurately use the z critical value, as the sample size is under 40 and we do not know the population standard deviation.
Therefore, we compute the t critical value as
t_alpha_over_2 = qt(0.975, df=n-1)
t_alpha_over_2
[1] 2.079614
ci <- c(x_bar - se * t_alpha_over_2, x_bar + se * t_alpha_over_2)
ci
[1] 12.13807 15.28920
z_alpha_over_2 <- qnorm(0.975)
z_alpha_over_2
[1] 1.959964
t_alpha_over_2
[1] 2.079614
This demonstrates the way in which the t distribution models the increased uncertainty of the population mean, due to the unknown population standard deviation.
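Concretely, we can compare the widths of the z and t intervals for the alloy data; the t interval is wider, reflecting the extra uncertainty:

```r
# Interval-width comparison on the alloy load-at-failure data.
alloy_load_failures <- c(19.8, 10.1, 14.9, 7.5, 15.4, 15.4, 15.4, 18.5, 7.9,
                         12.7, 11.9, 11.4, 11.4, 14.1, 17.6, 16.7, 15.8, 19.5,
                         8.8, 13.6, 11.9, 11.4)
n <- length(alloy_load_failures)
se <- sd(alloy_load_failures) / sqrt(n)
z_width <- 2 * qnorm(0.975) * se          # width if we (inaccurately) used z
t_width <- 2 * qt(0.975, df = n - 1) * se # width of the correct t interval
c(z_width = z_width, t_width = t_width)   # t_width exceeds z_width
```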
As mentioned before, this process of manually computing confidence intervals is really just pedagogical.
We will now begin to introduce the realistic way confidence intervals are computed in practice.
alloy_load_failures
[1] 19.8 10.1 14.9 7.5 15.4 15.4 15.4 18.5 7.9 12.7 11.9 11.4 11.4 14.1 17.6
[16] 16.7 15.8 19.5 8.8 13.6 11.9 11.4
t.test(alloy_load_failures)
One Sample t-test
data: alloy_load_failures
t = 18.101, df = 21, p-value = 2.731e-14
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
12.13807 15.28920
sample estimates:
mean of x
13.71364
t.test(alloy_load_failures, conf.level=0.99)
One Sample t-test
data: alloy_load_failures
t = 18.101, df = 21, p-value = 2.731e-14
alternative hypothesis: true mean is not equal to 0
99 percent confidence interval:
11.56853 15.85874
sample estimates:
mean of x
13.71364
In reality, this is the default way that one will compute a confidence interval on the mean.
We will begin to favor this approach over the pedagogical approach of constructing confidence intervals using qt or qnorm.
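Note that t.test() returns a list-like object, so the interval can also be extracted programmatically rather than read from the printed summary:

```r
# Extract the confidence interval and point estimate from t.test()'s value.
alloy_load_failures <- c(19.8, 10.1, 14.9, 7.5, 15.4, 15.4, 15.4, 18.5, 7.9,
                         12.7, 11.9, 11.4, 11.4, 14.1, 17.6, 16.7, 15.8, 19.5,
                         8.8, 13.6, 11.9, 11.4)
result <- t.test(alloy_load_failures, conf.level = 0.95)
result$conf.int    # the 95% confidence interval as a numeric vector
result$estimate    # the sample mean
```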
So far, we have shown how a parameter of a population can be estimated from sample data. We first showed how to construct a point estimate based on a sample; however, a point estimate alone gives no indication of its precision.
In order to rectify the issue with only providing a single point estimate, we constructed an interval of likely values called a confidence interval.
With a level of confidence \( (1 -\alpha)\times 100\% \), specified in terms of the failure rate \( \alpha \), we supplied a range of plausible values for the parameter given the sample on hand.
In many situations, a dual type of problem is of interest, where we will be concerned with how unlikely a possible parameter value might be.
For a \( 95\% \) level of confidence, we had an \( \alpha=5\% \) rate of failure in the confidence interval procedure.
This principle has been the basis of us finding \( z_\frac{\alpha}{2} \) and \( t_\frac{\alpha}{2} \) critical values for \( \alpha = 0.05 \) corresponding to \( 5\% \).
Particularly, over many replications of the sampling procedure, we would find it unlikely for more than about \( 1 \) out of \( 20 \) of the associated confidence intervals to fail to contain the true parameter.
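This failure-rate interpretation can be illustrated with a small simulation (the normal population with \( \mu = 10 \) below is an arbitrary choice for illustration):

```r
# Simulate repeated sampling: about 5% of 95% t intervals should miss mu.
set.seed(7)
mu <- 10
misses <- replicate(2000, {
  x <- rnorm(15, mean = mu, sd = 3)            # one replication's sample
  ci <- t.test(x, conf.level = 0.95)$conf.int  # its 95% t interval
  mu < ci[1] || mu > ci[2]                     # TRUE if the interval misses mu
})
mean(misses)   # close to 0.05, i.e., roughly 1 failure in 20
```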
In real applications, there may be two competing claims (or hypotheses) about the value of a parameter.
The engineer must use sample data to determine which claim is most plausible, and which one can be rejected as unlikely.
For example, suppose that an engineer is designing an air crew escape system that consists of an ejection seat and a rocket motor.
The rocket motor contains a propellant, and for the ejection seat to function properly, the propellant should have a mean burning rate of 50 cm/sec.
If the burning rate is too low, the ejection seat may not function properly, leading to an unsafe ejection and possible injury of the pilot.
Higher burning rates may imply instability in the propellant or an ejection seat that is too powerful, again leading to possible pilot injury.
The practical engineering question that must be answered is: Does the mean burning rate of the propellant equal 50 cm/sec, or is it some other value (either higher or lower)?
This type of question can be answered using a statistical technique called hypothesis testing.
We have already gotten some idea of the duality of these problems, as t.test() computes both simultaneously.
We will now develop this idea more formally.
Statistical Hypothesis
A statistical hypothesis is a statement about the parameters of one or more populations.
Because we use probability distributions to model populations, a statistical hypothesis may also be thought of as a statement about the probability distribution of a random variable.
The hypothesis will usually involve one or more parameters of this distribution.
For example, consider the air crew escape system described already.
Suppose that we are interested in the burning rate of the solid propellant.
Burning rate is a random variable that can be described by a probability distribution.
Suppose that our interest focuses on the mean burning rate (a parameter of this distribution).
Specifically, we are interested in deciding whether or not the mean burning rate is \( 50 \) centimeters per second.
We may express this formally as
\[ \begin{align} H_0: & \mu = 50 \text{ centimeters per second}\\ H_1: & \mu \neq 50 \text{ centimeters per second} \end{align} \]
\( H_0 \) is known as the null hypothesis and \( H_1 \) is known as the alternative hypothesis.
In hypothesis testing, the null and alternative hypotheses have special meanings philosophically and in the mathematics.
We cannot generally “prove” a hypothesis to be true; instead, we can only determine whether a hypothesis seems unlikely enough to reject.
To begin such a test formally, we need to first make some assumption about the true parameter.
The null hypothesis \( H_0 \) will always take the form of an equality, or an inclusive inequality.
\[ \begin{align} H_0: & \theta \text{ is } (= / \leq / \geq) \text{ some proposed value}. \end{align} \]
In our example, \[ \begin{align} H_0: & \mu = 50 \text{ centimeters per second}. \end{align} \]
The contradictory / competing hypothesis is the alternative hypothesis, written
\[ \begin{align} H_1: & \theta \text{ is } (\neq / > / <) \text{ some proposed value} \end{align} \]
In our example, \[ \begin{align} H_1: & \mu \neq 50 \text{ centimeters per second}. \end{align} \]
Once we have formed a null and alternative hypothesis:
\[ \begin{align} H_0: & \theta \text{ is } (= / \leq / \geq) \text{ some proposed value}\\ H_1: & \theta \text{ is } (\neq / > / <) \text{ some proposed value} \end{align} \]
we use the sample data to consider how likely or unlikely it was to observe such data with the proposed parameter.
If the null hypothesis is sufficiently unlikely, we reject the null hypothesis in favor of the alternative hypothesis.
However, if the evidence (the sample) doesn't contradict the null hypothesis, we tentatively keep this assumption.
In our example, we would say either that we reject \( H_0 \) in favor of \( H_1 \), or that we fail to reject \( H_0 \).
In our example, the alternative hypothesis specifies values of \( \mu \) that could be either greater or less than 50 centimeters per second; this is called a two-sided alternative hypothesis.
In some situations, we may wish to formulate a one-sided alternative hypothesis, as in
\[ \begin{align} H_0: & \mu \geq 50\text{ centimeters per second} \\ H_1: & \mu < 50\text{ centimeters per second} \end{align} \]
or
\[ \begin{align} H_0: & \mu \leq 50\text{ centimeters per second} \\ H_1: & \mu > 50\text{ centimeters per second} \end{align} \]
The above situations have an exact analogy with one-sided confidence bounds, similar to the two-sided test and the two-sided confidence interval.
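When we return to t.test(), these one-sided formulations correspond to its alternative argument (a sketch; the burning-rate data below are simulated for illustration):

```r
# One-sided test of H0: mu <= 50 vs H1: mu > 50 using simulated burning rates.
set.seed(11)
burn_rates <- rnorm(25, mean = 51, sd = 2)  # hypothetical sample, cm/sec
res <- t.test(burn_rates, mu = 50, alternative = "greater")
res$alternative   # "greater"
res$p.value       # probability of data at least this extreme under H0
```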
We will now elaborate on the meaning of determining if a hypothesis is sufficiently unlikely.