04/09/2020



- The following topics will be covered in this lecture:
- Review of point estimates, confidence intervals and critical values
- Margin of error
- Estimating a population proportion
- Finding the right sample size
- Estimating a population mean
- The Student t distribution
- Confidence intervals for the mean
- The special case when \( \sigma \) is known
- Finding the right sample size

Courtesy of Mario Triola, *Essentials of Statistics*, 6th edition

- In the last lecture, we saw how a sample proportion generates a random variable.
- That is, we take a sample of a population and compute the proportion of the sample for which some statement is true.
- Suppose we want to **replicate this sampling procedure** **infinitely many times**.
- It is impossible to replicate the sampling infinitely many times, but we can **construct a probabilistic model for this replication process** with a **probability distribution**.

- Formally, we will **define \( \hat{p} \)** to be the **random variable** equal to the **proportion derived from a random sample of \( n \) observations**.
- For **each replication**, **\( \hat{p} \) attains a different value based on chance**.
- Then, for **random, independent samples**, **\( \hat{p} \)** tends to be **normally distributed** about **\( p \)**.
- We can thus use the value of **\( \hat{p} \)** and the **distribution of \( \hat{p} \)** to estimate **\( p \)** and how close we are to it.
- We know that **\( \hat{p} \)** is an **unbiased estimator** of the **true population proportion \( p \)**.
- That is, over infinitely many resamplings, the **expected value** **(mean of the probability distribution)** of **\( \hat{p} \)** is equal to **\( p \)**.
- When we have a **specific sample data set**, and a specific value for **\( \hat{p} \)** associated to it, **\( \hat{p} \)** is called a **point estimate** for **\( p \)**.
- The **measure of “how close”** we think this is to the true value is called a **confidence interval**.
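The claim that \( \hat{p} \) is an unbiased estimator of \( p \) can be illustrated with a small simulation. The sketch below (the true proportion \( p = 0.30 \), the sample size \( n = 200 \), and the number of replications are assumptions chosen for illustration, not values from the lecture) repeatedly draws samples and averages the resulting \( \hat{p} \) values:

```python
import random

random.seed(42)

p = 0.30          # hypothetical true population proportion (assumed for illustration)
n = 200           # observations per sample
replications = 10_000

# Each replication draws a fresh sample of n observations and records the
# sample proportion p_hat; across replications, p_hat behaves as a random variable.
p_hats = []
for _ in range(replications):
    successes = sum(1 for _ in range(n) if random.random() < p)
    p_hats.append(successes / n)

# The average of the p_hat values approximates E[p_hat], which should be
# close to p since p_hat is an unbiased estimator of p.
mean_p_hat = sum(p_hats) / replications
print(round(mean_p_hat, 3))
```

With this many replications, the average of the simulated \( \hat{p} \) values lands very close to the true \( p \), consistent with unbiasedness.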


- Let’s recall how we constructed **confidence intervals (CIs)** in the last lecture.
- Suppose that we want to estimate the **true proportion \( p \)** with some **level of confidence**:
- if we replicated the sampling procedure infinitely many times, the **fraction of replications** in which we found **\( p \)** in our confidence interval would be **equal to the level of confidence**.
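This coverage interpretation can be checked empirically: draw many samples, build an interval around each \( \hat{p} \), and count how often the interval captures the true \( p \). The sketch below assumes the standard interval \( \hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n} \) (a construction covered later under the margin of error), and the values of \( p \), \( n \), and the replication count are chosen for illustration:

```python
import math
import random
from statistics import NormalDist

random.seed(1)

p = 0.40              # hypothetical true proportion (assumed for illustration)
n = 500               # observations per sample
confidence = 0.95
# Critical value z_{alpha/2} for the chosen confidence level.
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

replications = 2_000
covered = 0
for _ in range(replications):
    successes = sum(1 for _ in range(n) if random.random() < p)
    p_hat = successes / n
    # Interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    if p_hat - half_width <= p <= p_hat + half_width:
        covered += 1

print(covered / replications)   # fraction of intervals that capture p
```

The printed fraction should sit near the chosen confidence level of 0.95, matching the infinite-replication interpretation above.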

- Let’s take an example **confidence level of \( 95\% \)**; this corresponds to a **rate of failure of \( 5\% \)** over infinitely many replications.
- Generally, we will write the confidence level as \[ (1 - \alpha) \times 100\% \] so that we can associate this confidence level with its rate of failure \( \alpha \).
- Recall that we earlier studied ways to **compute the critical value associated to some \( \alpha \)** for the normal distribution.
- We will use the same principle here to find **how wide the interval around \( p \)** must be for **\( \hat{p} \)** to lie in this interval \( (1-\alpha)\times 100\% \) of the time.

- We want to find the critical value \( z_\frac{\alpha}{2} \) for which:
- \( (1-\frac{\alpha}{2})\times 100\% \) of the area under the normal density lies to the left of \( z_\frac{\alpha}{2} \); and
- \( (1-\frac{\alpha}{2})\times 100\% \) of the area under the normal density lies to the right of \( -z_\frac{\alpha}{2} \).
- Put together, \( (1-\alpha)\times 100\% \) of values lie within \( [-z_\frac{\alpha}{2},z_\frac{\alpha}{2}] \).
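The critical value \( z_\frac{\alpha}{2} \) can be computed directly as the point with \( (1-\frac{\alpha}{2})\times 100\% \) of the standard normal area to its left, i.e. as an inverse-CDF evaluation. A minimal sketch using Python’s standard library (the helper name `critical_value` is ours, not the lecture’s):

```python
from statistics import NormalDist

def critical_value(confidence_level):
    """Return z_{alpha/2} for a given confidence level (1 - alpha)."""
    alpha = 1 - confidence_level
    # Inverse CDF of the standard normal at 1 - alpha/2: the point with
    # (1 - alpha/2) * 100% of the area to its left.
    return NormalDist().inv_cdf(1 - alpha / 2)

print(f"{critical_value(0.95):.3f}")   # 1.960
print(f"{critical_value(0.90):.3f}")   # 1.645
print(f"{critical_value(0.99):.3f}")   # 2.576
```

By symmetry of the normal density, the same \( (1-\frac{\alpha}{2})\times 100\% \) of the area lies to the right of \( -z_\frac{\alpha}{2} \), so \( (1-\alpha)\times 100\% \) of the area falls in \( [-z_\frac{\alpha}{2}, z_\frac{\alpha}{2}] \).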