The following topics will be covered in this lecture:

- Random samples
- Point estimators
- The central limit theorem
- Standard error

Suppose we take a sample of \( n = 10 \) observations \( \{x_{1,i}\}_{i=1}^{10} \) from a population and compute the sample average,

\[ \overline{x}_1 = \frac{1}{n} \sum_{i=1}^n x_{1,i} = \frac{1}{10}\sum_{i=1}^{10} x_{1,i} \]

getting the result \( \overline{x}_1 = 10.2 \).

Now we repeat this process, taking a second sample of \( n = 10 \) observations from the same population,

\[ \{x_{2,i}\}_{i=1}^{10} \]

and the resulting sample average is \( \overline{x}_2=10.4 \).

This discrepancy is what we call **sampling error**, in which the **random variation in a sample** of a fixed size \( n \) upon replication produces **differences in the computation of a statistic**.

The sample average depends on the observations in the sample, which differ from sample to sample because they are random variables.

Consequently, the **sample average** (or any other function of the sample data) is a **random variable**.

Because a **statistic is a random variable**, it has a **probability distribution**.
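To make the idea of sampling error concrete, the following is a minimal sketch in Python with NumPy. The population here is an assumption made only for illustration (normal with mean 10 and standard deviation 1); the lecture does not specify the population, and the numbers produced will not match the values 10.2 and 10.4 quoted above.

```python
import numpy as np

# Assumed population for illustration only: normal, mean 10, standard deviation 1.
rng = np.random.default_rng(seed=0)
n = 10

# Two independent samples of the same fixed size n from the same population.
sample_1 = rng.normal(loc=10.0, scale=1.0, size=n)
sample_2 = rng.normal(loc=10.0, scale=1.0, size=n)

# The same statistic (the sample average) computed on each sample.
x_bar_1 = sample_1.mean()
x_bar_2 = sample_2.mean()

# The two averages differ only because the samples differ: sampling error.
discrepancy = x_bar_1 - x_bar_2
print(f"x_bar_1 = {x_bar_1:.3f}, x_bar_2 = {x_bar_2:.3f}, difference = {discrepancy:.3f}")
```

Changing the seed changes both averages, which is exactly the sense in which the sample average behaves as a random variable.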

Specifically, suppose that we want to obtain an estimate of a **population parameter**, where the population is modeled with a random variable \( X \).

We know that before the data are collected, the observations are considered to be random variables,

- i.e., we treat an **independent sequence of measurements** of \( X \),

\[ X_1, X_2, \cdots , X_n \]

- as random variables all drawn from a **parent distribution** \( X \sim F_X(x) \) (where the CDF \( F_X \) defines the distribution).

**Random sample**

The random variables \( X_1 , X_2, \cdots , X_n \) are a **random sample** of size \( n \) if the \( X_i \)’s are independent random variables and every \( X_i \) has the same probability distribution.

We then say that the **measurements** we obtain are **possible outcomes** of the sample variables \( \{X_i\}_{i=1}^n \); particularly, if we make a computation of the sample mean,

\[ \overline{X} = \frac{1}{n} \sum_{i=1}^n X_i \]

the above is treated as a random variable (a linear combination of random variables) which has a random outcome, dependent on the realizations of the \( X_i \).

More generally, any function of the observations, i.e., **any statistic**, is also modeled as a **random variable**.

If \( h \) is a **general function** used to **compute some statistic**, we thus define

\[ \tilde{X} = h(X_1, \cdots, X_n) \]

to be a **random variable** that will depend on the particular realizations of \( X_1,\cdots, X_n \).

We call the **probability distribution of a statistic** a **sampling distribution**.

**Sampling Distribution**

The probability distribution of a statistic is called a **sampling distribution**.
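A sampling distribution can be approximated by simulation: draw many replicate samples, apply \( h \) to each, and look at the resulting collection of values. The sketch below assumes an exponential parent distribution with mean 2 purely for illustration; the helper names `sampling_distribution` and `draw_exponential` are hypothetical and not part of the lecture.

```python
import numpy as np

def sampling_distribution(h, draw_sample, n, replications, rng):
    """Return `replications` realizations of the statistic h(X_1, ..., X_n)."""
    return np.array([h(draw_sample(rng, n)) for _ in range(replications)])

def draw_exponential(rng, n):
    # Hypothetical parent distribution: exponential with mean 2.
    return rng.exponential(scale=2.0, size=n)

rng = np.random.default_rng(seed=1)

# Two different choices of h give two different sampling distributions.
means = sampling_distribution(np.mean, draw_exponential, n=25, replications=5000, rng=rng)
medians = sampling_distribution(np.median, draw_exponential, n=25, replications=5000, rng=rng)

# A histogram of either array approximates the corresponding sampling distribution;
# here we only summarize each by its center and spread.
print("sample mean:   center", means.mean(), "spread", means.std(ddof=1))
print("sample median: center", medians.mean(), "spread", medians.std(ddof=1))
```

Each entry of `means` or `medians` is one fixed, observed value of the statistic; the array as a whole stands in for the distribution of the random variable \( \tilde{X} \).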

Given **particular realizations** of the sample random variables, we obtain a **fixed numerical value**.

Each numerical value in a data set is treated as the observed realization of a random variable.

Given particular realizations \( x_1,\cdots,x_n \) of the random variables \( X_1, \cdots, X_n \), the value

\[ \overline{x} = \frac{1}{n}\sum_{i=1}^n x_i \]

is **not a random variable**, as this is a **fixed numerical value**.

Given some particular, observed realizations \( x_1, \cdots,x_n \),

\[ \tilde{x} = h(x_1, \cdots, x_n) \]

is a fixed numerical value, based on the fixed, observed data values \( x_1, \cdots, x_n \).

When discussing inference problems, it is convenient to have a **general symbol to represent the parameter** of interest; we use the Greek symbol \( \theta \) (theta) to represent the parameter.

The symbol \( \theta \) can represent the mean \( \mu \), the variance \( \sigma^2 \), or any parameter of interest to us.

The objective of **point estimation** is to **estimate a single number based on sample data** that is the most plausible value for \( \theta \).

The numerical value of a sample statistic is used as the point estimate.

Once we describe the process of point estimation, the next step is to describe how we **quantify the uncertainty of the estimate**.

If \( X \) is a random variable with probability distribution \( F_X(x) \), characterized by the unknown parameter \( \theta \),

- and if \( X_1 , X_2, \cdots , X_n \) is a random sample of size \( n \) from \( X \),

the statistic \( \hat{\Theta} = h(X_1 , X_2 , \cdots , X_n ) \) given as a function of the sample is called a **point estimator of \( \theta \)**.

Note that \( \hat{\Theta} \) is a **random variable** because it is a function of random variables.

After the sample has been selected, \( \hat{\Theta} \) takes on a particular numerical value \( \hat{\theta} \) called the point estimate of \( \theta \).

The **uncertainty of the point estimate \( \hat{\theta} \)** can be understood as **how large a discrepancy the sampling error will cause** between \( \hat{\theta} \) and the true \( \theta \).

We will now introduce some formal definitions:

**Point estimators**

A **point estimate** of some population parameter \( \theta \) is a single numerical value \( \hat{\theta} \) of a statistic \( \hat{\Theta} \). This value is a particular realization of \( \hat{\Theta} \); viewed as a random variable, the statistic \( \hat{\Theta} \) is called the **point estimator**.

Estimation problems modeled as above occur frequently in engineering.

We often need to estimate

- The mean \( \mu \) of a single population
- The variance \( \sigma^2 \) (or standard deviation \( \sigma \)) of a single population
- The proportion \( p \) of items in a population that belong to a class of interest
- The difference in means of two populations, \( \mu_1 - \mu_2 \)
- The difference in two population proportions, \( p_1 - p_2 \)

Reasonable point estimates of these parameters are as follows:

- For \( \mu \), the estimate is \( \hat{\mu}=\overline{x} \), the **sample mean**.
- For \( \sigma^2 \), the estimate is \( \hat{\sigma}^2 = s^2 \), the **sample variance**.
- For \( p \), the estimate is \( \hat{p}=\frac{x}{n} \), the **sample proportion**, where \( x \) is the number of items in a random sample of size \( n \) that belong to the class of interest.
- For \( \mu_1 -\mu_2 \), the estimate is \( \hat{\mu}_1 - \hat{\mu}_2 = \overline{x}_1 - \overline{x}_2 \), the **difference between the sample means** of two independent random samples.
- For \( p_1 - p_2 \), the estimate is \( \hat{p}_1 - \hat{p}_2 \), the **difference between two sample proportions** computed from two independent random samples.

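As a rough illustration of how these point estimates are computed in practice, the sketch below uses NumPy on two simulated samples. The data, the sample sizes, and the "class of interest" (measurements above 6) are all assumptions invented for the example.

```python
import numpy as np

rng = np.random.default_rng(seed=2)

# Hypothetical data: two independent random samples of measurements.
sample_a = rng.normal(loc=5.0, scale=2.0, size=40)
sample_b = rng.normal(loc=4.0, scale=2.0, size=50)

# Sample mean as the estimate of mu.
mu_hat = sample_a.mean()

# Sample variance s^2 as the estimate of sigma^2 (ddof=1 gives the n - 1 divisor).
sigma2_hat = sample_a.var(ddof=1)

# Sample proportion: x items out of n fall in the (assumed) class "value above 6".
x = int(np.sum(sample_a > 6.0))
p_hat = x / sample_a.size

# Differences between two independent samples.
diff_means_hat = sample_a.mean() - sample_b.mean()
diff_props_hat = p_hat - np.sum(sample_b > 6.0) / sample_b.size

print(mu_hat, sigma2_hat, p_hat, diff_means_hat, diff_props_hat)
```

Note that NumPy's `var` defaults to the divisor \( n \); passing `ddof=1` is what makes it the sample variance \( s^2 \).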
Although a point estimate may be the “best” estimate for a population parameter given a single sample, it is critically important to understand how far this estimate might be from the true value.

In order to determine the accuracy of this estimate, we use the concept of the **sampling distribution** to **derive hypothesis tests and confidence intervals**.

Let's consider a simple argument for the sampling distribution of the sample mean \( \overline{X} \).

Suppose that a random sample of size \( n \) is taken from a **normal population** with mean \( \mu \) and variance \( \sigma^2 \).

By definition of a **random sample**, each observation in this sample, say, \( X_1, X_2, \cdots, X_n \), is a **normally and independently distributed random variable** with mean \( \mu \) and variance \( \sigma^2 \).

A special property of the normal distribution is that it can be translated and rescaled while remaining normal;

- similarly, a sum of independent, normally distributed random variables is also normally distributed.

We conclude that the sample mean

\[ \overline{X}= \frac{X_1 + X_2 + \cdots + X_n}{n} \]

has a normal distribution with mean

\[ \mu_{\overline{X}} = \frac{\mu + \mu + \cdots + \mu}{n} = \mu \]

and variance

\[ \sigma^2_{\overline{X}} = \frac{\sigma^2 + \sigma^2 + \cdots + \sigma^2}{n^2} = \frac{\sigma^2}{n} \]
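The mean and variance of \( \overline{X} \) derived above can be checked numerically. This is a minimal sketch, assuming illustration values \( \mu = 10 \), \( \sigma = 2 \), \( n = 16 \) and 20000 replications; none of these values come from the lecture.

```python
import numpy as np

rng = np.random.default_rng(seed=3)

# Assumed illustration values for the normal population and the experiment.
mu, sigma = 10.0, 2.0
n = 16        # size of each sample
m = 20000     # number of replicated samples

# Each row is one random sample X_1, ..., X_n from N(mu, sigma^2).
samples = rng.normal(loc=mu, scale=sigma, size=(m, n))

# One realization of the sample mean per row.
sample_means = samples.mean(axis=1)

empirical_mean = sample_means.mean()
empirical_var = sample_means.var(ddof=1)
theoretical_var = sigma**2 / n

print(f"mean of sample means:     {empirical_mean:.4f} (theory: {mu})")
print(f"variance of sample means: {empirical_var:.4f} (theory: {theoretical_var})")
```

The empirical variance of the replicated means should sit near \( \sigma^2 / n \) rather than near \( \sigma^2 \), reflecting the \( n^2 \) in the denominator of the derivation.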

More generally, if we are sampling from a population that has an unknown probability distribution, the **sampling distribution of the sample mean** will still be **approximately normal** with mean \( \mu \) and variance \( \frac{\sigma^2}{n} \) if the sample size \( n \) is large.

This is one of the most useful theorems in statistics, called the **central limit theorem**:

**The central limit theorem**

Let \( X_1 , X_2 , \cdots , X_n \) be a random sample of size \( n \) taken from a population with mean \( \mu \) and finite variance \( \sigma^2 \) and \( \overline{X} \) be the sample mean. Then the **limiting form of the distribution** of

\[ Z = \frac{\overline{X} - \mu}{\frac{\sigma}{\sqrt{n}}} \]

as \( n \rightarrow \infty \) is the **standard normal distribution**.

Put another way, for \( n \) sufficiently large, \( \overline{X} \) has **approximately** a \( N\left(\mu, \frac{\sigma^2}{n}\right) \) distribution. This says the following.

- Suppose we take a sample of size \( n \) and compute the sample mean \( \overline{X} \).
- Then suppose we replicate this sample and record the observed realizations for the sample mean \( \overline{x}_1, \overline{x}_2, \cdots \).
- If the sample size \( n \) is large, these data points \( \overline{x}_1, \cdots \) will be approximately bell shaped with the following properties:
  - the bell will be centered approximately at \( \mu \), the true population mean;
  - the spread of the data around the center will be given approximately by the standard deviation \( \frac{\sigma}{\sqrt{n}} \).
- Particularly, if \( n \) is very large, the observed sample means will be very close to the center (the true mean).
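The statement of the theorem can be explored with a small simulation. The sketch below assumes a deliberately non-normal population, the exponential distribution with mean 1 and standard deviation 1, and compares the standardized sample mean \( Z \) against one property of the standard normal: roughly 95% of its mass lies in \( [-1.96, 1.96] \). The function name `standardized_means` and all numerical settings are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(seed=4)

# Assumed non-normal population: exponential with scale 1, so mu = sigma = 1.
mu = 1.0
sigma = 1.0
m = 10000  # number of replicated samples for each n

def standardized_means(n):
    """Return m realizations of Z = (X_bar - mu) / (sigma / sqrt(n))."""
    samples = rng.exponential(scale=1.0, size=(m, n))
    x_bar = samples.mean(axis=1)
    return (x_bar - mu) / (sigma / np.sqrt(n))

for n in (2, 10, 100):
    z = standardized_means(n)
    # For a standard normal variable this fraction is about 0.95.
    inside = np.mean(np.abs(z) <= 1.96)
    print(f"n = {n:3d}: mean(Z) = {z.mean():+.3f}, sd(Z) = {z.std(ddof=1):.3f}, "
          f"P(|Z| <= 1.96) ~ {inside:.3f}")
```

Plotting a histogram of `z` for each \( n \) would show the skew of the exponential population fading as \( n \) grows, which is the bell shape described in the list above.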

- As a visualization of the concept, suppose again that we have a random sample indexed by \( j \), \[ X_{j,1}, \cdots, X_{j,n}. \]
- We will make replications for \( j=1,\cdots,m \) and get a random variable for the sample mean indexed by \( j \), \[ \overline{X}_j = \frac{1}{n}\sum_{i=1}^n X_{j,i}. \]
- When we observe a realization \( \overline{X}_j=\overline{x}_j \) of the sample mean, or respectively of the sample \[ X_{j,1}=x_{j,1}, \cdots, X_{j,n}=x_{j,n}, \] we record these fixed numerical values.