The following topics will be covered in this lecture:
- Continuous random variables and their direct analogs with discrete random variables
- The continuous uniform distribution: density, cumulative distribution function, mean, and variance
- The normal distribution, the standard normal, and z-scores
For a continuous random variable, the concepts from discrete random variables have direct analogs.
We have the following correspondences:
Discrete | Continuous |
---|---|
Probability mass function \( f(x) \) | Probability density function \( f(x) \) |
\( P(X=x_\alpha) = f(x_\alpha) \) | \( P(a \leq X \leq b) = \int_{a}^b f(x)\,\mathrm{d}x \) |
Cumulative distribution function \( F(x)=P(X\leq x) \) | Cumulative distribution function \( F(x)=P(X\leq x) \) |
\( F(x) = \sum_{x_\alpha \leq x} f(x_\alpha) \) | \( F(x) = \int_{-\infty}^{x}f(u)\,\mathrm{d}u \) |
\( \mu = \sum_{x_\alpha \in \mathbf{R}} x_\alpha f(x_\alpha) \) | \( \mu = \int_{-\infty}^{\infty} x f(x)\,\mathrm{d}x \) |
\( \sigma^2 = \sum_{x_\alpha \in \mathbf{R}}(x_\alpha - \mu)^2 f(x_\alpha) \) | \( \sigma^2 = \int_{-\infty}^{\infty}( x -\mu)^2 f(x)\,\mathrm{d}x \) |
Because continuous measurements can be arbitrarily sub-divided into smaller units, the probability that a continuous random variable takes any single exact value is always zero.
In particular, with continuous random variables we always define probabilities over ranges of values; a single recorded measurement really represents a small range, determined by the truncation (precision) of the measuring device.
Otherwise, the ideas are extremely similar: we simply replace sums with integrals (limits of Riemann sums).
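As a minimal illustration (not part of the original lecture), these correspondences can be checked by numerical integration; the density \( f(x)=2x \) on \( [0,1] \) below is an invented example:

```python
# A minimal sketch of the sum-to-integral correspondence, using the invented
# density f(x) = 2x on [0, 1] (not an example from the lecture).
from scipy.integrate import quad

f = lambda x: 2.0 * x  # probability density, zero outside [0, 1]

# P(0.5 <= X <= 1) = integral of f over [0.5, 1]
prob, _ = quad(f, 0.5, 1.0)

# mean and variance via the integral formulas in the table above
mu, _ = quad(lambda x: x * f(x), 0.0, 1.0)
var, _ = quad(lambda x: (x - mu) ** 2 * f(x), 0.0, 1.0)

print(prob, mu, var)  # ~0.75, ~0.6667, ~0.0556
```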
We will now consider two very common continuous probability distributions.
Let's recall the copper current example from the last lecture.
The random variable \( X \) has a probability density defined as
\[ \begin{align} f(x) = \begin{cases} 5.0 & x\in[4.9, 5.1]\\ 0.0 & \text{else} \end{cases} \end{align} \]
Similarly we found that the cumulative distribution function is given by
\[ \begin{align} F(x)&= \begin{cases} 0.0 & x \in(-\infty, 4.9)\\ 5.0\left(x - 4.9\right) & x \in[4.9, 5.1]\\ 1.0 & x \in (5.1,\infty) \end{cases} \end{align} \]
The above is actually a specific example of the general continuous uniform probability distribution.
The continuous uniform is the probability model for a continuous random variable in which we assign equal probability to any sub-interval of equal length that lies in the range.
For example, in the above, we can see that
\[ \begin{align} P(X\in [4.9,5.0]) = 0.5; & & P(X\in [5.0,5.1]) = 0.5. \end{align} \]
More generally, a simple \( \text{height}\times \text{width} \) area argument shows that for any \( a,b\in[4.9,5.1] \),
\[ \begin{align} b-a=0.1 & & \Rightarrow & & P(X\in[a,b]) = 0.5. \end{align} \]
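This property is easy to verify computationally; the following is a small sketch (not from the slides) built directly on the piecewise cumulative distribution function above:

```python
# Sanity check (not from the original slides): any sub-interval of length 0.1
# inside [4.9, 5.1] has probability 0.5 under the piecewise CDF above.
def F(x):
    """CDF of the copper-current uniform on [4.9, 5.1]."""
    if x < 4.9:
        return 0.0
    if x > 5.1:
        return 1.0
    return 5.0 * (x - 4.9)

for a in (4.9, 4.95, 5.0):
    b = a + 0.1
    print(f"P(X in [{a}, {b:.2f}]) = {F(b) - F(a):.2f}")  # 0.50 each time
```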
Let \( b>a \) be real numbers. A continuous random variable \( X \) with the probability density function \[ \begin{align} f(x) = \begin{cases} \frac{1}{b-a} & x \in[a,b]\\ 0 & \text{else} \end{cases}\end{align} \] is a continuous uniform random variable.
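If we want to compute with this model, one option is scipy.stats.uniform; note that it is parameterized by loc \( =a \) and scale \( =b-a \). A brief sketch using the copper-current endpoints:

```python
# A sketch using scipy.stats.uniform, which is parameterized by loc = a and
# scale = b - a; the endpoints are taken from the copper-current example.
from scipy.stats import uniform

a, b = 4.9, 5.1
X = uniform(loc=a, scale=b - a)

print(X.pdf(5.0))               # density 1/(b - a) = 5.0 inside [a, b]
print(X.pdf(4.8))               # 0.0 outside the interval
print(X.cdf(5.0) - X.cdf(4.9))  # P(4.9 <= X <= 5.0) = 0.5
```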
[Figure courtesy of Montgomery & Runger, Applied Statistics and Probability for Engineers, 7th edition]
The simple form of the continuous uniform distribution makes it easy to compute some key parameters.
For example,
\[ \begin{align} \mu &= \int_{-\infty}^\infty xf(x) \mathrm{d}x \\ &= \int_{a}^b \frac{x}{b-a} \mathrm{d}x \\ &= \frac{x^2}{2(b-a)}\big\vert_{a}^b\\ &= \frac{b^2 - a^2}{2(b-a)} \end{align} \]
But recall that we can factor
\[ b^2 - a^2 = (b+a)(b-a) \] such that
\[ \mu = \frac{b+a}{2}. \]
Once again, this corresponds to the mean or expected value being the center of mass, i.e., the average of the endpoints.
Noting that \( \mu=\frac{b+a}{2} \), we can similarly obtain the variance as
\[ \begin{align} \sigma^2 &= \int_{-\infty}^\infty (x - \mu)^2 f(x)\mathrm{d}x \\ &= \int_{a}^b \frac{\left(x - \frac{b+a}{2}\right)^2}{b-a}\mathrm{d}x \\ \end{align} \]
Recall, we can make a substitution as \( u = x - \frac{b+a}{2} \) such that \( du=dx \) and the range of integration is adjusted as
\[ \begin{align} u_\text{lower} = a - \frac{b+a}{2} = \frac{a - b}{2}& & u_\text{upper} = b - \frac{b+a}{2} = \frac{b-a}{2} \end{align} \]
This results in
\[ \begin{align} \sigma^2 = \int_{\frac{a-b}{2}}^{\frac{b-a}{2}} \frac{u^2}{b-a}\mathrm{d}u = \frac{u^3}{3(b-a)}\big\vert_\frac{a-b}{2}^\frac{b-a}{2} \end{align} \]
Notice that the cubic term has the property that \( -(a-b)^3 = (b-a)^3 \), such that
\[ \begin{align} \sigma^2 &= \frac{(b-a)^3}{8\times3(b-a)} - \frac{(a-b)^3}{8\times3(b-a)} = \frac{(b-a)^2}{12} \end{align} \]
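As a quick numerical check (illustrative only, with arbitrarily chosen endpoints), both formulas agree with direct integration:

```python
# Numerical check (illustrative only, with arbitrary endpoints) that
# mu = (a + b)/2 and sigma^2 = (b - a)^2/12 agree with direct integration.
from scipy.integrate import quad

a, b = 2.0, 7.0  # invented endpoints for the check

f = lambda x: 1.0 / (b - a)  # continuous uniform density on [a, b]
mu, _ = quad(lambda x: x * f(x), a, b)
var, _ = quad(lambda x: (x - mu) ** 2 * f(x), a, b)

print(mu, (a + b) / 2)         # 4.5  4.5
print(var, (b - a) ** 2 / 12)  # ~2.0833  2.0833...
```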
Finally, we can generally compute the cumulative distribution function of the continuous uniform as
\[ \begin{align} F(x) &= \int_{a}^x \frac{1}{b-a} \mathrm{d}u \\ &= \frac{u}{b-a} \big\vert_{a}^x \\ &= \frac{x - a}{b-a} \end{align} \] where \( x \in[a,b] \).
Therefore, the general form is
\[ \begin{align} F(x) = \begin{cases} 0 & x \in(-\infty, a) \\ \frac{x-a}{b-a} & x \in [a,b]\\ 1 & x \in [b,\infty) \end{cases} \end{align} \]
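The closed form can also be compared against a library implementation; again, the endpoints below are invented for the check:

```python
# Comparing the derived closed form with scipy's implementation; the
# endpoints and evaluation points are invented for the check.
from scipy.stats import uniform

a, b = 2.0, 7.0
X = uniform(loc=a, scale=b - a)

for x in (1.0, 3.5, 7.0, 8.0):
    closed_form = min(max((x - a) / (b - a), 0.0), 1.0)  # clamp to [0, 1]
    print(x, closed_form, X.cdf(x))  # the two values agree at each x
```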
[Figure courtesy of Melikamp, CC, via Wikimedia Commons]
We now consider the second of our common continuous distributions: the normal distribution, which arises from the central limit theorem. Sometimes, however, the role of the central limit theorem is less obvious.
For example, assume that the deviation (or error) in the length of a machined part is the sum of a large number of infinitesimal effects, e.g., small drifts in temperature and humidity, vibrations, and wear in the cutting tool.
If the component errors are independent and equally likely to be positive or negative, the total error can be shown to have an approximate normal distribution.
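A minimal simulation sketch of this argument follows; the number of component errors and their magnitude are invented purely for illustration:

```python
# Simulation sketch of the machined-part story; the number of effects and
# their magnitude are invented purely for illustration.
import numpy as np

rng = np.random.default_rng(42)
n_parts, n_effects = 10_000, 500
eps = 1e-3  # size of each infinitesimal component error (hypothetical)

# each component error is +eps or -eps with equal probability
errors = rng.choice([-eps, eps], size=(n_parts, n_effects)).sum(axis=1)

# the total error is approximately normal: mean ~0, std ~sqrt(n_effects)*eps
print(errors.mean())                           # ~0
print(errors.std(), np.sqrt(n_effects) * eps)  # both ~0.0224
```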
Furthermore, the normal distribution arises in the study of numerous basic physical phenomena.
For example, the physicist James Clerk Maxwell derived a normal distribution from simple assumptions regarding the velocities of gas molecules.
Let the normal random variable \( X \) have mean \( \mu \) and standard deviation \( \sigma \). The probability density function is given as \[ \begin{align} f(x) = \frac{1}{\sqrt{2\pi}\sigma}\exp\left(-\frac{\left(x - \mu\right)^2}{2\sigma^2}\right) \end{align} \]
[Figures courtesy of Montgomery & Runger, Applied Statistics and Probability for Engineers, 7th edition]
A special normal distribution that is commonly used for calculations is the standard normal.
The standard normal random variable \( Z \) has mean \( \mu=0 \) and standard deviation \( \sigma=1 \). The probability density function is therefore given as \[ \begin{align} f(z) = \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{z^2}{2}\right). \end{align} \] We denote the cumulative distribution function of the standard normal \( \Phi(z)=P(Z\leq z) \).
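For concreteness, the standard normal density and \( \Phi(z) \) are available as scipy.stats.norm; a brief sketch of the calls:

```python
# Evaluating the standard normal density and Phi(z) with scipy.stats.norm.
from scipy.stats import norm

print(norm.pdf(0.0))   # 1/sqrt(2*pi) ~ 0.3989, the peak of the density
print(norm.cdf(0.0))   # Phi(0) = 0.5 by symmetry
print(norm.cdf(1.96))  # ~0.975
```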
The primary reason that the standard normal is often used in practice is because of the following result:
Let the normal random variable \( X \) have mean \( \mu \) and standard deviation \( \sigma \). The random variable defined \[ Z = \frac{X - \mu}{\sigma} \] follows the standard normal distribution above.
This says that for any normal random variable \( X \), if we shift the center to zero via \( X - \mu \) and re-scale the spread to one via
\[ Z=\frac{X - \mu}{\sigma}, \]
we can model the population with the easier-to-compute standard normal.
The previous technique is known as standardizing a normal variable.
Because it is simple to standardize a variable, this technique was widely used before computers to calculate normal probabilities.
Specifically, if one has a normal model for the population \( X \) with mean \( \mu \) and standard deviation \( \sigma \), this says,
\[ \begin{align} X \leq x &&\Leftrightarrow && X - \mu \leq x - \mu && \Leftrightarrow && \frac{X - \mu}{\sigma} \leq \frac{x -\mu}{\sigma} && \Leftrightarrow && Z \leq z. \end{align} \]
Therefore,
\[ \begin{align} P(X\leq x) = P(Z\leq z), \end{align} \] and the same correspondence holds for all other ranges.
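A numerical check of this equality (with an invented \( \mu \), \( \sigma \), and cutoff \( x \)):

```python
# Check that standardization preserves probabilities; mu, sigma, and the
# cutoff x are invented for the example.
from scipy.stats import norm

mu, sigma = 10.0, 2.0
x = 12.5
z = (x - mu) / sigma  # standardize the cutoff

print(norm(loc=mu, scale=sigma).cdf(x))  # P(X <= x)
print(norm.cdf(z))                       # P(Z <= z), the identical value
```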
Note that this argument doesn't work for every distribution.
It is a special property of the normal that we can shift the center and scale the spread and the distribution remains normal.
This is one of several very special properties of this model.
Because it is straightforward to standardize normal variables, the standard normal table of “z-scores” is a long-used tool in statistics.
Rather than compute the probability for every normal variable individually, traditional statistics used a table for the standard normal to compute probabilities by standardizing the general normal variable.
Let \( x \) be a measurement of a normal random variable \( X \), which has mean \( \mu \) and standard deviation \( \sigma \). The z-score of the measurement is \[ \begin{align} z=\frac{x-\mu}{\sigma} \end{align} \] which measures how many standard deviations \( \sigma \) the measurement lies from the mean \( \mu \) in the positive or negative direction.
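In the modern approach, the z-table lookup is replaced by a single library call; a hypothetical example:

```python
# A tiny helper (hypothetical, not from the lecture) that computes a z-score
# and its left-tail probability, replacing a printed z-table lookup.
from scipy.stats import norm

def z_score(x, mu, sigma):
    """Number of standard deviations x lies from the mean mu."""
    return (x - mu) / sigma

z = z_score(x=108.0, mu=100.0, sigma=5.0)  # invented measurement
print(z)            # 1.6
print(norm.cdf(z))  # P(Z <= 1.6) ~ 0.9452, the table value for z = 1.6
```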
For our homework and the second midterm, we will practice the traditional approach using z-scores.
However, after the second midterm, we will focus on the modern approach using computer languages to make all such calculations.
[Figures courtesy of Montgomery & Runger, Applied Statistics and Probability for Engineers, 7th edition]