Summarizing and graphing data part II

02/04/2020

FAIR USE ACT DISCLAIMER:
This site is for educational purposes only. This website may contain copyrighted material, the use of which has not been specifically authorized by the copyright holders. The material is made available on this website to advance teaching, and copyright-protected materials are used to the extent necessary to make this class function in a distance learning environment. Under section 107 of the Copyright Act of 1976, allowance is made for “fair use” for purposes such as criticism, comment, news reporting, teaching, scholarship, education, and research.

Outline

  • The following topics will be covered in this lecture:
    • Additional visual summaries of data
    • Scatter plots
    • Correlation
    • Regression
    • Measures of center

Frequency distributions

Frequency table of IQ scores for sample data set, Table 2-2 from textbook.

Courtesy of Mario Triola, Essentials of Statistics, 5th edition

  • A frequency distribution (or frequency table) shows how data are partitioned among several categories (or classes).
  • We list the categories along with the number (frequency) of data values in each of them.
  • For all the observations in the original data table, we:
    1. Identify which class the observation belongs to.
      • E.g., if we look at an observation with IQ score of 100, this belongs to the class “90-109”.
    2. Tally the number of observations that belong to a class.
      • E.g., we look back over the entire table of raw data and count how many observations belong to the class “90-109”; there are 35.
  • One key concept with frequency distributions is the partition of the data.
  • It is important that all the classes of the data are disjoint and exhaustive.
    • That is to say, every sample value belongs to one and only one class.
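
The two steps above (assign each observation to a class, then tally) can be sketched in Python. This is a minimal illustration, not the textbook's procedure: the scores are hypothetical, and only the class “90-109” comes from the slide; the other class limits are assumed to follow the same width. The `classify` helper (a hypothetical name) also checks the disjoint and exhaustive property by requiring exactly one matching class.

```python
from collections import Counter

# Class limits: only "90-109" appears in the lecture; the rest are assumed
# for illustration and follow the same class width.
CLASSES = [(50, 69), (70, 89), (90, 109), (110, 129), (130, 149)]


def classify(value, classes=CLASSES):
    """Step 1: return the label of the single class that contains value."""
    matches = [f"{lo}-{hi}" for lo, hi in classes if lo <= value <= hi]
    # Disjoint classes give at most one match; exhaustive classes give at least one.
    if len(matches) != 1:
        raise ValueError(
            f"{value} belongs to {len(matches)} classes, expected exactly 1"
        )
    return matches[0]


def frequency_table(data, classes=CLASSES):
    """Step 2: tally how many observations fall in each class."""
    counts = Counter(classify(x, classes) for x in data)
    # Report every class in order, including classes with no observations.
    return {f"{lo}-{hi}": counts.get(f"{lo}-{hi}", 0) for lo, hi in classes}


# Hypothetical IQ scores, NOT the data from Table 2-2.
iq_scores = [100, 85, 92, 71, 109, 115, 64, 88, 96, 131]

table = frequency_table(iq_scores)
for label, freq in table.items():
    print(f"{label:>8}  {freq}")
```

Because every observation lands in exactly one class, the frequencies always sum to the number of observations. A value outside all classes, such as 200 here, raises an error instead of being silently dropped.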
Diagram of class boundaries between class limits.

Courtesy of Mario Triola, Essentials of Statistics, 5th edition

Histograms

Histogram of IQ scores for low lead exposure group.
Frequency table of IQ scores for sample data set, Table 2-2 from textbook.

Courtesy of Mario Triola, Essentials of Statistics, 5th edition

  • Histograms are graphical versions of frequency distributions.
    • These consist of bars of equal width corresponding to the class width of each data class.
    • The horizontal scale is the range of values for the data, separated into the distinct data classes.
    • The heights of the bars are just the frequency of the observations within the class.
    • Therefore, the two summaries of the data are totally equivalent:
      • The widths of the bars and the range of the data are taken from the left column.
      • The heights of the bars are taken from the right column.
    • The vertical axis can also be scaled in relative frequency (percent or proportion units).
    • A histogram is converted to a relative frequency histogram the same way a frequency distribution is converted to a relative frequency distribution: each class frequency is divided by the total number of observations.
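
The equivalence between a frequency table and a histogram, and the change to a relative frequency scale, can be sketched as follows. This is only an illustration under stated assumptions: the frequency table is hypothetical (not Table 2-2), the helper names are made up for this example, and the bars are drawn horizontally as text, so bar length plays the role of bar height.

```python
# Hypothetical frequency table (class label -> frequency), NOT the textbook data.
freq_table = {"50-69": 1, "70-89": 3, "90-109": 4, "110-129": 1, "130-149": 1}


def relative_frequencies(table):
    """Divide each class frequency by the total number of observations."""
    total = sum(table.values())
    return {label: count / total for label, count in table.items()}


def text_histogram(table, symbol="#"):
    """One bar per class: the label comes from the left column of the table,
    the bar length from the right column (the frequency)."""
    return [f"{label:>8} | {symbol * count}" for label, count in table.items()]


# Frequency histogram drawn as text.
for line in text_histogram(freq_table):
    print(line)

# Same classes on a relative frequency scale, shown as percentages.
rel = relative_frequencies(freq_table)
for label, proportion in rel.items():
    print(f"{label:>8}  {proportion:.0%}")
```

Only the vertical scale changes between the two versions: the classes and the shape of the bars stay the same, and the relative frequencies sum to 1 (100%).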

Frequency polygons