Chapters 4 and 5 Lecture Notes I

Math 140
Graph for Quantitative Variables
• Quantitative data doesn’t exactly form categories the way categorical data did. Instead, we have to impose them. We break the number line at regular intervals to form ‘bins’.
• Bars on a histogram represent the number of values falling into each bin.
• Minitab > Graph > Histogram
– Simple
– Select Variables
• http://www.canyons.edu/faculty/morrowa/140/datasets/
• Plot Histograms for Year and Size in Acres for the 1997-2008 Large Fires (100,000+ fires)
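The binning idea above can also be tried outside Minitab. The following is a minimal Python sketch, assuming made-up values (not the wildfire data) and a hypothetical helper named bin_counts. It counts how many values fall into each equal-width bin and prints a rough sideways histogram.

```python
# Made-up values for illustration only (not the wildfire data set).
values = [3, 7, 8, 12, 13, 14, 18, 21, 22, 29]


def bin_counts(data, start, width, n_bins):
    """Count the values in each bin [start + i*width, start + (i+1)*width)."""
    counts = [0] * n_bins
    for x in data:
        i = int((x - start) // width)  # index of the bin that x falls into
        if 0 <= i < n_bins:            # ignore values outside the bins
            counts[i] += 1
    return counts


counts = bin_counts(values, start=0, width=10, n_bins=3)

# Each row is one bin; each '*' is one value (a sideways histogram).
for i, c in enumerate(counts):
    low = i * 10
    print(f"{low:>2} to <{low + 10:<2} | {'*' * c}")
```

Each value lands in exactly one bin, so the bar heights add up to the number of cases.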
Chapters 4&5 – Quantitative Data I
For fire-specific stats, see FEMA’s Fire Data Analysis Handbook, 2nd Ed.
http://www.usfa.dhs.gov/downloads/pdf/publications/fa-266.pdf
Just Checking…
• What size of wildfire occurs most?
• What year had the most large fires?
Shapes of Distributions: Modes
• A mode is a local high point in the shape of the distribution.
– Unimodal: 1 mode
– Bimodal: 2 modes
– Multimodal: more than 2
• The mode gives us one way to measure the center.
• NOTE: The modes can appear to change as you adjust the bin size.
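The note about bin size can be checked with a small experiment. This is a sketch only, assuming made-up values with two clusters and two hypothetical helpers (bin_counts and count_peaks). A "peak" here is simply a bin taller than both of its neighbours, which is a simplification that does not handle ties.

```python
# Made-up values with two clusters, for illustration only.
values = [1, 2, 2, 3, 3, 3, 4, 4, 5, 11, 12, 12, 13, 13, 13, 14, 14, 15]


def bin_counts(data, width, upper):
    """Count the values in equal-width bins covering [0, upper)."""
    n_bins = upper // width
    counts = [0] * n_bins
    for x in data:
        counts[x // width] += 1
    return counts


def count_peaks(counts):
    """Count bins that are taller than both neighbours (edges compare to 0)."""
    padded = [0] + counts + [0]
    return sum(
        1
        for i in range(1, len(padded) - 1)
        if padded[i] > padded[i - 1] and padded[i] > padded[i + 1]
    )


narrow = bin_counts(values, width=5, upper=20)   # four bins
wide = bin_counts(values, width=20, upper=20)    # one bin

print(narrow, "->", count_peaks(narrow), "peak(s)")
print(wide, "->", count_peaks(wide), "peak(s)")
```

The same data looks bimodal with the narrow bins and unimodal with the single wide bin, so it is worth trying more than one bin size before describing the shape.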
Outliers: Extreme values that don’t appear to belong with the rest of the data
– May be mistakes OR may just be
unusual… No way to tell
– Think/Examine them
Shapes of Distributions: Symmetric?
1. Single or Multiple Modes?
2. Symmetric or Skew?
– Symmetric if the two halves on either side of center are approximately mirror-images
– Skew if it is not symmetric and there is a tail stretching off to one side
• Skew Left: The tail extends to the left
• Skew Right: The tail extends to the right
– Uniform if roughly flat
Shapes of Distributions: Outliers?
1. Single or Multiple Modes?
2. Symmetric or Skew?
3. Any Outliers or Gaps?
Dot Plots
• A dot plot is a great way to quickly make a histogram by hand.
– Lay out a number line, and place a single dot for each case.
• Minitab > Graph > Dot Plot
– Simple
– Indicate which variables to graph
Just checking…
Histogram vs. Bar Chart
• Differences?
• Similarities?
Measuring Center: Median
• The median is the middle value of the data, with half the data above it and half below.
• Minitab > Stat > Basic Statistics > Display Descriptive Statistics
– Under Statistics, select Median
• Total Wildland Fires and Acres (1960-2007)
– http://www.canyons.edu/faculty/morrowa/140/datasets/
– Graph and describe the distribution for Acres
– Find the median
Just checking…
Descriptive Statistics: Acres
Variable  Median
Acres     3960842
• Is the median sensitive to or resistant to outliers?
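A hand calculation of the median can be checked with Python's standard statistics module. The sketch below assumes made-up values (not the Acres data). Its last lines swap one extreme value into the list, which is one way to explore the question above.

```python
import statistics

# Made-up values for illustration only (not the Acres data).
odd_n = [2, 4, 5, 8, 11]          # 5 values: the middle one is the median
even_n = [2, 4, 5, 8, 11, 15]     # 6 values: average the two middle ones

median_odd = statistics.median(odd_n)
median_even = statistics.median(even_n)

# Replace the largest value with an extreme one and recompute.
with_extreme = [2, 4, 5, 8, 1100]
median_extreme = statistics.median(with_extreme)

print("odd n:", median_odd)
print("even n:", median_even)
print("with an extreme value:", median_extreme)
```

Compare the first and last printed values to see how much the extreme value moved the median.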
The Quartiles
• Quartiles divide the data into 4 parts
– The lower quartile (Q1) is the value with ¼ of the data below it.
– The upper quartile (Q3) has ¾ of the data below it.
• Minitab > Stat > Basic Statistics > Display Descriptive Statistics
– Under Statistics, select
• First quartile
• Median
• Third Quartile
• Find the quartiles for Acres
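The quartiles can also be sketched in Python with statistics.quantiles (available in Python 3.8 and later). The values below are made up for illustration and are not the Acres data. Software packages use slightly different interpolation rules for quartiles, so results on real data may differ a little from Minitab's output.

```python
import statistics

# Made-up values for illustration only (not the Acres data), already sorted.
values = [2, 3, 4, 5, 7, 8, 9, 11, 12, 15, 20]

# n=4 asks for the three cut points that split the data into 4 parts.
q1, median, q3 = statistics.quantiles(values, n=4)

print("Q1 =", q1)
print("Median =", median)
print("Q3 =", q3)
```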
Just checking…
Descriptive Statistics: Acres
Variable  Q1       Median   Q3
Acres     2646639  3960842  5185376
Questions
1. What percent of fires are between 2,646,639 and 5,185,376 acres?
2. Which is bigger?
• Percent of fires below 5,185,376.
• Percent of fires above 2,646,639.
Measuring Spread: Range
• The range is the difference between the highest and lowest data values.
• Minitab > Stat > Basic Statistics > Display Descriptive Statistics
– Under Statistics, select
• Range
Descriptive Statistics: Acres
Variable  Range
Acres     8725336
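The range calculation itself is a single subtraction. A minimal Python sketch, assuming made-up values rather than the Acres data:

```python
# Made-up values for illustration only (not the Acres data).
values = [12, 15, 9, 22, 17, 30, 11]

highest = max(values)
lowest = min(values)
data_range = highest - lowest   # range = highest value - lowest value

print("highest =", highest, "lowest =", lowest, "range =", data_range)
```

Only two of the data values enter the calculation: the highest and the lowest.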
Which Has a Bigger Range?
• Is the range resistant to outliers?
• Is the range a reliable way to measure spread of the bulk of the data?
Measuring Spread: IQR
• The interquartile range (IQR) is the difference between the first and third quartiles.
• Minitab > Stat > Basic Statistics > Display Descriptive Statistics
– Under Statistics, select
• Interquartile Range
Descriptive Statistics: Acres
Variable  IQR
Acres     2538737
• Which is more resistant to outliers? Range or IQR?
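The IQR reported for Acres can be reproduced from the Q1 and Q3 values in the earlier Minitab output. The Python sketch below does that subtraction, then compares range and IQR on two made-up lists that differ in one extreme value. The lists and the helper named spread are assumptions for illustration only.

```python
import statistics

# Quartiles reported in the Minitab output for Acres.
q1_acres = 2646639
q3_acres = 5185376
iqr_acres = q3_acres - q1_acres   # IQR = Q3 - Q1
print("IQR for Acres:", iqr_acres)


def spread(data):
    """Return (range, IQR) for a list of numbers."""
    q1, _, q3 = statistics.quantiles(data, n=4)
    return max(data) - min(data), q3 - q1


# Made-up values for illustration only; the second list swaps in one extreme value.
typical = [2, 3, 4, 5, 7, 8, 9, 11, 12, 15, 20]
with_extreme = [2, 3, 4, 5, 7, 8, 9, 11, 12, 15, 200]

print("typical:      range, IQR =", spread(typical))
print("with extreme: range, IQR =", spread(with_extreme))
```

Comparing the two printed pairs shows which measure of spread reacted to the extreme value.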
The 5 Number Summary and Boxplot
• 5 Number Summary includes
– Minimum, Q1, Median, Q3, Maximum
• The boxplot displays the 5 number summary as a central box (Q1, Median, Q3) marked with whiskers that extend to the non-outlying data values. Outliers are usually marked by asterisks.
– Minitab > Graph > Boxplots
• Simple
• Select variables
Short Cut in Minitab
• Stat > Basic Statistics > Graphical Summary
• Provides histogram, 5 number summary, and more
• Be Careful! Output without explanation doesn’t earn points.
• Q: Which is better at showing outliers? The boxplot or histogram?
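The numbers behind a boxplot can be sketched in Python. The values below are made up for illustration. These notes do not say how a value is judged to be an outlier, so the sketch assumes the widely used 1.5 × IQR rule: values more than 1.5 IQRs below Q1 or above Q3 are flagged.

```python
import statistics

# Made-up values for illustration only (not the Acres data).
values = [2, 3, 4, 5, 7, 8, 9, 11, 12, 15, 40]

q1, median, q3 = statistics.quantiles(values, n=4)
five_number = (min(values), q1, median, q3, max(values))
print("Min, Q1, Median, Q3, Max =", five_number)

# ASSUMPTION: the common 1.5 x IQR rule decides which values count as outliers.
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in values if x < lower_fence or x > upper_fence]

# Whiskers stop at the most extreme values that are not outliers.
inside = [x for x in values if lower_fence <= x <= upper_fence]
whiskers = (min(inside), max(inside))

print("Outliers:", outliers)
print("Whiskers reach:", whiskers)
```

The maximum still appears in the 5 number summary even when the boxplot draws it as a separate outlier beyond the whisker.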
Class Work
1. Chapter 4 and 5 Handout I
• Rules for checking answers: No Pens in the Front!!!
• To get credit, it is your responsibility to get checked off.
Recap: Different Data, Different Tools
Categorical
• Data Looks Like?
• Graph
– Bar Chart
• Summary/Description (so far)
– Frequency distribution
– Relative frequency distribution
Quantitative
• Data Looks Like?
• Graph
– Histogram
– Boxplot
• Summary/Description
– Describe shape
• Symmetric/skew/uniform
• Modes
• Outliers/gaps
– Describe center
• Median
• Quartiles
– Describe spread
• IQR = Q3 – Q1
– 5 Number Summary
Homework
• Textbook/Routine Homework
– Due Next Time (25% chance of collection)
1. Read Chapters 4 and 5
2. Pg 78-79 # 1, 5, 7, 9, 11, 23, 45, 51 (only find median, quartiles, IQR, 5-number summary)
3. Pg 116 #37 (a-e)
• Project/Exploration Homework
– None this time