HW 1 categorical displays of data

Homework Assignment #1: Displaying Data
1) How can we help wood surfaces resist weathering, especially when restoring historic wooden
buildings? In a study of this question, researchers prepared wooden panels and then exposed them to
the weather. Here are some variables recorded: type of wood (cedar, pine, oak); type of water repellent
(solvent-based, water-based); paint thickness (millimeters); paint color (white, gray, light blue);
weathering time (months). Identify each variable as categorical or quantitative.
2) The pie chart below summarizes the genres of 120 movies released in 2005.
a) Is this an appropriate display for the genres? If yes, why? If not, why not and what type of
display should have been used?
b) Which genre was least common?
3) The pie chart shows the ratings assigned to 120 movies released in 2005.
a) Is this an appropriate display for the ratings? If yes, why? If not, why not and what type of
display should have been used?
b) Which rating was least common?
4) The Centers for Disease Control and Prevention lists causes of death in the US during 2004:
Cause of Death                     Percent
Heart Disease                      27.2
Cancer                             23.1
Circulatory diseases and stroke    6.3
Respiratory diseases               5.1
Accidents
a) What percent of deaths were from causes not listed here?
b) Create an appropriate display for these data.
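Optional check: for a table of percents like the one in this problem, the share not listed is 100 minus the sum of the listed percents, and a bar chart draws one bar per category. The Python sketch below illustrates both steps under stated assumptions: the category names and percents are hypothetical (they are not the CDC figures), and the helper names remainder_percent and text_bar_chart are made up for this illustration.

```python
def remainder_percent(listed):
    """Percent not accounted for by the listed categories."""
    return round(100 - sum(listed.values()), 1)


def text_bar_chart(percents, width=40):
    """Return one text line per category; bar length is proportional to the percent."""
    lines = []
    for label, pct in percents.items():
        bar = "#" * round(pct / 100 * width)
        lines.append(f"{label:<10} {bar} {pct}%")
    return lines


# Hypothetical percents for illustration only, NOT the values from the table above.
listed = {"Cause A": 40.0, "Cause B": 25.5, "Cause C": 10.0}

# Add an "Other" category so that the bars account for 100% of the cases.
full = dict(listed, Other=remainder_percent(listed))

for line in text_bar_chart(full):
    print(line)
```

To use the sketch on real data, replace the hypothetical dictionary with the listed categories and their percents.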
5) Here’s a table that classifies movies released in 2006 by genre and rating.
            G       PG      PG-13   R       Total
Action      66.7    25      30.4    23.7    29.2
Comedy      33.3    60.0    35.7    10.5    31.7
Drama       0       15.0    14.3    44.7    23.3
Horror      0       0       19.6    21.1    15.8
Total       100     100     100     100     100
a) What percentage of these movies were comedies?
b) What percentage of the PG rated movies were comedies?
c) Which of the following can you learn from this table? Give the answer if you can find it from
the table. If you cannot find it, explain why.
i) The percentage of PG-13 movies that were comedies
ii) The percentage of dramas that were R rated
iii) The percentage of dramas that were G rated
iv) The percentage of 2005 movies that were PG rated
6) Young people are more likely than older folk to buy music online. Here are the percents of people in
several age groups who bought music online in 2006.
Age Group            Bought music online
12 to 17 years       24%
18 to 24 years       21%
25 to 34 years       20%
35 to 44 years       16%
45 to 54 years       10%
55 to 64 years       3%
65 years and over    1%
a) Which type of display is best for this data set? Explain.
b) Create the display described in part (a).
7) Here are data from a survey conducted at eight high schools on smoking among students and their
parents.
                          Neither parent smokes    One parent smokes    Both parents smoke
Student does not smoke    1168                     1823                 1380
Student smokes            188                      416                  400
a) How many students participated in the survey?
b) What percent of the students in the survey smoke?
c) Give the marginal distribution of parents’ smoking behavior both in counts and in percents.
d) Calculate the three conditional distributions of students’ smoking behaviors (one for each of the
parent behaviors) and use these percents to describe the relationship between smoking
behaviors of students and their parents.
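Optional check: a marginal distribution uses the totals of one variable divided by the grand total, while a conditional distribution divides each count by the total of its own column. The Python sketch below shows both calculations on a small table of hypothetical counts (not the survey data above). The function names and the dictionary layout are assumptions made for this illustration.

```python
def marginal_distribution(table):
    """Column totals as counts and as percents of the grand total."""
    columns = list(next(iter(table.values())).keys())
    counts = {c: sum(row[c] for row in table.values()) for c in columns}
    grand_total = sum(counts.values())
    percents = {c: round(100 * n / grand_total, 1) for c, n in counts.items()}
    return counts, percents


def conditional_distributions(table):
    """For each column, the percent of that column's total that falls in each row."""
    columns = list(next(iter(table.values())).keys())
    result = {}
    for c in columns:
        column_total = sum(row[c] for row in table.values())
        result[c] = {
            r: round(100 * row[c] / column_total, 1) for r, row in table.items()
        }
    return result


# Hypothetical counts for illustration only: rows are one variable, columns the other.
table = {
    "Row 1": {"Group X": 30, "Group Y": 10},
    "Row 2": {"Group X": 20, "Group Y": 40},
}

counts, percents = marginal_distribution(table)
print(counts)
print(percents)
print(conditional_distributions(table))
```

If the conditional distributions differ noticeably from one column to the next, that suggests an association between the two variables.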
8) Yellowstone National Park surveyed a random sample of 1526 winter visitors to the park. They asked
each person whether they owned, rented, or had never used a snowmobile. Respondents were also
asked whether they belonged to an environmental organization (like the Sierra Club). The two-way
table summarizes the survey responses.
                     Environmental Clubs
                     NO      YES     Total
Never used           445     212     657
Snowmobile renter    497     77      574
Snowmobile owner     279     16      295
Total                1221    305     1526
Do these data provide convincing evidence of an association between environmental club membership
and snowmobile use for the population of visitors to Yellowstone National Park? Follow the four-step
process.
9) Multiple Choice Practice
The National Survey of Adolescent Health interviewed several thousand teens (grades 7 to 12). One
question asked was “What do you think are the chances you will be married in the next ten years?”
Here is a two-way table of the responses by gender:
                                 Female    Male
Almost no chance                 119       103
Some chance, but probably not    150       171
A 50-50 chance                   447       512
A good chance                    735       710
Almost certain                   1174      756
I) The percent of females among the respondents was
a) 2625
b) 4877
c) about 46%
d) about 54%
e) none of these.
II) Your percent from the previous exercise is part of
a) the marginal distribution of females
b) the marginal distribution of gender
c) the marginal distribution of opinion about marriage
d) the conditional distribution of gender among adolescents with a given opinion
e) the conditional distribution of opinion among adolescents of a given gender
III) What percent of females thought that they were almost certain to be married in the next 10 years?
a) about 16% b) about 24% c) about 40% d) about 45% e) about 61%
IV) Your percent from the previous exercise is part of
a) The marginal distribution of gender
b) The marginal distribution of opinion about marriage
c) The conditional distribution of gender among adolescents with a given opinion.
d) The conditional distribution of opinion among adolescents of a given gender.
e) The conditional distribution of “Almost certain” among females.
10) Find a pie chart or a bar graph of categorical data from a newspaper or a magazine (can be an online
version of the print).
a) Is the graph clearly labeled?
b) Does the graph have strange or unusual scaling to make data look skewed?
c) Do you think the article accompanying the graph correctly interprets the data? Explain.