Chapter 9: Collecting and entering data

MQ Maths A Yr 11 - 09 Page 325 Thursday, July 5, 2001 9:08 AM
Syllabus reference
Strand: Statistics and probability
Core topic: Data collection and presentation

In this chapter
9A Types of data
9B Collecting data
9C Organising and displaying data using column and sector graphs
9D Graphical methods of misrepresenting data
9E Histograms and frequency polygons
9F Stem-and-leaf plots
9G Five-number summaries and boxplots
Maths Quest Maths A Year 11 for Queensland
Introduction
In Australia, we are fortunate to be able
to drink tap water. This is not the case in
many countries throughout the world
where the residents must consume bottled
water. If that were the case here, we
would be concerned with the price of
bottled water. Since we would require
ongoing supplies, it would be wise to
shop around for brands and sizes that suit
our particular needs.
In this chapter we investigate:
1. methods we could use to collect data
2. how we could prepare and organise the
data
3. ways in which we could display the
data
4. drawing conclusions from the data collected.
1 Distinguish between the terms qualitative and quantitative.
2 Consider the following temperatures (in degrees Celsius).
1, 3, 8, 6.5, −2, 25, 0, −1, 12
a Arrange the data in ascending order.
b Give the smallest, the largest and the middle temperatures.
3 At Central High School there are 117 students in Year 8 and 62 students in Year 12. In
Years 9 and 10, there are 102 and 92 students respectively, while there are 77 students
in Year 11.
a Arrange this data set in a logical table format.
b Draw a pie/sector graph to display the data.
c Represent the data as a column/bar graph.
4 Express the following in the units indicated.
a 500 g for 98c (express in c/g).
b $1.10 for 1.25 L (in $/L).
c 1.25 kg for $9.50 (in c/g).
5 Which is the better buy:
a 375-mL can of drink for $1, or a 1-L bottle of drink for $2.50?
Types of data
Before we collect any data, we need to understand the various types of data we could
gather. The diagram below distinguishes the types.
Data
  Categorical (qualitative)
    Nominal
    Ordinal
  Numerical (quantitative)
    Discrete
    Continuous
Categorical data are observations that fit some qualitative category. Data of this
type do not involve numbers or measurements. If there is no order associated with
the categories formed by the data, then they are termed nominal data (for example,
answers to questions about a student’s hair colour or method of transport used to
travel to school). When the data categories are aligned to some qualitative scale,
they are termed ordinal data. A response to a question on a scale (for example,
strongly disagree to strongly agree) would constitute ordinal data (that is, some
order is implied).
Numerical data, on the other hand, involve quantitative amounts. Discrete data
are responses, observations or records that can take only certain, set values. Examples
of discrete data include the number of children in a family and the
number of marks obtained in a test (even though you can be awarded half marks, the possible marks are still set values).
Continuous data may take any value within the range of the data. Here, we often find
data arising as the result of taking measurements (for example, a person’s height, the
daily temperature).
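The classification described above can be sketched as a short decision procedure. This is only an illustration: the function name `classify_data` and its parameters are assumptions made for this sketch, and the examples are the ones mentioned in the text.

```python
def classify_data(is_numerical, ordered=False, counted=False):
    """Return the data type described in the text for one variable.

    is_numerical: True if the values are quantitative amounts.
    ordered: for categorical data, True if the categories follow a scale.
    counted: for numerical data, True if only certain set values are possible,
             False if any value within a range is possible.
    """
    if not is_numerical:
        return "categorical (ordinal)" if ordered else "categorical (nominal)"
    return "numerical (discrete)" if counted else "numerical (continuous)"


# Examples taken from the paragraphs above.
examples = {
    "hair colour": classify_data(is_numerical=False),
    "strongly disagree to strongly agree": classify_data(is_numerical=False, ordered=True),
    "number of children in a family": classify_data(is_numerical=True, counted=True),
    "a person's height": classify_data(is_numerical=True),
}

for description, data_type in examples.items():
    print(f"{description}: {data_type}")
```

The two questions asked in turn (is it numerical, and then is it ordered or counted) mirror the two levels of the diagram.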
WORKED Example 1
State whether the following pieces of
data are categorical or numerical.
a The value of sales recorded at each
branch of a fast-food outlet
b The breeds of dog that appear at
a dog show
THINK
a The value of sales at each branch can be measured.
b The breeds of dog at a show cannot be measured.
WRITE
a The values of sales are numerical data.
b The breeds of dog are categorical data.
WORKED Example 2
State whether each of the following records
of numerical data is discrete or continuous.
a The number of people in each car that
passes through a tollgate
b The mass of a baby at birth
THINK
a 1 The number of people in the car must be a whole number.
  2 Give a written answer.
b 1 A baby’s mass can be measured to various degrees of accuracy.
  2 Give a written answer.
WRITE
a The data are numerical and discrete.
b The data are numerical and continuous.
remember
1. Data can be classified as either:
(a) categorical — the data are in categories, or
(b) numerical — the data can be either measured or counted.
2. Numerical data can be either:
(a) discrete — the data can take only certain values, usually whole numbers, or
(b) continuous — the data can take any value depending on the degree of
accuracy.
9A Types of data
1 State whether the data collected in each of the following situations would be
categorical or numerical. (See Worked Example 1.)
a The number of matches in each box is counted for a large sample of boxes.
b The sex of respondents to a questionnaire is recorded as either M or F.
c A fisheries inspector records the lengths of 40 cod.
d The occurrence of hot, warm, mild and cool weather for each day in January is
recorded.
e The actual temperature for each day in January is recorded.
f Cinema critics are asked to judge a film by awarding it a rating from one to five stars.
2 State whether the numerical data gathered in each of the following situations are
discrete or continuous. (See Worked Example 2.)
a The heights of 60 tomato plants at a plant nursery
b The number of jelly beans in each of 50 packets
c The time taken for each student in a class of six-year-olds to tie his or her shoelaces
d The petrol consumption rate of a large sample of cars
e The IQ (intelligence quotient) of each student in a class
3 For each of the following, state if the data are categorical or numerical. If numerical,
state if the data are discrete or continuous.
a The number of students in each class at your school
b The teams people support at a football match
c The brands of peanut butter sold at a supermarket
d The heights of people in your class
e The interest rate charged by each bank
f A person’s pulse rate
4 An opinion poll was conducted. A thousand people were given the statement
‘Euthanasia should be legalised’. Each person was offered five responses: strongly
agree, agree, unsure, disagree and strongly disagree. Describe the data type in this
example.
5 A teacher marks her students’ work with a grade A, B, C, D, or E. Describe the data
type used.
6 A teacher marks his students’ work using a mark out of 100. Describe the data type
used.
7 multiple choice
The number of people who are using a particular bus service is counted over a 2-week
period. The data formed by this survey would best be described as:
A categorical data
B numerical and discrete data
C numerical and continuous data
D quantitative data
8 The graph at right shows the number of days of each
weather type for the Gold Coast in January.
Describe the data in this example.
[Column graph: Number of days in January (vertical axis, 0 to 14) against Weather (Hot, Warm, Mild, Cool)]
9 The graph at right shows a girl’s height each year
for 10 years.
Describe the data in this example.
[Line graph: Height (cm) (vertical axis, 100 to 180) against Age (horizontal axis, 5 to 15)]
Collecting data
Data can be collected using a variety
of techniques. Three common methods
are:
1. observation
2. survey
3. experiment
Observation
Let us return to consider collecting data
on bottled water. We could obtain data
by visiting outlets selling water and by
observing the prices charged. Imagine
that on a visit to three stores, a variety of brands,
sizes and prices of bottled water were observed. The table below shows the variety
and cost of 1.5-L bottles of water at the three stores. The prices shown are shelf prices
for one particular week.
Bottled water (1.5 L): shelf prices observed across Corner store, Coles and Woolworths

Brand                Description                        Prices observed
                     Natural spring water               $1.88c
Mount Franklin       Australian spring water            $1.39   $1.35   $1.55
Frantelle            Spring water                       $1.89c  $1.17   $1.99c
Rain Farm            Pure Australian rainwater          $1.14   $1.09   $1.19
Peats Ridge Springs  Natural spring water               $1.99c  $1.20
First Choice         Spring water                       $1.09
Evian                Natural spring water               $2.66
Brim-Brim            Natural spring water               $2.22   $2.67
Tourquay             Natural spring water               $1.19
Savings              Still spring water                 $1.90c
Coles                Natural spring water               $1.29
Bells                Purified natural spring water      $1.19
Schweppes            Cool Ridge still spring water      $1.70   $2.39   $1.14
Farmland             Natural spring water               $1.89c
Summit               Natural spring water               $1.19
Fraser Blue          Natural Australian spring water    $1.79
Aquaqueen            Australian spring water            $1.59
Cottonwood Valley    Natural spring water               $1.39
Snowy Mountain       Natural spring water               $1.03
Take note of:
1. the variety available at each venue
2. the variation in price between different brands (what could cause this?)
3. the brands that are common to the three stores
4. the variation in price for the same brand at the three venues.
Observations such as these would help you decide on a product suitable for your own
personal needs. However, the 1.5-litre bottle may not be appropriate for you. We shall
delve more deeply into this question with an investigation that also examines the
variety of sizes available.
Investigation: Collecting data by observing bottled water prices
In this activity, we attempt to determine the amount of bottled water that you or
your family needs.
1 Visit your local suppliers of bottled water (as many of them as possible). Make
a record of the brands, sizes and costs of all types available at each outlet.
2 Organise your data into a table format (slight modifications to the table shown
previously would be appropriate). These records would be entered as discrete
numerical data. Check to ensure you have not made any obvious recording
errors.
3 Compare the price of bulk buying with that of purchasing several smaller
bottles of equivalent capacity.
4 Now look for a solution to your water requirements. (Consider your needs for a
weekly supply.) Take into account the following points.
(a) Your bottled water supply must provide for all drinking and food
preparation needs.
(b) The water has a long shelf life, so bulk purchases may be more economical
(if you have storage room and funds available).
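A comparison of bulk buying with smaller bottles comes down to a unit price (dollars per litre). The sketch below shows one way this might be calculated. The prices and sizes are hypothetical values invented for illustration, and the function name `price_per_litre` is an assumption.

```python
def price_per_litre(price_dollars, volume_litres):
    """Unit price in dollars per litre."""
    return price_dollars / volume_litres


# Hypothetical shelf prices, invented for illustration only.
single_bottle = price_per_litre(1.20, 1.5)   # one 1.5-L bottle for $1.20
bulk_pack = price_per_litre(6.00, 10.0)      # one 10-L cask for $6.00

cheaper = "bulk pack" if bulk_pack < single_bottle else "single bottle"

print(f"Single bottle: ${single_bottle:.2f}/L")
print(f"Bulk pack:     ${bulk_pack:.2f}/L")
print(f"Cheaper per litre: {cheaper}")
```

Replacing the hypothetical figures with the prices you recorded lets you compare any two sizes on the same basis.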
Reporting your results
Write a formal report of this investigation, detailing:
1. Aim
2. Method of obtaining data
3. Table summarising data
4. Your own personal weekly needs
5. Purchases which would satisfy these needs — specify preferred brands, sizes
and the total cost. Explain your choice of brand and size.
Investigation: More data collection by observation
It is possible to collect a variety of types of data by observation. The following are
some suggestions you may wish to investigate.
1 Count the number of vehicles passing a particular point on a road during a
given time period. (Road counters across roads typically record this
information.)
2 Observe the number of people waiting at a department store for the doors to
open the morning before a sale commences and on the morning on which it
occurs.
3 Record the wildlife in a park.
4 Take note of the number of early morning joggers during the week compared
with the numbers who jog during the weekend.
5 Record the variation in the price of petrol during the week.
There are many situations where data are obtained by observation. In some
situations, such data provide evidence for development of a scheme, concept or
physical resource. This form of data collection is commonly used for planning,
marketing and the preparation of reports.
Surveys
Surveys are the most frequently used method of collecting data. A survey is administered with the aid of a questionnaire. The degree of success in obtaining meaningful
and relevant data from a questionnaire depends largely on the care taken in designing
the questions. Methods used to collect data include:
1. personal interviews, where the interviewer usually asks prepared questions, then
records the respondents’ replies
2. telephone interviews, where the interview is conducted over the telephone
3. self-administered questionnaires, which are usually mailed to individuals who
complete the questionnaire, then return it in a pre-paid envelope or hold it for
collection.
Investigation: Survey methods
This investigation is best undertaken in a small group.
There are advantages and disadvantages of each of the survey methods
mentioned previously. As a group, discuss the various methods, then copy and
complete the table below with as many advantages and disadvantages as you can.
Retain the table for consultation later in designing your own questionnaire.
Type of survey                     Advantages    Disadvantages
Personal interview                 •             •
Telephone interview                •             •
Self-administered questionnaire    •             •
Questionnaires
As mentioned previously, it is important to take care in designing the questionnaire in
order to obtain relevant and reliable data. In formulating the questions, we must also
keep in mind that we require the data so collected to be in a form which is not difficult
to analyse.
The questions posed can be either open or closed.
Open questions are those where the respondent has no guided boundaries within
which to answer. Questions that belong to this class include:
‘Who is your favourite singer?’
‘What is your favourite food?’
The main difficulty with open questions is that they are often difficult to classify and
analyse.
Closed questions are of the type where the respondent must answer within a
category. The question about food (above) could be rephrased as:
‘Which of the following foods do you prefer most?
☐ meat
☐ seafood
☐ poultry
☐ vegetables
☐ fruit’
Analysis of these answers would be easier than trying to fit the open-ended answers
into a category.
It must be noted that options such as:
☐ none of the above or
☐ don’t know
are to be avoided, if possible, as they provide the respondent with a non-compliance
exit.
WORKED Example 3
Twenty people answered the open question:
‘What do you particularly dislike about the way the news is presented on TV?’.
They provided the following responses.
1. The ads interrupt too often.
2. Have to wait too long for the headline items.
3. The violent scenes are too graphic.
4. It was better when it was only half an hour instead of a full hour.
5. There’s too much violence.
6. The reporters are politically biased.
7. The accident scenes are not sensitively handled.
8. Some of the reports are too long.
9. It’s all about politics.
10. It mixes up local, interstate and overseas news.
11. I’d like it to be shorter.
12. There are more advertisements than news.
13. It seems to concentrate on murder and death.
14. I find it far too long.
15. The newsreaders don’t pronounce names correctly.
16. It’s too informal.
17. The newsreader is far too old.
18. I don’t like ads interrupting the news.
19. Some reports show too much blood and gore.
20. The interviews by the reporters are too long.
Classify the responses into categories to identify the main reasons given.
THINK
1 Look for 4 or 5 categories under which they could be classified.
2 Classify the responses under these headings.
3 Identify those not classified.
4 Identify the main reasons.
WRITE
1 Four main categories are apparent:
  1. Length
  2. Violence
  3. Advertisements
  4. Newsreader
2 Length: Responses No. 4, 8, 11, 14, 20
  Violence: Responses No. 3, 5, 7, 13, 19
  Advertisements: Responses No. 1, 12, 18
  Newsreader: Responses No. 15, 16, 17
3 Those not classified are:
  No. 2. Too long for headline items
  No. 6. Reporters politically biased
  No. 9. All about politics
  No. 10. Mixes up news
4 The main reasons for people disliking the TV news presentations seem to be centred
  around the length of the news and the violence portrayed.
Note: It is possible to classify these responses into different categories. No single way is
correct. In practice, the responses from open-ended questions are generally classified
several ways until the researcher is satisfied with the classifications.
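Once responses have been assigned to categories, the counting can be checked mechanically. The following sketch records the classification from Worked Example 3 and counts each category. The variable names are assumptions, and the sketch does not do the classifying itself, which remains a matter of judgement.

```python
# Categories and response numbers as classified in Worked Example 3.
categories = {
    "Length": [4, 8, 11, 14, 20],
    "Violence": [3, 5, 7, 13, 19],
    "Advertisements": [1, 12, 18],
    "Newsreader": [15, 16, 17],
}

# Number of responses in each category.
counts = {name: len(numbers) for name, numbers in categories.items()}

# Responses 1 to 20 that were not placed in any category.
classified = {n for numbers in categories.values() for n in numbers}
unclassified = sorted(set(range(1, 21)) - classified)

# The categories with the largest count are the main reasons.
largest = max(counts.values())
main_reasons = [name for name, count in counts.items() if count == largest]

print(counts)
print("Not classified:", unclassified)
print("Main reasons:", main_reasons)
```

A different choice of categories would change the counts, so the same check can be rerun each time the responses are reclassified.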
Preparing a questionnaire
In preparing good questions, it is advisable to keep the following points in mind:
1. The questions should flow smoothly from one to the next.
2. Introductory remarks should be included outlining the aim and purpose of the
questionnaire, along with any necessary instructions.
3. Jargon, slang and abbreviations should be avoided.
4. Do not ask questions which are vague or ambiguous.
5. Avoid bias and emotional language.
6. Avoid double-barrelled questions.
7. Do not pose leading questions (that is, those that lead to an expected response).
8. Make sure your questions are capable of being answered by your respondents.
9. Avoid questions with double negatives.
10. At the conclusion, thank the respondent for answering.
Investigation: Good and bad question writing
As an aid to good questionnaire writing, it is helpful to practise writing questions,
then do a pilot test to determine whether the questions are clear.
1 In small groups, discuss each of the points in the list above to gain a
comprehensive understanding of good question writing.
2 Choose a topic that interests you.
3 Illustrate each of the points mentioned above by designing questions that do not
conform to the guidelines of ‘good question writing’.
4 Rewrite the questions in a clear format.
5 Conduct a pilot test of both sets of your questions on the members of your
group.
Investigation: Survey of bottled water
With an understanding of good question writing, we can now put these techniques
into practice by conducting a classroom survey.
Let us consider the bottled-water situation we encountered earlier in the chapter.
1. The aim of this survey is to discover:
(a) whether people prefer bottled water to tap water
(b) if purchasing bottled water, their brand preference
(c) how much bottled water they would drink in a week
(d) what they really know about bottled water.
2. To encourage your respondents to take care in completing your survey, it
should have variety and be interesting. Not all of your questions should be of the
same style. Be creative in your question construction. Questions such as:
‘Draw a mouth on each face to show how much you like the types of water’
[Example faces labelled: Love it, Nice, OK, Not nice, Horrible; blank faces to complete: Tap water, Bottled water]
create interest in your survey and encourage the respondents to answer with
care. Also consider styles like ‘circle the answer’ and ‘tick the box’.
1 Design your questions to fulfil the aims of the survey. Decide on the variables
to be recorded. Are they categorical, numerical, discrete or continuous?
2 Make copies of your questionnaire to be answered by other members of your
class; alternatively, draw up a table to record the answers if you wish to use the
interview technique; then conduct your survey.
3 Critically appraise your questionnaire’s effectiveness after its administration.
Did your questionnaire achieve the aims of the survey? Did your respondents
find it interesting? Did you feel that the data obtained were reliable? Should
you disregard any data?
4 Retain these results for later when we will look at the variety of methods we
can use to display data.
Experiments
Collecting data by conducting experiments is commonly used in the sciences and
related fields. In order to draw reliable conclusions from experiments, the data must be
able to be reproduced. The conclusions from one experiment are frequently not sufficient evidence on which to base a report.
Let us return to consider the bottled-water observation discussed earlier. Much variation is evident among the prices of the 1.5-litre bottles of water. An assumption would
be that the more expensive bottles contained better-quality water. An experiment could
be devised to determine whether consumers can detect differences in the quality of the
water samples with a relatively simple taste test.
Investigation: Bottled water taste test
The aim of this experiment is to discover whether people can distinguish, by taste,
1. between different samples of bottled water (Test 1)
2. tap water from bottled water (Test 2).
This experiment would be best performed as a whole class activity.
1 Prepare three jugs: one containing tap water, one with a cheap variety of bottled
water and the third with an expensive variety of bottled water. Provide small
cups for each jug.
Because the samples may look different (for example, tap water may be slightly
coloured), this experiment would be best conducted with the participants
blindfolded.
2 For Test 1, blindfolded students will each be given a sip of a cheap variety and
a sip of an expensive variety of the bottled water. The test is to see whether the
taster can identify which is the cheaper variety and which the dearer.
3 Test 2 is conducted similarly, using bottled water and tap water.
4 Prepare a result sheet (as below).
Student    Test 1 (✓ or ✗)    Test 2 (✓ or ✗)
a Record the taste tests.
b Are any conclusions obvious at this stage?
c Identify any problems with the tests.
5 Retain the results of this experiment for use later, when we will consider ways
of displaying data.
We must not forget that the results of this taste test could not be published
as conclusive evidence of the ability of consumers generally to be able (or
unable) to determine the quality of water. The results obtained by your class
would need to be confirmed by similar results from tests performed on many
other groups.
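A result sheet like the one described can be summarised as the proportion of tasters who were correct on each test. The sketch below is one possible way to do this. The student labels and results are hypothetical, invented only to show the calculation, and the function name `proportion_correct` is an assumption.

```python
# Hypothetical result sheet: True means the taster identified the samples
# correctly (a tick), False means they did not (a cross).
results = {
    "Student A": {"Test 1": True, "Test 2": True},
    "Student B": {"Test 1": False, "Test 2": True},
    "Student C": {"Test 1": False, "Test 2": False},
    "Student D": {"Test 1": True, "Test 2": True},
}


def proportion_correct(results, test):
    """Fraction of tasters who were correct on the named test."""
    outcomes = [sheet[test] for sheet in results.values()]
    return sum(outcomes) / len(outcomes)


for test in ("Test 1", "Test 2"):
    print(f"{test}: {proportion_correct(results, test):.0%} correct")
```

Because each test offers only two samples, a taster who is purely guessing would be expected to be correct about half the time, which is worth keeping in mind when looking for conclusions.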
Investigation: Gathering data from the World Wide Web (www)
The Web is an excellent resource for data collection. We will spend some time now
investigating web sites that provide interesting and relevant data for use later on.
1 A census is conducted on the Australian population every five years. The
Australian Bureau of Statistics (ABS) has published the results of the 1996
census with data available for viewing on their web site. It takes some time for
the data from a census to be collated and analysed, so the results of the next
census (2001) may not be available until some years afterwards. Browse the
site, noting the data available. Take a note of the web address for future
reference.
2 The Bureau of Meteorology publishes climatic data for numerous towns in
Queensland. Locate its web site and investigate the range of data displayed.
Take particular note of the data recorded for the city or town closest to where
you live. Record the site address for future use.
3 What is a ‘gallup poll’? This famous poll is named after its founder, the
American statistician, George Gallup, who was born in 1901. Another well-known poll is the Morgan Gallup Poll which publishes statistics on a variety of
topical issues throughout the world. Search the Web for information on gallup
polls. Record your findings in the form of a poster and present the results of
your search to the class.
remember
1. Some important methods of collecting data are observations, surveys and
experiments.
2. All data collection requires careful planning with regard to the variables
recorded in order to maintain the quality of the data collected.
3. A survey may be conducted personally, via the telephone or it may be self-administered. Each of these methods has advantages and disadvantages.
4. Care must be exercised in constructing a questionnaire so that the responses
collected are of good quality.
5. The questions may be of an open or closed format.
6. Open questions must eventually be placed in categories before proceeding to
the stage of presentation.
7. The data obtained from experiments must be capable of being reproduced
before generalisations can be made.
8. In all cases of data collection, the quality and reliability of the data must be
examined before processing.
9B Collecting data
1 Explain what you understand by the terms ‘open’ and ‘closed’ questions. Give an
example of an open question. Rewrite your question in a closed format.
2 Twenty students were asked their opinions about the cause of congestion at the school’s
front gate. Analyse their responses below, suggest categories into which they could be
classified and identify the most commonly stated reasons for the congestion. (See Worked Example 3.)
1. The cars shouldn’t come up the front driveway.
2. The front entrance is too small.
3. There should be another entrance.
4. The buses are the problem.
5. The students get in the way of the cars.
6. Bike riders should have a separate entrance.
7. The Senior school and the Junior school should start and finish at different times.
8. The cars block the gate.
9. The bike riders don’t know the road rules.
10. The buses should stop further down the road.
11. Too many students.
12. Parents don’t care where they park.
13. The front gates should be wider.
14. Bike riders should go out the back gate.
15. Kids block the cars.
16. Kids just sit around talking there.
17. There should be a traffic control officer there to direct the traffic.
18. Students should not just sit around there.
19. The buses all arrive at the same time.
20. The road is too narrow.
3 Thirty students were asked: ‘Identify one thing in your maths course which you particularly don’t like’. Classify the responses below into appropriate categories, then
identify the main reasons.
1. It’s too hard.
2. There’s too much homework.
3. I can’t understand the teacher.
4. It’s boring.
5. The boys are too distracting.
6. The teacher doesn’t like girls.
7. It’s too much work.
8. We get homework every night.
9. I can’t understand it.
10. The boys show off.
11. The boys always get better marks.
12. I can’t concentrate for that length of time.
13. We do something new every lesson.
14. I don’t like working in groups.
15. The teacher expects too much.
16. Our teacher is too strict.
17. The teacher doesn’t help us enough with our problems.
18. There’s too much work to cover.
19. We’re expected to do assignments.
20. I don’t like doing presentations to the class.
21. Our class is too big.
22. The work is not interesting.
23. I just don’t understand maths.
24. I don’t like the teacher.
25. The teacher expects too much of us.
26. There’s too much work to cover.
27. We’re expected to remember too much.
28. It won’t help me later in life.
29. The teacher picks on me because I don’t understand the work.
30. The course is not relevant.
4 Identify the areas of concern in the following questions, then rewrite each so that the
meaning is clear and understandable.
(a) How much do you earn?
(b) Do you exercise regularly?
(c) Is the GST in Australia less than the VAT in England?
(d) Do you generally support the causes of murderous terrorists who threaten the lives
of peace-loving people?
(e) Do you support the Prime Minister’s policy on wildlife preservation?
(f) What is your height in inches?
(g) Did you buy your sneakers for comfort and quality?
(h) You don’t agree with charging more for skim milk (where they’ve taken out the
cream) than for full cream milk, do you?
(i) Do you agree that we should do more for our ‘diggers’ who risked their lives
during the war so that we could be free?
5 Write the following open questions in closed format.
(a) What is your age?
(b) How much pocket money do you get each week?
(c) How do you travel to school?
(d) What type of destination do you prefer for a holiday?
Data preparation
Having considered so far the collection of data by observation, survey and experimental
methods, we must now reflect on the techniques at our disposal for treating these data.
However, before we rush into doing calculations, compiling tables and drawing graphs,
we must first carefully examine the data for anomalies.
We must decide how to treat non-compliant responses from a respondent. Sometimes only the non-compliant responses are disregarded and the respondent’s remaining, compliant responses are counted. At other times, the whole survey from that respondent is disregarded. Alternatively, we could cope with the dilemma (after the survey has been administered) by
providing another category in the question to absorb all those responses which do not
slot into the categories given.
Data must also be checked for recording errors, which commonly occur. Recording
errors should not be included as part of the data as they will distort calculations. Where
possible, when errors are detected, every effort should be made to correct the data
(make new observations, contact the respondent etc.). If for any reason the errors
cannot be rectified, that record should be excluded.
In the business world, databases are often created to allow responses for each question only within defined constraints (a database may not allow a number entry where a
word was expected, for instance). This reduces the chance of recording invalid data, but
would not stop you entering your age as 66 instead of 16, so careful scrutiny and
checking of data are essential before the results are displayed and analysed.
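The kind of constraint a database might apply can be sketched as a simple check on each entry. This is an illustration only: the function name `check_age`, the sample entries and the expected range of 11 to 19 (chosen here for a school survey) are all assumptions.

```python
def check_age(entry, low=11, high=19):
    """Return a list of problems found with one recorded age.

    The range 11 to 19 is an assumed plausible range for a school survey.
    """
    problems = []
    if not isinstance(entry, int):
        problems.append("not a whole number")
    elif not low <= entry <= high:
        problems.append(f"outside expected range {low}-{high}")
    return problems


# Hypothetical recorded entries, invented for illustration only.
recorded_ages = [16, 17, "sixteen", 66, 15]

for age in recorded_ages:
    issues = check_age(age)
    if issues:
        print(f"Check entry {age!r}: {', '.join(issues)}")
```

A range check of this kind flags an entry such as 66 for review, but it cannot detect an age of 17 recorded as 16, so the data still need to be scrutinised by hand.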
Organising and displaying data
Organising data
Once data have been collected and checked for errors they need to be put into an organised form. This involves tallying the responses to a questionnaire, accurately recording
your observations or tabulating the results of your research.
This task is made easier if the questionnaire is designed with ease of tabulation in
mind. Usually the results are first organised into a table and the number of responses in
each category recorded. This is often done with tally marks and using the gatepost
method.
WORKED Example 4
A survey is conducted among 24 students who were asked to name their favourite
spectator sport. Their responses are recorded below.
AFL
Cricket
Cricket
Soccer
Rugby League
Cricket
Tennis
Cricket
AFL
Rugby League
AFL
AFL
Rugby Union
Soccer
Netball
Basketball
Basketball
Netball
AFL
Cricket
Cricket
AFL
Rugby League
Cricket
Put these results into a table.
THINK
Draw a table and beside each sport put a tally mark for each response. Every fifth
tally mark becomes a gatepost.
WRITE
Sport           Tally      Frequency
AFL             |||| |     6
Basketball      ||         2
Cricket         |||| ||    7
Netball         ||         2
Rugby League    |||        3
Rugby Union     |          1
Soccer          ||         2
Tennis          |          1
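The tallying in Worked Example 4 can be checked with a short program. The sketch below uses Python's `collections.Counter` to count the 24 responses listed above. The variable names are assumptions.

```python
from collections import Counter

# The 24 responses from Worked Example 4, in the order listed.
responses = [
    "AFL", "Cricket", "Cricket", "Soccer", "Rugby League", "Cricket",
    "Tennis", "Cricket", "AFL", "Rugby League", "AFL", "AFL",
    "Rugby Union", "Soccer", "Netball", "Basketball", "Basketball", "Netball",
    "AFL", "Cricket", "Cricket", "AFL", "Rugby League", "Cricket",
]

# Counter records how many times each sport appears.
frequency = Counter(responses)

for sport in sorted(frequency):
    print(f"{sport:<14}{frequency[sport]:>3}")

print(f"{'Total':<14}{sum(frequency.values()):>3}")
```

Checking that the frequencies add to the number of people surveyed is a quick way of detecting a missed or double-counted response.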
For simplicity, numerical data may be tabulated in groups.
WORKED Example 5
A Year 11 class was surveyed on their weekly income. The responses are shown below.
$75
$115
$60
$54
$88
$0
$98
$102
$56
$45
$83
$71
$40
$37
$87
$117
$43
$79
$58
$89
$70
$105
$99
$55
Complete the table below.
Income ($)   Tally   Frequency
0–20
21–40
41–60
61–80
81–100
101–120
THINK
Count the number of responses within each category and put a tally mark in the
column.
WRITE
Income ($)   Tally      Frequency
0–20         |          1
21–40        ||         2
41–60        |||| ||    7
61–80        ||||       4
81–100       |||| |     6
101–120      ||||       4
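Grouping can also be checked by computer. The sketch below counts the incomes from this example in each class; the function name `group_frequencies` is an assumption for the illustration, and each class is treated as including both of its end values.

```python
incomes = [75, 115, 60, 54, 88, 0, 98, 102, 56, 45, 83, 71,
           40, 37, 87, 117, 43, 79, 58, 89, 70, 105, 99, 55]

classes = [(0, 20), (21, 40), (41, 60), (61, 80), (81, 100), (101, 120)]

def group_frequencies(values, classes):
    """Count how many values fall in each inclusive (low, high) class."""
    return {(low, high): sum(low <= v <= high for v in values)
            for low, high in classes}

table = group_frequencies(incomes, classes)
for (low, high), count in table.items():
    print(f"{low}-{high}: {count}")
```

Because the classes do not overlap and cover every response, the counts add to the 24 responses collected.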
Displaying data
The most common way of displaying data is by using a graph. Different graphs have
different purposes. We will now look briefly at column graphs and sector graphs, then
look at histograms, stem plots and boxplots.
Column graphs
A column graph (or bar graph) is used when we wish to show a quantity. Categories
are written on the horizontal axis and frequencies on the vertical axis.
WORKED Example 6
The table below shows the results of the survey
on favourite sports.
Sport          Frequency
AFL            6
Basketball     2
Cricket        7
Netball        2
Rugby League   3
Rugby Union    1
Soccer         2
Tennis         1
Show this information in a column graph.
THINK
1 Draw the horizontal axis showing each sport.
2 Draw a vertical axis to show frequencies up to 7.
3 Draw the columns all the same width with gaps between.
4 Use a ruler.
5 Label the axes.
6 Give the graph a title.
WRITE
[Column graph: 'Favourite sports of 24 students'. Horizontal axis: Sport (AFL,
Basketball, Cricket, Netball, Rugby League, Rugby Union, Soccer, Tennis).
Vertical axis: Frequency, 0 to 7.]
Sector graphs
A sector graph (circle graph, or pie
graph) is used when we want the
graph to display a comparison of
quantities. An angle is drawn at the
centre of the circle that is the same
fraction of 360° as the fraction of
people making each response.
WORKED Example 7
For the table in worked example 6, draw a sector graph.
THINK
1 Calculate each angle as a fraction of 360°.
2 Draw the graph.
3 Label each sector or provide a legend.
WRITE
AFL = 6/24 × 360° = 90°
Basketball = 2/24 × 360° = 30°
Cricket = 7/24 × 360° = 105°
Netball = 2/24 × 360° = 30°
Rugby League = 3/24 × 360° = 45°
Rugby Union = 1/24 × 360° = 15°
Soccer = 2/24 × 360° = 30°
Tennis = 1/24 × 360° = 15°
[Sector graph with legend: Sport (AFL, Basketball, Cricket, Netball, Rugby
League, Rugby Union, Soccer, Tennis).]
These graphs can also be drawn using a spreadsheet and the charting tool.
In our next investigation we shall explore how to enter data into a spreadsheet and
how to display the results in a variety of graphical forms.
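Before turning to a spreadsheet, the angle calculations of worked example 7 can be checked with a short Python sketch. The function name `sector_angles` is an assumption for this illustration; the frequencies are those from the table in worked example 6.

```python
frequencies = {"AFL": 6, "Basketball": 2, "Cricket": 7, "Netball": 2,
               "Rugby League": 3, "Rugby Union": 1, "Soccer": 2, "Tennis": 1}

def sector_angles(freqs):
    """Angle of each sector: the category's share of the total, out of 360 degrees."""
    total = sum(freqs.values())
    return {category: f * 360 / total for category, f in freqs.items()}

angles = sector_angles(frequencies)
for category, angle in angles.items():
    print(f"{category}: {angle} degrees")
```

The angles should add to 360°, which is a useful check before drawing the graph.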
remember
When data are collected they are usually first organised into table form.
Data can be easily counted using a tally column and the gatepost method.
Sometimes numerical data are better organised into categories.
A column graph is drawn when we want to display quantities.
A sector graph is drawn when we want to compare quantities.
9C Organising and displaying data using column and sector graphs
1 A class of students was asked to identify the make of car their family owned. Their
responses are shown below. (See worked example 4.)
Holden   Ford     Nissan   Mazda    Mitsubishi   Ford
Holden   Holden   Toyota   Toyota   Nissan       Ford
Holden   Ford     Holden   Mazda    Mitsubishi   Ford
Holden   Ford     Toyota   Toyota   Toyota       Holden
Ford     Holden   Toyota   Mazda    Ford         Toyota
Put these results into a table.
2 The results of a spelling test done by 30 students are shown below.
6  7  6  8   4  6  6  7  5  9
5  7  8  10  5  9  7  7  7  6
4  7  8  8   7  8  6  5  9  7
Put these results into a table.
3 The marks scored on a Maths exam, out of 100, by 25 Year 11 students are shown
below. (See worked example 5.)
87  44  95  66  78  69  54  60  66  69
66  77  71  83  74  81  69  70  66  79
57  92  66  78  71
Copy and complete the table below.
Mark     Tally   Frequency
40–49
50–59
60–69
70–79
80–89
90–99
4 The data below show the number of customers that entered a shop each day in a
certain month.
114  195  175  163  178  216  200  147
169  185  173  164  141  155  132  143
180  168  130  190  120  173  119  179
204  102  158  200  199  150  163
Choose suitable groupings to tabulate these data.
5 Draw a column graph to display the data from question 1. (See worked example 6.)
6 Draw a sector graph to display the data from question 1. (See worked example 7.)
7 Draw a column graph to display the data from question 2.
8 Draw a column graph to display the data from question 3.
9 Draw a column graph to display the data from question 4.
10 Draw a sector graph to compare the number of people in each category from
question 3.
1
For questions 1–4, state if the data are quantitative or qualitative. If they are quantitative,
also state whether they are continuous or discrete.
1 Customers in a video shop vote for their favourite movie.
2 Customers in a video shop have records kept on the number of movies they hire each
year.
3 The video shop keeps records of the number of times each movie has been hired.
4 The video shop keeps records of the length of each movie.
The bar chart shows the marital status of respondents to a survey.
[Bar chart: Frequency (axis marked 5, 10, 15) against marital status (Never
married, Married, Divorced, Separated, Widowed).]
5 How many people responded to the survey?
6 What was the most common marital status?
7 How many people were married?
8 How many respondents were either divorced or separated?
9 How many people had been married at some time?
10 Draw a pie/sector graph of the data.
Investigation
Spreadsheets — Displaying
numerical data
Worked examples 6 and 7 explained the calculations needed to display manually
the same numerical data as a column graph and as a sector (pie) graph. In this
activity, we look at entering the data into a spreadsheet, then exploring some of the
graphical tools available. The instructions provided refer to the Excel spreadsheet.
If you are using a different spreadsheet, your teacher will give you the equivalent
commands.
1 Enter the heading ‘Sport’ and the categories of sport in Column A as shown
above.
2 Enter the heading ‘Frequency’ and the corresponding frequencies in Column B as
shown.
• Highlight the sporting classifications and the frequencies, enter the Chart
Wizard and select the Column graph option. Follow the instructions,
remembering to label the axes and title the graph, to produce the column
graph shown above.
• Select the data again and follow through the Chart Wizard to produce the
pie graph shown.
• Print out a copy of your table and graphs.
Investigation
Spreadsheets — Displaying
categorical coded data
The previous investigation dealt with entering and graphing numerical data.
Frequently, data are collected in categories that are each given a code. Examine the
shopping survey shown below. It represents a section of the ‘Australian Shoppers
Survey’ distributed by PMP Data Based Marketing to households in Australia.
2. SHOPPING
1. How many litres of soy milk do you consume per week?
   1–2 litres 1    3–4 litres 2    4 litres + 3
2. Which of the following beverages would you consider having home delivered?
   Juice 1    Water 2    Soft Drink 3
3. How often do you have your hair coloured at a salon?
   More than once per 6 weeks 1    Once every 6 weeks 2    Every 3 months 3
   Every 6 months 4    Once a year 5    Never 6
4. How often do you purchase shampoo?
   Once a week 1    Once a fortnight 2    Once a month 3    Once every 2 mths 4
5. What type of washing machine do you own/use?
   Top Loader 1    Front Loader 3
6. What brands of laundry detergent do you regularly use?
   Once a week 01    Cold Power 02    Cold Powermatic 03    Drive 04
   Dynamo 05    Dynamomatic 06    Omo 07    Omomatic 08
   Radiant 09    Surf 10    Other 11
7. Which of the following is your main toy store?
   Big W 01    dstore 02    Grace Bros. 03    K-mart 04
   Mr Toys 05    Myer 06    Target 07    Toy Kingdom 08
   Toys ‘R’ Us 09    toyspot.com 10    Toyworld 11
8. How many dogs do you have in your home?
   None 1    One 2    Two 3    3 or more 4
9. How many cats do you have in your home?
   None 1    One 2    Two 3    3 or more 4
10. Do you treat your pets for any of the following?
   Fleas/Ticks 1    Worms 2
11. Do you or your partner smoke?
   Yes 1    No 2
12. Please sign that you are over 18 and a smoker:
   Your signature _______________________________________________________________
   Partner’s signature ___________________________________________________________
13. Have you purchased or would you consider purchasing any of the following
   by mail or telephone?
                                  Have Bought     Considering
                                  You    Ptnr     You    Ptnr
   Books                          01     23       12     34
   Computer Equipment             02     24       13     35
   Computer Software              03     25       14     36
   Cosmetics                      04     26       15     37
   Craft Products                 05     27       16     38
   Fashion Wear                   06     28       17     39
   Kitchen Products               07     29       18     40
   Music (Tapes, CDs, Records)    08     30       19     41
   Videos                         09     31       20     42
   Vitamins/Health Supplements    10     32       21     43
   Wine                           11     33       22     44
14. Have you/your partner bought or searched for goods to buy over the internet?
   You 1    Your partner 2
Thinking about buying beauty care products which are new to the market,
please indicate whether you agree or disagree with the following statements:
15. In general, I am among the first to buy new beauty care products when they
   appear on the market.
   Strongly Agree 1    Somewhat Agree 2    Neither Agree or Disagree 3
   Somewhat Disagree 4    Strongly Disagree 5
16. I enjoy taking chances in buying new beauty care products.
   Strongly Agree 1    Somewhat Agree 2    Neither Agree or Disagree 3
   Somewhat Disagree 4    Strongly Disagree 5
Notice that:
1. all the questions are of the closed type
2. the response categories for the questions each have a code associated with them
(this enables them to be entered into a computer database or spreadsheet for
analysis)
3. the ordinal data in questions 15 and 16 are coded.
Let us take a small number of responses to question 15 and see how the data might
be treated.
1 Open a spreadsheet, head Column A with ‘Respondent’ and enter the
respondents’ names as shown below.
2 Head Column B as ‘Code’, then enter the coded responses shown beneath the
heading.
3 Columns A and B represent the raw data. We will collate these records on the
spreadsheet to the right of these columns.
4 Head Column D with the word ‘Code’, then enter the coded categories 1 to 5
beneath.
5 In Column E, enter the meanings of the coded categories beside their
respective code (this just makes the spreadsheet and graphs more meaningful).
6 Head Column F with the word ‘Number’. In this column we are going to
count the number of 1’s, 2’s etc. which occur in all the responses in Column
B. The formula that enables us to do this is the COUNTIF command. Its
format is =COUNTIF(range,criteria).
7 To count the number of 1’s in the range B3 to B14, the formula would be
=COUNTIF(B3:B14,1). Enter this formula in Cell F3.
8 To count the number of 2’s, in Cell F4 enter the formula
=COUNTIF(B3:B14,2).
9 Complete Cells F5, F6 and F7 with similar formulas to count the number of
responses coded 3, 4 and 5 in Column B.
10 Use the Chart Wizard with Columns E and F to display graphically the
responses.
11 Print out a copy of your spreadsheet and graph/s.
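For comparison, the counting performed by COUNTIF can be sketched in Python. The twelve coded responses below are hypothetical (the spreadsheet's actual entries are not reproduced here), and the function name `countif` is chosen only to mirror the spreadsheet command.

```python
# Hypothetical coded responses to question 15, standing in for cells B3:B14.
codes = [2, 1, 3, 2, 5, 4, 2, 3, 1, 2, 4, 3]

meanings = {
    1: "Strongly Agree",
    2: "Somewhat Agree",
    3: "Neither Agree or Disagree",
    4: "Somewhat Disagree",
    5: "Strongly Disagree",
}

def countif(values, criterion):
    """Count the entries equal to criterion, like =COUNTIF(range,criteria)."""
    return sum(1 for v in values if v == criterion)

counts = {code: countif(codes, code) for code in meanings}
for code, number in counts.items():
    print(code, meanings[code], number)
```

As in the spreadsheet, the counts should add to the number of respondents.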
Investigation
Spreadsheets — Displaying
data from the Web
In previous web searches you have noted web sites with interesting and relevant
data. Select one of these sites and obtain the necessary data.
1 Enter the data into a spreadsheet. Remember to include meaningful headings to
enhance the quality of your display.
2 Present your data in both table and graphical form (don’t forget to label axes
and include a title).
3 Print out a copy of your presentation and arrange it in a poster format.
4 Examine your graph/s and write a summary of the data.
5 Present your findings to the class.
Methods of misrepresenting data
Many people have reasons for misrepresenting data: politicians may wish to magnify
the progress achieved during their term, or business people may wish to accentuate
their reported profits. There are numerous ways of misrepresenting data. In this section,
only graphical methods of misrepresentation are considered.
Vertical axis and horizontal axis
It is a truism that the steeper the graph the better the growth appears. A ‘rule of thumb’
for statisticians is that, for the sake of appearances, the vertical axis should be
two-thirds to three-quarters the length of the horizontal axis. This rule was established
in order to have some comparability between graphs.
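The rule of thumb amounts to a simple ratio test, sketched below in Python. The function name `follows_rule_of_thumb` and the sample axis lengths are assumptions for illustration only.

```python
def follows_rule_of_thumb(vertical_length, horizontal_length):
    """True if the vertical axis is between 2/3 and 3/4 of the horizontal axis."""
    ratio = vertical_length / horizontal_length
    return 2 / 3 <= ratio <= 3 / 4

print(follows_rule_of_thumb(7, 10))    # vertical axis 7 cm, horizontal axis 10 cm
print(follows_rule_of_thumb(12, 6))    # vertical axis twice as long as the horizontal
```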
The following figure illustrates how distorted the graph appears when the vertical
axis is disproportionately large.
[Figure: 'Declared roads as at 30 June 1979'. Vertical axis: Length of road (km),
1000 to 13 000. Road categories: State highways, Developmental roads, Main roads,
Secondary roads, Urban and sub arterials. Legend: Sealed, Paved, Formed,
Unconstructed.]
Source: Dept of Mapping & Surveying (1980), Queensland resources atlas, 2nd rev. ed.
(Courtesy Dept of Lands)
Changing the scale on the vertical axis
The following table gives the holdings of ROPE Corporation during 2001.
Quarter   Holdings in $’000 000
J–M       200
A–J       200
J–S       201
O–D       202
Here is one way of representing this data:
[Graph: Holdings ($’000 000) against Quarter (J–M, A–J, J–S, O–D); vertical axis
marked 100, 200, 300.]
But it is not very spectacular, is it? Now look at the following graph.
[Graph: Holdings ($’000 000) against Quarter (J–M, A–J, J–S, O–D); vertical axis
marked from 200 to 202 in steps of 0.5.]
The shareholders would be happier with this one.
Omitting certain values
If one chose to ignore the second quarter’s value, which shows no increase, then the
graph would look even better.
[Graph: Holdings ($’000 000) against Quarter (J–M, J–S, O–D only); vertical axis
marked from 200 to 202 in steps of 0.5.]
Foreshortening the vertical axis
Look at the figures below. Notice in graph (a) that the numbers from 0 to 4000 have
been omitted. In graph (b) these numbers have been inserted. The rate of growth of the
Queensland Police Force looks far less spectacular in graph (b) than in graph (a).
[Figure: Qld police strength against Year, 1977 to 1987. Graph (a): vertical axis
marked from 4000 to 5500. Graph (b): vertical axis marked from 1000 to 5000.]
Source: Qld Year Book, 1989, p.53 and The Australian Bureau of Statistics.
Foreshortening the vertical axis is a very common procedure. It does have the
advantage of giving extra detail but it can give the wrong impression about growth
rates.
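One way to put a number on this effect is to compare the true growth with the fraction of the axis height that the growth occupies. The sketch below uses the ROPE Corporation holdings from earlier in this section; the function name `plotted_rise` and the axis ranges (0 to 300 for the first graph, 200 to 202 for the second) are assumptions read from the tick labels.

```python
holdings = [200, 200, 201, 202]    # $'000 000 for J-M, A-J, J-S, O-D

def plotted_rise(values, axis_min, axis_max):
    """Fraction of the axis height covered by the change from first to last value."""
    return (values[-1] - values[0]) / (axis_max - axis_min)

actual_growth = (holdings[-1] - holdings[0]) / holdings[0]
full_axis = plotted_rise(holdings, 0, 300)       # assumed range of the first graph
cropped_axis = plotted_rise(holdings, 200, 202)  # assumed range of the second graph

print(f"actual growth: {actual_growth:.1%}")
print(f"rise on full axis: {full_axis:.1%} of the axis height")
print(f"rise on cropped axis: {cropped_axis:.0%} of the axis height")
```

Under these assumptions a 1% increase fills the whole height of the second graph, which is why it looks so much more impressive.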
Visual impression
In this graph, height is the property that gives the true relation, yet the impression of a
much greater increase is given by the volume of each money bag.
[Picture graph: Net value of production ($m), vertical axis 100 to 400, shown as
money bags of increasing size for the years 1990, 1995 and 2000.]
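The size of the distortion can be estimated. If a picture such as a money bag is enlarged uniformly so that its height matches the data, its area grows with the square of the scale factor and the volume it suggests grows with the cube. The sketch below assumes uniform scaling and uses a hypothetical doubling of the value; the function name is also an assumption.

```python
def apparent_ratios(value_ratio):
    """Growth in height, area and implied volume when a picture is scaled
    uniformly so that its height matches the ratio of the values."""
    return {
        "height": value_ratio,
        "area": value_ratio ** 2,
        "volume": value_ratio ** 3,
    }

print(apparent_ratios(2))    # a value that doubles
```

A value that doubles is drawn twice as high, but the bag covers four times the area and appears to hold eight times as much.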
A non-linear scale on an axis or on both axes
Consider the following two graphs.
[Figure: Particles / unit area against Year, 95 to 99. Graph (a): vertical axis
marked 100, 200, 300, 400, 500. Graph (b): vertical axis marked 100, 200, 300, 500.]
Both of these graphs show the same numerical information. But graph (a) has a
linear scale on the vertical axis and graph (b) does not. Graph (a) emphasises the
ever-increasing rate of growth of pollutants while graph (b) suggests a slower, linear
growth.
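A scale can be tested for linearity by checking that equally spaced tick marks carry equally spaced values. The sketch below applies this test to the tick labels as they appear on the two graphs; the function name `is_linear_scale` is an assumption, and the marks are assumed to be evenly spaced on the page.

```python
def is_linear_scale(ticks):
    """True if consecutive tick values always differ by the same amount."""
    steps = [b - a for a, b in zip(ticks, ticks[1:])]
    return all(step == steps[0] for step in steps)

graph_a = [100, 200, 300, 400, 500]
graph_b = [100, 200, 300, 500]    # tick labels as read from graph (b)

print(is_linear_scale(graph_a))
print(is_linear_scale(graph_b))
```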
WORKED Example 8
The following data give wages and profits for a certain company. All figures are in millions
of dollars.
Year                     1985   1990   1995   2000
Wages                    6      9      13     20
% increase in wages      25     50     44     54
Profits                  1      1.5    2.5    5
% increase in profits    20     50     66     100
Now consider the graphs:
[Graph (a): Wages and profits ($m) against Year, 1985 to 2000; vertical axis
marked from 2 to 20; lines for Wages and Profits.
Graph (b): Wages and profits (% increase) against Year, 1985 to 2000; vertical axis
marked 25, 50, 75, 100; lines for Wages and Profits.]
Consider the following questions.
a Do the graphs accurately reflect the data?
b Which graph would you rather have published if you were:
i an employer dealing with employees requesting pay increases?
ii an employee negotiating with an employer for a pay increase?
THINK
a 1 Look at the scales on both axes. All scales are linear.
  2 Look at the units on both axes. Graph (a) has y-axis in $ while graph (b) has
    y-axis in %.
b i 1 Compare wage increases with profit increases.
    2 The employer wants high profits.
  ii 1 Consider again the increases in wages and profits.
     2 The employee doesn’t like to see profits increasing at a much greater rate
       than wages.
WRITE
a The graphs do represent the data accurately. However, quite a different picture of
  wage and profit increases is painted by graphing with different units on the y-axis.
b i The employer would prefer graph (a) because he/she could argue that
    employees’ wages were increasing at a greater rate than profits.
  ii The employee would choose graph (b), arguing that profits were increasing at a
     great rate while wage increases clearly lagged behind.
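The percentage increases used in graph (b) can be recomputed from the dollar figures in the table. The sketch below is an illustration only; the function name `percent_increases` is an assumption. The 1985 percentages cannot be checked this way because the earlier figures are not given.

```python
wages = [6, 9, 13, 20]        # $ million for 1985, 1990, 1995, 2000
profits = [1, 1.5, 2.5, 5]    # $ million for the same years

def percent_increases(values):
    """Percentage increase from each value to the next."""
    return [(new - old) / old * 100 for old, new in zip(values, values[1:])]

wage_pct = percent_increases(wages)        # increases for 1990, 1995, 2000
profit_pct = percent_increases(profits)

for year, w, p in zip([1990, 1995, 2000], wage_pct, profit_pct):
    print(f"{year}: wages up {w:.1f}%, profits up {p:.1f}%")
```

In dollar terms wages rose by $14 million and profits by $4 million, while in percentage terms the profit increases are the larger, which is why the two graphs tell different stories. The calculated profit increase for 1995 is about 66.7%, which the table lists as 66.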
remember
To determine whether data in graphical form have been misrepresented, check
that:
1. the scale on both axes is linear
2. the scales on the vertical and horizontal axes have not been lengthened or
shortened to give a biased impression
3. certain values have not been omitted from the graph
4. picture graphs are drawn to represent a height and not a volume.
9D Graphical methods of misrepresenting data
1 This graph shows the dollars spent on health care for 1990, 1994, and 1998. Draw
another bar graph that minimises the fall in health care funds. (See worked
example 8.)
[Bar graph: Health care cost ($ m) for 1990, 1994 and 1998; vertical axis marked
from 100 to 105.]
2 Examine this graph of employment growth.
[Graph: 'Growth of total employment, 1947–1981'. Vertical axis: Total employment
(millions), 0 to 6. Horizontal axis marked 1947, 1954, 1961, 1966, 1971, 1976, 1981.]
Why is this graph misleading?
3 Examine this graph.
[Graph: 'Road fatalities, Queensland'. Vertical axis marked from 400 to 650.
Horizontal axis: years 1969 to 1987 in steps of 3.]
Source: Qld Year Book, 1989, p. 205 and the Australian Bureau of Statistics.
a Redraw this graph with the vertical axis showing road fatalities starting at 0.
b Does the decrease in road fatalities appear to be as significant as the graph
suggests?
4 This graph shows the student-to-teacher ratio in Queensland for the years 1979 to 1987.
Apart from the fact that the 0 on the vertical scale has not been shown, this graph is
misleading.
[Graph: 'Student to teacher ratio, Queensland'. Vertical axis: Rate, marked from 15
to 21. Horizontal axis: years 1979 to 1987. Lines for Government and Non-government
schools. Note on graph: (a) Break in continuity of series.]
Source: Qld Year Book, 1989, p. 125 and The Australian Bureau of Statistics.
In 1981 what was the student-to-teacher ratio at:
a non-government schools?
b government schools?
How can you explain this when most classes in the city government schools are 25 or
more? Explain fully.
5 You run a company that is listed on the Stock Exchange. During 2002 you have given
substantial rises in salary to all your staff. However, profits have not been as spectacular
as in 2001. The following table gives the figures for the mean salary and profits for
each quarter. Draw 2 graphs, one showing profits, the other showing salaries, that will
show you in the best possible light to your shareholders.
                        1st quarter   2nd quarter   3rd quarter   4th quarter
Profits ($’000 000)     6             5.9           6             6.5
Salaries ($’000 000)    4             5.9           6             7.5
6 You are a manufacturer and your plant is discharging heavy metals into a waterway.
Your own chemists do tests every 3 months and the following table gives the
results for a period of 2 years. Draw a graph which will show your company in the
best light.
Date         Concentration (parts per million)
Jan. 2000    7
Apr. 2000    9
July 2000    18
Oct. 2000    25
Jan. 2001    30
Apr. 2001    40
July 2001    49
Oct. 2001    57
7 This pie graph shows the break-up of workers compensation costs incurred by
employers other than the government.
[Pie graph: 'Break-up of non-government workers compensation costs'. Total $202.8m.
Sectors: Common law claims $143.5m; Common law fees and outlays $19m; Weekly
compensation payouts and statutory lump sum claims $40.3m.]
Source: Courier-Mail, 21 September 1991.
a What fraction of the total costs are weekly compensation payouts and statutory
lump sum claims?
b What angle should be at the centre of this sector?
c What angle is at the centre of this sector?
d Why has this distortion of angle occurred? Discuss how this might be used to mislead the reader.
8 This graph shows how the $27 that a buyer pays for a CD is distributed among the
departments involved in its production and marketing.
[Graph: 'Where your $27 goes'. Sections: Record shop $7.40; Royalties and costs to
artist $3.86; Production $3.40; Sales tax $3.27; Advertising $1.94; Mechanical
royalties $1.57; Record company administration costs $1.54; Record company profit
$1.54; Record company sales process $1.27; Other recording costs 65c;
Distribution 56c.]
You are required to find out whether or not the graph is misleading, to explain fully
your reasoning, and to support any statements that you make. Also,
a comment on the shape of the graph and how it could be obtained.
b Does your visual impression of the graph support the figures?
The community is constantly bombarded with graphical representations of data. We
must be aware that these displays are carefully constructed so that a cursory glance
conveys the impression intended by the presenter. On closer inspection, a different
impression is often revealed. It is wise to look at the detail in a graph and not to rely
simply on the overall visual impact.
Investigation
Misleading reports
1 Look in any newspaper, magazine or government publication and select three
different types of statistical presentation. Discuss the reasons why each
particular type of presentation was chosen and whether other forms of
presentation might have been just as good.
2 Look at the annual report of a company and discuss critically the forms of
statistical presentation used.
3 Collect three examples of misleading graphs. For each graph explain how it is
misleading. How do such misleading graphs benefit the people who produced
the graphs?
4 Work in groups to prepare misleading company reports with graphs of wages,
profits and percentage increases. Present a report to the class. Other students
must spot the flaws.
5 Work in groups to design a suitable advertisement to explain why someone
should become a member of your club.
a Draw a graph to support your claims.
b Present your graph and argument to the class.
c While other groups are presenting, list any information they have left out.
d Find other advertisements that use this technique to give certain viewpoints.
e List some additional information you think should have been given.
Investigation
Spreadsheets creating misleading graphs
Let us return to worked example 8 and use a spreadsheet to draw graph (a).
1 Enter the data as indicated in the spreadsheet above.
2 Graph the data using the Chart Wizard. You should obtain a graph similar to
Graph 1.
3 Copy and paste the graph twice within the spreadsheet.
4 Graph 2 gives the impression that the wages are a great deal higher than the
profits. This effect was obtained by reducing the length of the horizontal axis.
Experiment with shortening the horizontal length and lengthening the vertical
axis.
5 In Graph 3 we get the impression that the wages and profits are not very
different. This effect was obtained by lengthening the horizontal axis and
shortening the vertical axis. Experiment with various combinations.
6 Print out your three graphs and examine their differences.
Note that all three graphs have been drawn from the same data using valid scales.
A cursory glance leaves us with three different impressions. Clearly, it is important
to look carefully at the scales on the axes of graphs.
Another method which could be used to change the shape of a graph is to
change the scale of the axes.
7 Right click on the axis value, enter the Format axis option, click on the Scale
tab, then experiment with changing the scale values on both axes.
Techniques such as these are used to create different visual impressions of the
same data.
8 Use the data in the table of worked example 8 to create a spreadsheet, then
produce two graphs depicting the percentage increase in both wages and profits
over the years giving the impression that:
a the profits of the company have not grown at the expense of wage increases
(the percentage increase in wages is similar to the percentage increase in
profits)
b the company appears to be exploiting its employees (the percentage
increase in profits is greater than that for wages).
Other forms of graphical display
We shall now consider three other forms of graphical display of data: histograms, stem
plots and boxplots.
Displaying data using frequency histograms
A frequency histogram is similar to a column graph with the following essential
features.
1. Gaps are never left between the columns, except for a half-unit space before the first
column.
2. If the chart is coloured or shaded then it is done all in one colour. (The columns are
essentially all representing different levels of the same thing.)
3. Frequency is always plotted on the vertical axis.
4. For ungrouped data the horizontal scale is marked so that the data labels appear
under the centre of each column. For grouped data the horizontal scale is marked so
that the class centre of each class appears under the centre of the column.
WORKED Example 9
The table below shows the number of people living in each house in a street.
No. of people   1   2   3    4    5
Frequency       1   4   10   15   8
Show this information in a frequency histogram.
THINK
1 Draw a set of axes with the number of people living in a house on the
horizontal axis and frequency on the vertical axis.
2 Draw the graph, leaving half a column width space before the first column.
WRITE
[Frequency histogram: Frequency (0 to 16) against Number of people in a house
(1 to 5).]
A frequency polygon is a line graph that can be drawn by joining the centres of the
tops of each column of the histogram. The polygon starts and finishes on the
horizontal axis a half column width space from the group boundary of the first and
last groups.
The figure shows the frequency polygon drawn on top of the histogram for the
previous worked example.
[Figure: frequency histogram with frequency polygon; Frequency (0 to 16) against
Number of people in a house (1 to 5).]
It is common practice to draw the histogram and the polygon on the same set of axes.
WORKED Example 10
The frequency table below shows a class set of marks on an exam. Draw a frequency
histogram and polygon on the same set of axes.
Mark      Class centre   Frequency
51–60     55.5           3
61–70     65.5           5
71–80     75.5           12
81–90     85.5           7
91–100    95.5           3
THINK
1 Draw a set of axes with the exam mark on the horizontal axis and frequency on
the vertical axis. Show the class centres for the exam marks.
2 Draw the columns, leaving a half-column-width space before the first column.
3 Draw a line graph to the centre of each column.
4 Make sure the line graph begins and ends on the horizontal axis.
WRITE
[Frequency histogram and polygon: Frequency (0 to 12) against Exam mark, with
class centres 55.5, 65.5, 75.5, 85.5 and 95.5 marked on the horizontal axis.]
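The class centres and the points of the frequency polygon for this example can be worked out with a short Python sketch. The function names are assumptions for illustration, and the classes are assumed to be of equal width. Half a column width beyond the boundary of an end class is the same as one full class width beyond its centre, which is how the two end points are found.

```python
classes = [(51, 60), (61, 70), (71, 80), (81, 90), (91, 100)]
frequencies = [3, 5, 12, 7, 3]

def class_centre(low, high):
    """Midpoint of a class interval."""
    return (low + high) / 2

def polygon_points(classes, freqs):
    """Points of the frequency polygon, including the two points on the axis."""
    centres = [class_centre(low, high) for low, high in classes]
    width = centres[1] - centres[0]    # assumes equal class widths
    start = (centres[0] - width, 0)
    end = (centres[-1] + width, 0)
    return [start] + list(zip(centres, freqs)) + [end]

for point in polygon_points(classes, frequencies):
    print(point)
```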
A graphics calculator can also be used to draw histograms. The instructions given
below are for the Texas Instruments TI 83 graphics calculator. If you have another variety, your teacher will show you the equivalent commands.
WORKED Example 11
The marks out of 20 received by 30 students for a book-review assignment are given in the
frequency table below.
Mark        12   13   14   15   16   17   18   19   20
Frequency   2    7    6    5    4    2    3    0    1
Display these data on a histogram using a graphics calculator.
THINK
1 Enter the data.
(a) Clear any previous equations.
(b) Press Y= and clear any functions.
(c) Press STAT , select 1:Edit and press ENTER .
(d) Enter the marks in L1 and the frequency in L2.
2 Set up the calculator for graphing.
(a) Press 2nd [STAT PLOT] and select 1:Plot1. Press ENTER .
(b) Select On and press ENTER .
(c) Select the type of graph required. The histogram is the third along on the
top row.
(d) At Xlist type in L1.
(e) At Freq type in L2.
(f) Press ZOOM and highlight 9:Zoom Stat; press ENTER .
(g) If not all of the histogram is shown, press WINDOW and reset the x- and
y-range and step values.
remember
1. Numerical data can be graphed using histograms and polygons.
2. When drawing histograms always put frequency on the vertical axis and never
leave gaps between columns.
3. If the histogram is illustrating ungrouped data, the data labels on the horizontal
axis are placed under the centre of each column.
4. If the histogram is illustrating grouped data, the data labels on the horizontal
axis (that is, the class centres) are placed under the centre of each column.
5. A frequency polygon is a line graph, which can be drawn by joining the centres
of the tops of each column of the histogram.
9E Histograms and frequency polygons
This exercise can be completed manually, or with the aid of a graphics calculator.
1 A survey is done on young drivers taking the written test for their licence. The
number of mistakes each makes is recorded and the results are shown in the
frequency distribution table below. (See worked example 9.)
No. of mistakes (score)       0   1   2    3   4   5
No. of drivers (frequency)    5   8   11   4   3   1
Show this information in a frequency histogram.
2 Students in a class were asked the number of children in their families. The
results are shown in the frequency distribution table below.
No. of children in a family   0   2   3   4   5   6
Frequency                     3   5   8   4   2   1
Show this information in a frequency histogram and polygon.
3 The table below shows the age in years of the members of a surf club.
Age              18   19   20   21   22   23   24   25
No. of members   3    5    8    13   15   10   8    5
Show this information in a frequency polygon.
4 The label on a box of matches states that the average contents of a box is 50 matches.
Quality control surveyed 50 boxes for the number of matches and the results are
shown below.
48  50  49  48  50  50  50  50  49  48
50  51  50  47  47  51  49  51  53  50
50  48  53  49  51  49  53  52  52  49
53  52  54  50  50  52  50  47  51  49
48  49  50  50  52  51  49  49  50  51
a Put this information into a frequency table.
b Show the results in a frequency histogram and polygon.
WORKED Example 10
5 The table below shows the length of 71 fish caught in a competition.

Length of fish (mm)   Class centre   Frequency
300–309               304.5          9
310–319               314.5          15
320–329               324.5          20
330–339               334.5          12
340–349               344.5          8
350–359               354.5          7

Show this information in a frequency histogram and polygon.
6 Sixty people were involved in a psychology experiment. The following frequency
table shows the times taken for the 60 people to complete a puzzle for the experiment.

Time taken (seconds)   Class centre   Frequency
6 to almost 8                         1
8 to almost 10                        4
10 to almost 12                       15
12 to almost 14                       18
14 to almost 16                       12
16 to almost 18                       8
18 to almost 20                       2

a Copy the frequency table and complete the class centre column.
b Show the information in a frequency histogram and polygon.
Stem-and-leaf plots
As an alternative to a frequency table, a stem-and-leaf plot may be used to group and
summarise data.
A stem is made using the first part of each piece of data. The second part of each
piece of data forms the leaves. Consider the case below.
The following data show the mass (in kg) of 20 possums trapped, weighed then
released by a wildlife researcher.
1.8 0.9 0.7 1.4 1.6 2.1 2.7 2.2 1.8 2.3
2.3 1.5 1.1 2.2 3.0 2.5 2.7 3.2 1.9 1.7
The stem is made from the whole number part of the mass and the leaves are the
decimal part. The first piece of data was 1.8 kg. The stem of this number could be considered to be 1 and the leaf 0.8. The second piece of data was 0.9. It has a stem of 0 and
a leaf of 0.9.
To compose the stem-and-leaf plot, rule a vertical column of stems then enter
the leaf of each piece of data in a neat row beside the appropriate stem. The first
row of the stem-and-leaf plot records all data from 0.0 to 0.9. The second row
records data from 1.0 to 1.9 etc.
Attach a key to the plot to show the reader the meaning of each entry.
It is convention to assemble the data in order of size, so this stem-and-leaf plot
should be written in such a way that the numbers in each row of leaves are in
ascending order.
Key: 0 | 7 = 0.7 kg
Stem | Leaf
   0 | 7 9
   1 | 1 4 5 6 7 8 8 9
   2 | 1 2 2 3 3 5 7 7
   3 | 0 2
When preparing a stem-and-leaf plot, it is important to try to keep the numbers in
neat vertical columns because a neat plot gives the reader an idea of the distribution of
scores. The plot itself looks a bit like a histogram turned on its side.
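The construction described above can be sketched in Python. This is an illustration only, not part of the original text: the function name `stem_and_leaf` and its `leaf_unit` parameter are assumptions chosen for the example, and the data are the possum masses listed earlier.

```python
from collections import defaultdict


def stem_and_leaf(values, leaf_unit):
    """Group values into stems with sorted leaves.

    leaf_unit is the place value of a leaf (0.1 for the possum masses),
    a hypothetical parameter used only in this sketch.
    """
    rows = defaultdict(list)
    for value in values:
        units = round(value / leaf_unit)  # e.g. 1.8 kg -> 18 leaf units
        stem, leaf = divmod(units, 10)    # 18 -> stem 1, leaf 8
        rows[stem].append(leaf)
    # Keep every stem between the smallest and largest, even if it is empty,
    # so that the shape of the distribution is preserved.
    return {stem: sorted(rows[stem]) for stem in range(min(rows), max(rows) + 1)}


possum_masses = [1.8, 0.9, 0.7, 1.4, 1.6, 2.1, 2.7, 2.2, 1.8, 2.3,
                 2.3, 1.5, 1.1, 2.2, 3.0, 2.5, 2.7, 3.2, 1.9, 1.7]

plot = stem_and_leaf(possum_masses, 0.1)
print("Key: 0 | 7 = 0.7 kg")
for stem, leaves in plot.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```

The printed rows match the ordered plot for the possum data: each leaf is entered beside its stem, then each row is sorted into ascending order.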
WORKED Example 12
The information below shows the mass, in kilograms, of twenty 16-year-old boys.
65 45 56 57 58 54 61 72 70 69
61 58 49 52 64 71 66 65 66 60
Show this information in a stem-and-leaf plot.
THINK
1 Make the ‘tens’ the stem and the ‘units’ the leaves.
2 Write a key.
3 Complete the plot.
Note: Complete the plot with the leaves in each row in ascending order.

WRITE
Key: 5 | 6 = 56 kg

Stem | Leaf
   4 | 5 9
   5 | 6 7 8 4 8 2
   6 | 5 1 9 1 4 6 5 6 0
   7 | 2 0 1

Stem | Leaf
   4 | 5 9
   5 | 2 4 6 7 8 8
   6 | 0 1 1 4 5 5 6 6 9
   7 | 0 1 2

It is also useful to be able to represent data with a class size of 5. This could be done
for the previous stem-and-leaf plot by choosing stems 0*, 1, 1*, 2, 2*, 3 where the class
with stem 1 contains all the data from 1.0 to 1.4 and stem 1* contains the data from 1.5
to 1.9 etc. If stems are split in this way it is a good idea to include two entries in the
key. The stem-and-leaf plot for the ‘possum’ data would appear as follows:
Key: 1 | 1 = 1.1 kg
     1* | 5 = 1.5 kg
Stem | Leaf
  0* | 7 9
   1 | 1 4
  1* | 5 6 7 8 8 9
   2 | 1 2 2 3 3
  2* | 5 7 7
   3 | 0 2
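Splitting each stem into a plain stem (leaves 0 to 4) and a starred stem (leaves 5 to 9) can also be sketched in Python. The function name `split_stem_and_leaf` and its `leaf_unit` parameter are illustrative assumptions, not anything defined in this chapter. The sketch lists only the classes that contain data.

```python
def split_stem_and_leaf(values, leaf_unit):
    """Stem-and-leaf rows with a class size of 5 leaf units.

    Leaves 0-4 go on the plain stem and leaves 5-9 on the starred stem.
    """
    rows = {}
    for value in values:
        units = round(value / leaf_unit)
        stem, leaf = divmod(units, 10)
        starred = leaf >= 5
        label = f"{stem}*" if starred else str(stem)
        rows.setdefault((stem, starred), (label, []))[1].append(leaf)
    # Sorting the (stem, starred) keys puts 1 before 1*, and 1* before 2.
    return [(label, sorted(leaves)) for _, (label, leaves) in sorted(rows.items())]


possum_masses = [1.8, 0.9, 0.7, 1.4, 1.6, 2.1, 2.7, 2.2, 1.8, 2.3,
                 2.3, 1.5, 1.1, 2.2, 3.0, 2.5, 2.7, 3.2, 1.9, 1.7]

split_rows = split_stem_and_leaf(possum_masses, 0.1)
for label, leaves in split_rows:
    print(f"{label:>2} | {' '.join(str(leaf) for leaf in leaves)}")
```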
A stem-and-leaf plot has the following advantages over a frequency distribution
table.
1. The plot itself gives a graphical representation of the spread of data. (It is
rather like a histogram turned on its side.)
2. All the original data are retained, so there is no loss of accuracy when
calculating statistics such as the mean and standard deviation. In a grouped
frequency distribution table some generalisations are made when these values
are calculated.
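The second advantage can be checked numerically. The sketch below is an illustration, not part of the original text. It compares the mean of the original possum masses with the mean estimated from a grouped table, and it assumes classes of 0.0–0.9, 1.0–1.9 and so on, each represented by its class centre (0.45, 1.45, ...).

```python
possum_masses = [1.8, 0.9, 0.7, 1.4, 1.6, 2.1, 2.7, 2.2, 1.8, 2.3,
                 2.3, 1.5, 1.1, 2.2, 3.0, 2.5, 2.7, 3.2, 1.9, 1.7]

# Mean from the original data, as retained by a stem-and-leaf plot.
exact_mean = sum(possum_masses) / len(possum_masses)

# Grouped frequency table: every score in a class is treated as if it
# were equal to the class centre (an assumption of this sketch).
frequencies = {}
for mass in possum_masses:
    centre = int(mass) + 0.45
    frequencies[centre] = frequencies.get(centre, 0) + 1

grouped_mean = (sum(centre * freq for centre, freq in frequencies.items())
                / sum(frequencies.values()))

print(f"Mean from original data: {exact_mean:.2f} kg")   # 1.98 kg
print(f"Mean from grouped table: {grouped_mean:.2f} kg")  # 1.95 kg
```

The two values differ slightly because grouping replaces each score with its class centre.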
WORKED Example 13
The following data give the length of gestation in days
for 24 mothers. Prepare a stem-and-leaf plot of the data
using a class size of 5.
280 288 292 281 287 273 288 292
285 295 279 268 276 279 281 282
266 284 270 275 292 271 278 281

THINK
1 A group size of 5 is required. The smallest piece of data is 266 and the
largest is 295 so make the stems: 26*, 27, 27*, 28, 28*, 29, 29*.
2 The key should give a clear indication of the meaning of each entry.
3 Enter the data piece by piece. Enter the leaves in pencil at first so that they
can be rearranged into order of size. Check that 24 pieces of data have been
entered.
4 Now arrange the leaves in order of size.

WRITE
Key: 26* | 6 = 266 days
     27 | 0 = 270 days
Stem | Leaf
 26* | 6 8
  27 | 0 1 3
 27* | 5 6 8 9 9
  28 | 0 1 1 1 2 4
 28* | 5 7 8 8
  29 | 2 2 2
 29* | 5

Since all the original data are recorded on the stem-and-leaf plot and are conveniently
arranged in order of size, the plot can be used to locate the upper and lower quartiles
and the median.
Notes
1. the median is the middle score or the average of the two middle scores
2. the lower quartile is the median of the lower half of the data
3. the upper quartile is the median of the upper half of the data.
Using the ‘possum’ mass data as an example:
Key: 0 | 7 = 0.7 kg
Stem | Leaf
   0 | 7 9
   1 | 1 4 5 6 7 8 8 9
   2 | 1 2 2 3 3 5 7 7
   3 | 0 2
There were 20 records so the median is the average of the 10th and 11th scores.
Counting each score as it appeared in the stem-and-leaf plot we can see that the
10th score is the number 1.9 and the 11th score is the number 2.1.
$\text{Median} = \frac{1.9 + 2.1}{2} = 2.0$ kg
The median divides the data into halves.
The lower quartile is the median of the lower half which has ten scores in it.
So the position of the lower quartile is given by the average of the 5th and 6th scores.
The 5th score is the number 1.5. The 6th score is the number 1.6.
$\text{Lower quartile} = \frac{1.5 + 1.6}{2} = 1.55$ kg
The upper quartile is the median of the upper half which also has ten scores in it.
The 5th score in this half is the number 2.3. The 6th score is the number 2.5.
$\text{Upper quartile} = \frac{2.3 + 2.5}{2} = 2.4$ kg
The interquartile range is the difference between the upper and lower quartiles.
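The same median-of-halves method can be sketched in Python. The function names `median` and `quartiles` are illustrative assumptions. The sketch follows the definitions in the notes above, so other quartile conventions (such as those used by some software) may give slightly different values.

```python
def median(sorted_scores):
    """Middle score, or the average of the two middle scores."""
    n = len(sorted_scores)
    if n % 2 == 1:
        return sorted_scores[n // 2]
    return (sorted_scores[n // 2 - 1] + sorted_scores[n // 2]) / 2


def quartiles(scores):
    """Return (lower quartile, median, upper quartile).

    Assumes at least two scores. With an odd number of scores the
    middle score is left out of both halves.
    """
    data = sorted(scores)
    half = len(data) // 2
    return median(data[:half]), median(data), median(data[len(data) - half:])


possum_masses = [1.8, 0.9, 0.7, 1.4, 1.6, 2.1, 2.7, 2.2, 1.8, 2.3,
                 2.3, 1.5, 1.1, 2.2, 3.0, 2.5, 2.7, 3.2, 1.9, 1.7]

lower_q, med, upper_q = quartiles(possum_masses)
print(f"Median = {med:.2f} kg")
print(f"Lower quartile = {lower_q:.2f} kg")
print(f"Upper quartile = {upper_q:.2f} kg")
print(f"Interquartile range = {upper_q - lower_q:.2f} kg")
```

For the possum data this reproduces the median of 2.0 kg and the quartiles of 1.55 kg and 2.4 kg found above.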
WORKED Example 14
Find the interquartile range of the data presented in the following stem-and-leaf plot.
Key: 15 | 7 = 157 kg
Stem | Leaf
  15 | 4 8 8
  16 | 1 3 3 6 8
  17 | 0 0 1 4 7 9 9 9
  18 | 1 2 3 3 5 7 8 8 9
  19 | 2 7 8
  20 | 0 2

THINK
1 There are 30 scores and so the median will be the average of the 15th and
16th scores.
2 There are 15 scores in each half and so the lower and upper quartiles will be
the 8th score in each half.
3 The interquartile range is the difference between the upper and lower quartiles.
Note: Remember to provide appropriate units for the answer.

WRITE
$\text{Median} = \frac{179 + 179}{2} = 179$ kg
The lower quartile = 168 kg
The upper quartile = 188 kg
Interquartile range = upper quartile − lower quartile
= 188 − 168
= 20 kg
remember
When presenting stem-and-leaf plots, observe the following points.
1. Always include a key to assist in the interpretation of the plot.
2. Choose a suitable class size. A class size of 5 is possible by using *notation on
class stems.
3. After initially recording each score, rearrange the leaves so that they appear in
ascending order.
9F Stem-and-leaf plots
WORKED Example 12
1 The data below give the number of errors made each week by 20 machine operators.
Prepare a stem-and-leaf diagram of the data using a stem of 0, 1, 2, etc.
6 15 20 25 28 18 32 43 52 27
17 26 38 31 26 29 32 46 13 20
2 The data below give the time taken (in minutes) for each of 40 runners on a 10 km fun
run. Prepare a stem-and-leaf diagram for the data using a class size of 10 minutes.
36 66 42 71 42 75 58 42 52 45
40 50 38 42 41 46 47 55 47 40
59 38 53 52 72 42 68 37 68 46
43 54 57 48 39 48 82 39 48 52
WORKED Example 13
3 The typing speed (in words per minute) of 30 word processors is recorded below.
Prepare a stem-and-leaf diagram of the data using a class size of 5.
96 102 92 96 95 102 95 115 110 108
88 86 107 111 107 108 103 121 107 96
124 95 98 102 108 112 120 99 121 130
4 Twenty transistors are tested by applying increasing voltage until they are destroyed.
The maximum voltage that each could withstand is recorded below. Prepare a stem-and-leaf plot of the data using a class size of 0.5.
14.8 15.2 13.8 14.0 14.8 15.7 15.5 15.6 14.7 14.3
14.6 15.2 15.9 15.1 14.3 14.6 13.9 14.7 14.5 14.2
WORKED Example 14
5 The stem-and-leaf plot below gives the exact mass of 24 packets of biscuits.
Find the interquartile range of the data.

Key: 248 | 4 = 248.4 g
Stem | Leaf
 248 | 4 7 8
 249 | 2 3 6 6
 250 | 0 0 1 1 6 9 9
 251 | 1 5 5 5 6 7
 252 | 1 5 8
 253 | 0
6 The time taken (in seconds) for a test vehicle to accelerate from 0 to 100 km/h
is recorded during a test of 24 trials. The results are represented by the
stem-and-leaf plot below.
a Find the median of the data.
b Find the upper and lower quartiles of the data.
c Find the interquartile range of the data.

Key: 7 | 2 = 7.2 s
     7* | 6 = 7.6 s
Stem | Leaf
   7 | 2 4 4
  7* | 5 5 7 9
   8 | 0 0 1 2 4 4 4
  8* | 5 5 6 8 9
   9 | 2 2 3
  9* | 5 7

Questions 7 to 10 refer to the stem-and-leaf plot below.
Key: 12 | 1 = 1210
     12* | 5 = 1250
Stem | Leaf
  12 | 1 2 4
 12* | 5 7 7 9 9
  13 | 0 1 1 2 3 4 4
 13* | 5 6 6 7 9 9
  14 | 0 2 3 4
 14* | 6 7

7 multiple choice
The class size used in the stem-and-leaf plot is:
A 1
B 10
C 33
D 50
8 multiple choice
The number of scores that have been recorded is:
A 27
B 33
C 1210
D 1410
9 multiple choice
The median of the data is:
A 13.4
B 14
C 1335
D 1340
10 multiple choice
The interquartile range of the data is:
A 14
B 100
C 1290
D 1390
11 The maximum hand spans (in cm) of 20 male concert pianists are recorded as follows.
23.6 20.2 22.8 21.4 25.1 24.8 23.2 21.6 20.7 23.6
22.8 24.6 21.8 22.8 23.1 24.6 21.7 24.7 22.2 23.0
a Complete a stem-and-leaf plot to represent the data.
b Find the median of the data.
c Find the upper and lower quartiles of the data.
d Find the interquartile range of the data.
12 The heights (in cm) of a sample of 30 plants are recorded as follows.
93 88 94 99 91 85 126 107 110 111
98 96 117 101 97 92 101 132 103 82
114 84 96 103 108 115 90 110 126 85
a Complete a stem-and-leaf plot to represent the data.
b Find the median of the data.
c Find the upper and lower quartiles of the data.
d Find the interquartile range of the data.
Boxplots (box-and-whisker plots)
In drawing a boxplot, five values are extracted from the data set. These values summarise the data and hence attract the name of a five-number summary.
This five-number summary consists of:
• lower extreme — the lowest score in the data set
• lower quartile — the middle score of the lower half of the data set
• median — the middle score
• upper quartile — the middle score of the upper half of the data set
• upper extreme — the highest score in the data set.
WORKED Example 15
For the set of scores below, develop a five-number summary.
12 15 46 9 36 85 73 29 64 50
THINK
1 Re-write the list in ascending order.
2 Write the lowest score.
3 Calculate the lower quartile.
4 Calculate the median.
5 Calculate the upper quartile.
6 Write the upper extreme.

WRITE
9 12 15 29 36 46 50 64 73 85
Lower extreme = 9
Lower quartile = 15
$\text{Median} = \frac{36 + 46}{2} = 41$
Upper quartile = 64
Upper extreme = 85
Five-number summary = 9, 15, 41, 64, 85
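The steps of this worked example can be sketched in Python. The names `middle` and `five_number_summary` are illustrative assumptions. The quartiles are taken as the middle scores of the lower and upper halves, as defined in this chapter, and the sketch assumes at least two scores.

```python
def middle(sorted_scores):
    """Middle score, or the average of the two middle scores."""
    n = len(sorted_scores)
    if n % 2:
        return sorted_scores[n // 2]
    return (sorted_scores[n // 2 - 1] + sorted_scores[n // 2]) / 2


def five_number_summary(scores):
    """Return (lower extreme, lower quartile, median, upper quartile, upper extreme)."""
    data = sorted(scores)               # step 1: ascending order
    half = len(data) // 2               # size of each half (middle score excluded if odd)
    return (
        data[0],                        # lower extreme
        middle(data[:half]),            # lower quartile
        middle(data),                   # median
        middle(data[len(data) - half:]),  # upper quartile
        data[-1],                       # upper extreme
    )


scores = [12, 15, 46, 9, 36, 85, 73, 29, 64, 50]
summary = five_number_summary(scores)
print(summary)  # (9, 15, 41.0, 64, 85)
```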
Once a five-number summary has been developed, it can be graphed using a box-and-whisker plot, a powerful way to display the spread of the data.
The box-and-whisker plot consists of a central divided box with attached whiskers.
The box spans the interquartile range, the vertical line inside the box marks the median
and the whiskers indicate the range. Each section of the boxplot represents one quarter
of the scores of the data set.
[Figure: box-and-whisker plot with the lower extreme, lower quartile, median, upper quartile and upper extreme labelled from left to right; each of the four sections contains 1/4 of the scores.]
Box-and-whisker plots are always drawn to scale. This can be drawn with the five-number summary attached as labels:
[Figure: box-and-whisker plot labelled with the values 4, 15, 21, 23 and 28]
or with a scale presented alongside the box-and-whisker plot. (This representation is
preferable.)
[Figure: box-and-whisker plot drawn above a scale marked 0, 5, 10, 15, 20, 25, 30]
WORKED Example 16
The box-and-whisker plot drawn below shows the marks achieved by students on
their end of year exam.
[Figure: box-and-whisker plot on a scale from 0 to 100 in steps of 10, axis labelled ‘Marks achieved’]
a State the median.
b Find the interquartile range.
c What was the highest mark in the class?

THINK
a The mark in the box shows the median (72).
b 1 The lower end of the box shows the lower quartile (63).
  2 The upper end of the box shows the upper quartile (77).
  3 Subtract the lower quartile from the upper quartile.
c The top end of the whisker gives the top mark (92).

WRITE
a Median = 72 marks
b Lower quartile = 63 marks
  Upper quartile = 77 marks
  Interquartile range = 77 − 63
  = 14 marks
c Top mark = 92 marks
WORKED Example 17
After analysing the speed (in km/h) of motorists through a particular intersection, the
following five-number summary was developed.
The lowest score is 82 km/h.
The lower quartile is 84 km/h.
The median is 89 km/h.
The upper quartile is 95 km/h.
The highest score is 114 km/h.
Show this information in a box-and-whisker plot.
THINK
1 Draw a scale from 70 to 120 using 1 cm = 10 km/h.
2 Draw the box from 84 to 95.
3 Mark the median at 89.
4 Draw the whiskers to 82 and 114.

WRITE
[Figure: box-and-whisker plot on a scale from 70 to 120 in steps of 10, axis labelled ‘Speed in km/h’]
A graphics calculator can also be used to display data in the form of a boxplot.
WORKED Example 18
Use a graphics calculator to draw a boxplot displaying the data below.
3 6 4 8 17 12 9 7 13 13 5 9 7 2 1 7 5 4 2
THINK
1 Clear any previous functions.
  (a) Press Y= .
  (b) Press CLEAR to clear any functions.
2 Enter data.
  (a) Press 2nd [STAT PLOT].
  (b) Highlight 4:Plots Off.
  (c) Press ENTER .
  (d) Enter the list in L1 (press STAT , select 1:Edit... and press ENTER to
  access the screen).
3 Draw the boxplot.
  (a) Press 2nd [STAT PLOT] then select 1:Plot1.
  (b) Press ENTER .
  (c) Select Plot1, then On, then select the 5th icon along which indicates a
  boxplot; then at Xlist: enter L1 (use 2nd [L1]), and at Freq: enter 1.
  (d) Press ZOOM and select 9:ZoomStat.
  (e) Press ENTER and the boxplot will appear.
  (f) Use the trace key to locate the minimum, maximum and quartile values.

DISPLAY
[Calculator screen: boxplot of the data, labelled with the values 1, 4, 7, 9 and 17]
Investigation
Exploring boxplots
Boxplots provide a powerful tool for comparing sets of data. It is essential,
therefore, to understand and interpret correctly the shape of a boxplot.
Situation 1
Consider the following situation.
Ten randomly chosen students from Class X and Class Y each sit for a test where
the highest possible mark is 10. The results of the ten students from the two classes
are:
Class X: 1 2 3 4 5 6 7 8 9 10
Class Y: 1 2 2 3 3 4 4 5 9 10
Note that the range of the marks is the same in both class samples (1 to 10). In fact,
the two highest and two lowest marks are the same in both cases.
Determining the five-number summaries for both class samples shows:
Class X 1, 3, 5.5, 8, 10
Class Y 1, 2, 3.5, 5, 10
You should confirm these figures.
If a boxplot was drawn for each set of data and the two plots were shown on the
same scale, we would observe the following:
[Figure: parallel boxplots for Class Y and Class X drawn on a common scale from 1 to 10]
In interpreting the differences between the two plots, we must recall the following.
1. The boxplot is divided into four sections and each section represents one
quarter of the scores.
2. The median represents a score in the middle of the data.
3. A long ‘box’ or a long ‘whisker’ does not tell us that there are more scores in
that section. There is still one quarter of the scores in each whisker and one
half of the scores in the whole box. An increase in length indicates that the
scores are more spread out.
From the boxplots, answer the following questions:
1 The scores in the lower whisker of the Class X sample range from 1 to 3. What
is the range of scores for the lower whisker of the Class Y sample? How many
scores lie in each of these whiskers?
2 What is the range of scores in the upper whisker of the Class Y plot? How
many scores lie in this area? Why is the upper whisker longer than the lower
whisker?
3 It is true to say that three-quarters of the Class Y student sample performed
more poorly than half the student sample in Class X. Explain how the graph
shows this.
4 It is also true to say that more than half the student sample from Class X
performed as well as the top quarter of the student sample from Class Y.
Explain how the graph indicates this.
Situation 2
Ten randomly chosen students from Class Z sat for the same test with the following
results.
Class Z: 1 2 5 5 6 7 8 8 9 10
5 Determine the five-number summary for the Class Z sample.
6 Draw parallel boxplots of the three class samples, using the same horizontal
scale.
7 Write a report comparing the performances of the three classes on the test. You
must refer to specific data from the boxplots to support your conclusions.
remember
1. A five-number summary is a summary set for a distribution.
2. The five numbers used in a five-number summary are the lower extreme, lower
quartile, median, upper quartile and upper extreme.
3. A box-and-whisker plot can be used to graph a five-number summary.
4. The box is used to show the interquartile range and the median is marked with
a line in the box. The whiskers then extend to show the range of the data set.
9G Five-number summaries and boxplots
A graphics calculator can be used for many of the following.
WORKED Example 15
1 Write a five-number summary for the data set below.
15 17 16 8 25 18 20 15 17 14
2 For each of the data sets below, write a five-number summary.
a 23 45 92 80 84 83 43 83
b 2 6 4 2 5 7 1
c 60 75 29 38 69 63 45 20 29 93 8 29 93
WORKED Example 16
3 From the five-number summary 6, 11, 13, 16, 32 find:
a the median
b the interquartile range
c the range.
4 From the five-number summary 101, 119, 122, 125, 128 find:
a the median
b the interquartile range
c the range.
WORKED Example 17
5 A five-number summary is given below.
Lower extreme = 39.2
Lower quartile = 46.5
Median = 49.0
Upper quartile = 52.3
Upper extreme = 57.8
Draw a box-and-whisker plot of the data.
6 The box-and-whisker plot below shows the distribution of final points scored by a
football team over a season’s roster.
[Figure: box-and-whisker plot on a scale from 50 to 150 in steps of 20, axis labelled ‘Points’]
a What was the team’s greatest points score?
b What was the team’s smallest points score?
c What was the team’s median points score?
d What was the range of points scored?
e What was the interquartile range of points scored?
7 The box-and-whisker plot below shows the distribution of data formed by counting
the number of honey bears in each of a large sample of packs.
[Figure: box-and-whisker plot on a scale from 30 to 60 in steps of 5]
In any pack, what was:
a the largest number of honey bears?
b the smallest number of honey bears?
c the median number of honey bears?
d the range of numbers of honey bears?
e the interquartile range of honey bears?
Questions 8, 9 and 10 refer to the box-and-whisker plot drawn below.
[Figure: box-and-whisker plot on a scale from 5 to 30 in steps of 5]
8 multiple choice
The median of the data is:
A 20
B 23
C 35
D 31
9 multiple choice
The interquartile range of the data is:
A 23
B 26
C 5
D 20 to 25
10 multiple choice
Which of the following is not true of the data represented by the box-and-whisker
plot?
A One-quarter of the scores is between 5 and 20.
B One-half of the scores is between 20 and 25.
C The lowest quarter of the data is spread over a wide range.
D Most of the data are contained between the scores of 5 and 20.
11 The number of sales made each day by a salesperson is recorded over a fortnight:
25, 31, 28, 43, 37, 43, 22, 45, 48, 33
a Write a five-number summary of the data.
b Draw a box-and-whisker plot of the data.
12 The data below show monthly rainfall in millimetres.

Jan.  Feb.  Mar.  Apr.  May  June  July  Aug.  Sept.  Oct.  Nov.  Dec.
10    12    21    23    39   22    15    11    22     37    45    30

a Provide a five-number summary of the data.
b Draw a box-and-whisker plot of the data.

WorkSHEET 9.2

Investigation
Drawing conclusions from the
bottled-water investigations
Having considered ways of collecting, organising and displaying data, it is now
appropriate that you should draw together all your results from your investigations
on bottled water and form them into a report. You have collected data through:
1. observation
2. survey and
3. taste testing.
1 Consider each activity separately. For each:
a define the AIM of the investigation
b outline the PROCEDURE undertaken
c present a table of your RESULTS (preferably on a spreadsheet)
d represent your data in an appropriate GRAPHICAL FORM (preferably
using a computer or graphics calculator)
e summarise and draw some CONCLUSIONS from your investigation.
2 Present your whole investigation as a report to the class.
summary
Classification of data
• Data can be classified as being categorical or numerical.
• Categorical data are data that are non-numerical. For example, a survey of car types
is not numerical.
• Numerical data are data that can be either counted or measured. For example a
survey of the daily temperature is numerical.
• Numerical data can be either discrete or continuous.
• Discrete data can take only certain values such as whole numbers.
• Continuous data can take any value within a certain range.
Collecting data
• Data can be collected by observation, survey or experiment.
• Observations can be made on categorical or numerical data.
• Surveys can be conducted through personal interview, telephone interview, or self-administered questionnaire.
• Questionnaires require careful construction, particularly with question construction
and classification.
• Experiments must be capable of being reproduced for conclusions to be drawn.
• In all cases of collecting data, the quality of the data must be investigated before
any processing occurs.
Organising and displaying data
• Data are usually organised by first collating the records in the form of a table using
grouped or ungrouped data.
• Data are generally displayed graphically. Types of graphs include column, pie,
histogram, stem plots or boxplots.
Misrepresenting data graphically
• View graphical presentations carefully to detect signs of bias in impressions.
• Check that scales on both axes are linear and not lengthened or shortened.
• Check that picture graphs are not represented as volumes.
Histograms and frequency polygons
• Numerical data are best displayed by a frequency histogram and polygon.
• A frequency histogram is a column graph that is drawn with a 0.5-unit (half
column) space before the first column and no other spaces between the columns.
• A frequency polygon is drawn as a line graph from the corner of the axes to the
centre of each column.
Stem-and-leaf plot
• A stem-and-leaf plot is like a histogram on its side.
• The first part of the data forms the stem.
• The last part of the data forms the leaves.
• The data in the leaves must be in ascending order.
Boxplots (box-and-whisker plots)
• A five-number summary of a data set is the lower extreme, lower quartile, median,
upper quartile and upper extreme.
• A five-number summary can be graphed using a box-and-whisker plot.
• A box-and-whisker plot shows the spread of a data set on a scale.
CHAPTER review
9A
1 State whether each of the following data types is categorical or numerical.
a The television program that people watch at 7:00 pm
b The number of pets in each household
c The amount of water consumed by athletes in a marathon run
d The average distance that students live from school
e The mode of transport used between home and school
9A
2 For each of the numerical data types below, determine if the data are discrete or continuous.
a The dress sizes of Year 11 girls
b The volume of backyard swimming pools
c The amount of water used in households
d The number of viewers of a particular television program
e The amount of time Year 11 students spent studying
9B
3 Identify the areas of concern with the following questions, then rewrite each so that the
meaning is clear and understandable.
a Do you think that the school holidays are too long at Christmas time and that they
should be spread evenly over the year?
b Have you never said that you don’t like chocolate?
c Do you support the building of a fence around the duck pond to protect the poor
defenceless ducklings from the savage dogs in the area?
d Did you know that the BSSSS administer the QCS test?
e You support the police in their endeavour to reduce the road toll, don’t you?
9C
4 A survey is conducted on the number of people living in each household in a street. The
results are shown below.
1 4 5 2 2 3 4 6 1 2 5
6 4 4 6 3 2 3 5 1 3 4
3 3 4 2 2
Put these results into a table.
9C
5 A group of Year 11 students was asked to state the number of CDs that they had purchased
in the last year. The results are shown below.
12 1 13 20 5 22 35 12 17 20
9 5 11 0 14 25 3 8 10 9
12 6 18 7 10 9 6 23 14 19
Put the results into a table using the categories 0–4, 5–9, 10–14 etc.
9C
6 Draw a column and a sector graph to represent the results to question 4.
9C
7 Draw a column and a sector graph to represent the results to question 5.
9D
8 This table shows the number of students in each year level from Years 8 to 12.

Year   No. of students
8      200
9      189
10     175
11     133
12     124

Draw two separate graphs to illustrate the following:
a The principal of the school claims a high retention rate of students in Years 11 and 12
(that is, most of the students from Year 10 continue on to complete Years 11 and 12).
b The parents claim that the retention rate of students in Years 11 and 12 is low (that is, a
large number of students leave at the end of Year 10).
9E
9 The table below shows the number of sales made each day over a month in a car yard.

Number of sales   Frequency
0                 2
1                 5
2                 12
3                 6
4                 2
5                 0
6                 1

Show this information in a frequency histogram and polygon.
9E
10 The frequency table below shows the crowds at football matches for a team over a season.

Class            Class centre   Frequency
5000–9999                       1
10 000–14 999                   5
15 000–19 999                   9
20 000–24 999                   3
25 000–29 999                   2
30 000–34 999                   2

a Copy the frequency table and complete the class centre column.
b Show the information in a frequency histogram and polygon.
9F
11 Display the following scores in a stem-and-leaf plot.
45 21 38 46 42 41 42 49 35 29 24 28
36 21 38 45 44 40 29 28 35 35 33 38
40 41 48 39 34 38 45 28 23 29 30 40
9F
12 Use the stem-and-leaf plot drawn in the previous question to find:
a the range
b the median
c the interquartile range.
9G
13 For the data set below, give a five-number summary.
24 53 91 57 29 69 29 15 84 6
9G
14 For the box-and-whisker plot drawn below:
[Figure: box-and-whisker plot on a scale from 0 to 60 in steps of 5]
a state the median
b calculate the range
c calculate the interquartile range.
9G
15 The number of babies born each day at a hospital over a year is tabulated and the five-number summary is given below.
Lower extreme = 1
Lower quartile = 8
Median = 14
Upper quartile = 16
Upper extreme = 18
Show this information in a box-and-whisker plot.
The stem-and-leaf plot below refers to questions 16 to 20. It represents the number of typing
errors recorded by a class of students in 1 page of typing.
Key: 1 | 2 = 12
     1* | 5 = 15
Stem | Leaf
   0 | 0 1 4
  0* | 6 7 8 9
   1 | 0 0 1 1 2 3 3 4
  1* | 5 6 8 9
   2 | 3
16 How many students are in the class?
17 What is the median number of errors?
18 State the value of the lower quartile.
19 Determine the interquartile range.
20 Draw a boxplot of the data.
The box-and-whisker plots below refer to questions 21 to 25.
They show the sales of two different brands of washing powder at a
supermarket each day.
[Figure: parallel box-and-whisker plots for Brand A and Brand B on a common scale from 0 to 50 in steps of 5]
21 State the range for Brand A.
22 State the interquartile range for Brand A.
23 State the range for Brand B.
24 State the interquartile range for Brand B.
25 Describe the spread of the sales for each brand of washing powder.
CHAPTER test yourself 9