Survey Research

Dr. Stefan Wuyts
Associate Professor Marketing
Koç University

1. General issues in survey research

2. Measurement scales

3. Questionnaire design

4. Sampling methods

5. Sample size determination
Research Data
◦ Secondary Data
◦ Primary Data
  ◦ Qualitative Data
  ◦ Quantitative Data
    ◦ Descriptive: Survey Data, Other Data
    ◦ Causal: Experimental Data
Errors reduce accuracy and quality of raw data

Response Biases
Strongly disagree 1 2 3 4 5 6 7 Strongly agree
◦ Acquiescence (yea-saying) and disacquiescence (nay-saying)
◦ Extreme response style (use extreme response categories)
◦ Midpoint responding (use middle-scale category)
◦ Noncontingent responding (respond randomly)

Social desirability bias

Errors in execution (e.g., problem definition)

Interviewer bias

Sample control problems
◦ Who completes the survey?
◦ Self-selection bias / Nonresponse bias
Incentives
◦ Monetary incentive (typically $5-10)
◦ Charity
◦ Lottery
◦ Offer choices
◦ Promise summary report or benchmark
Actions
◦ Inform respondents beforehand
◦ Involve (moral or hierarchical) authority
◦ Reminders

Imagine you face the following situation:
◦ You send out 1000 surveys
◦ 400 eligible respondents complete the survey
◦ You find out that 300 respondents were not eligible
◦ Of the remaining 300, 100 people refused to participate and the other 200 were not reached (returned mail)
◦ What is your response rate?

CASRO response rate formula:
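A minimal sketch of the calculation, assuming the expanded CASRO form commonly given in marketing research textbooks: completions divided by completions plus the estimated number of eligible people among refusals and non-contacts, where eligibility is estimated from the cases whose status is known. The function name and argument names are illustrative, not part of any standard library.

```python
def casro_response_rate(completed, ineligible, refused, not_reached):
    """Response rate under the assumed CASRO-style formula."""
    # Share of resolved cases (completed or screened out) that were eligible.
    eligibility_rate = completed / (completed + ineligible)
    # Estimated number of eligible people among refusals and non-contacts.
    estimated_eligible_unresolved = eligibility_rate * (refused + not_reached)
    return completed / (completed + estimated_eligible_unresolved)


# Numbers from the mail survey situation described on the previous slide.
rate = casro_response_rate(completed=400, ineligible=300, refused=100, not_reached=200)
print(f"{rate:.1%}")  # 70.0%
```

Under this assumption the answer is 70%, well above the naive 400/1000 = 40%, because ineligible cases and the estimated ineligible share of unresolved cases are excluded from the denominator.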



◦ Person-administered: an interviewer reads questions, either face-to-face or over the telephone, to the respondent and records his or her answers (in-home, mall intercept, in-office, telephone center)
◦ Computer-assisted: computer technology plays an essential role in the interview work (Computer Assisted Telephone/Personal Interviewing: CATI & CAPI)
◦ Self-administered: the respondent completes the survey on his or her own (paper and pencil by mail or drop-off, email link to online survey)
[Figure: criteria for comparing survey methods, grouped under four headings: Errors, Respondent, Data, and Economics. Criteria shown: response bias, response rate, representativeness, interviewer bias, sample control, personal contact, feedback, rapport, monitoring, stimulation, negativity, anonymity, pace, quality control, adaptability, graphics, richness, security, speed, cost, real-time, convenience, tabulation, target group, required skills.]
Measurement process:
Two essential steps:
◦ Construct development
◦ Operationalization (the process of assigning descriptors to represent the range of possible responses to a question about a particular object or construct)
Question-response format, considerations in choosing one:
◦ The nature of the property being measured
◦ Previous research studies
◦ The data collection mode
◦ The ability of the respondent
◦ The scale level desired for analysis
Scaling involves creating a continuum upon
which measured objects are located
1. Description
The unique labels or descriptors that are used to
designate each value of the scale. All scales
possess description.
Male/female; yes/no
2. Order
The relative sizes or positions of the descriptors
are known. Order is denoted by descriptors such
as greater than, less than, and equal to.
3. Distance
Ability to express absolute differences between the
scale descriptors.
1 YTL difference between 25 YTL and 26 YTL
10 degrees difference between 25 C and 35 C
4. Origin
The presence of a unique or fixed beginning or
true zero point.
Zero market share, zero YTL, zero purchases
Primary Scales
◦ Nominal Scale
◦ Ordinal Scale
◦ Interval Scale
◦ Ratio Scale

The scale affects what may or may not be said about the property being measured.
– Examples:
• If you wish to calculate an average, you must use an interval or ratio scale.
• If you have a nominal or ordinal scale, you must summarize the results with a percentage or frequency distribution.
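To make the link between scale level and permissible summary concrete, here is a small sketch using Python's standard `statistics` module. The three data lists are invented for illustration; the point is only which summary is meaningful at which level.

```python
from statistics import mean, median, mode

# Hypothetical answers from five respondents (invented for illustration).
store_type = ["supermarket", "discount", "supermarket", "kiosk", "supermarket"]  # nominal
preference_rank = [1, 3, 2, 5, 4]   # ordinal: rank each respondent gave one brand
attitude_score = [4, 6, 5, 7, 3]    # treated as interval: 7-point rating

print(mode(store_type))          # nominal: most frequent category -> supermarket
print(median(preference_rank))   # ordinal: middle rank -> 3
print(mean(attitude_score))      # interval: arithmetic mean -> 5
```

Computing `mean(preference_rank)` would run without error, but the result would not be interpretable, because the distances between ranks are unknown.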
Nominal scale
◦ Labels for identifying and classifying objects.
◦ No indication of amount/intensity of characteristic.
◦ Statistics: limited; frequency counts, percentages, mode
◦ Marketing examples: brand numbers, store types
Ordinal scale
◦ A ranking scale in which numbers are assigned to objects to indicate the relative extent to which the objects possess some characteristic.
◦ Tells about more or less, not how much more or less.
◦ Choice of numbers is irrelevant as long as the order is preserved (1 2 3 or 11 12 13).
◦ Statistics: those for the nominal scale + percentile, quartile, median
◦ Marketing examples: market position, social class
Interval scale
◦ Numerically equal distances represent equal values in the measured characteristic (distances can be compared).
◦ The location of the zero point is not fixed.
◦ Any positive linear transformation of the form y = a + bx will preserve the properties of the scale.
◦ Statistics: those for nominal and ordinal scales + arithmetic mean, standard deviation, and others (but no ratios)
◦ Marketing examples: attitudes, opinions
Ratio scale
◦ All the properties of the nominal, ordinal, and interval scales.
◦ Absolute zero point.
◦ Meaningful to compute ratios of scale values.
◦ Only proportionate transformations of the form y = bx, where b is a positive constant, are allowed.
◦ Statistics: all can be applied to ratio data, including ratios.
◦ Marketing examples: age, income, sales
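The difference between the transformations y = a + bx (interval) and y = bx (ratio) can be checked numerically. The sketch below uses temperature (Celsius to Fahrenheit) as an interval example and a currency conversion as a ratio example; the values and the factor of 100 are illustrative assumptions.

```python
def ratio(larger, smaller):
    return larger / smaller


# Interval scale: Celsius -> Fahrenheit is y = 32 + 1.8x (a nonzero intercept).
celsius = (10.0, 20.0)
fahrenheit = tuple(32 + 1.8 * c for c in celsius)
print(ratio(celsius[1], celsius[0]))        # 2.0
print(ratio(fahrenheit[1], fahrenheit[0]))  # 1.36, so "twice as warm" is not meaningful

# Ratio scale: converting sales to a smaller currency unit is y = 100x.
sales = (200.0, 400.0)
sales_small_units = tuple(100 * s for s in sales)
print(ratio(sales[1], sales[0]))                          # 2.0
print(ratio(sales_small_units[1], sales_small_units[0]))  # 2.0, the ratio survives
```

Ratios of scale values survive only when the zero point is fixed, which is why they are reserved for ratio-scaled data.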
Q1. The following is a question on a survey:
Please check the appropriate price range that indicates the amount you spend each week on gasoline for your car:
1. _____ $0.00 - $10.00
2. _____ $10.01 - $20.00
3. _____ $20.01 - $30.00
4. _____ $30.01 - $40.00
What is the level of measurement that is reflected by the data collected by this question?
Answer: Ordinal scale
Q2. The number of children in a family is an example of what kind of data?
Answer: Ratio scale
Q3. In a survey of luxury car owners, respondents were chosen from 4 states: California, New York, Illinois, and Ohio. What is the level of measurement that is reflected by the states the owners were selected from?
Answer: Nominal scale
Scaling Techniques
◦ Comparative Scales: direct comparison of stimulus objects; data are interpreted in relative terms and have only ordinal or rank order properties
  ▪ Paired Comparison
  ▪ Rank Order
  ▪ Constant Sum
◦ Noncomparative Scales: each object is scaled independently of the others; resulting data are assumed to be interval or ratio scaled
  ▪ Continuous Rating Scales
  ▪ Itemized Rating Scales: Likert, Semantic Differential
Paired comparison: comparison of two objects; ordinal data.
Ten pairs of shampoo brands: indicate which shampoo in the pair you prefer for personal use.
[Table: 5 × 5 matrix of 0/1 paired-comparison outcomes for Jhirmack, Finesse, Vidal Sassoon, Head & Shoulders, and Pert, with a final row giving the number of times each brand was preferred (counts range from 0 to 4).]
Under assumption of transitivity, it is possible to convert
paired comparison data to a rank order.
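The conversion to a rank order can be sketched as follows: count how often each object wins its pairs, then sort by that count. The four brand labels and the preferences are hypothetical and were chosen to be transitive; they are not the shampoo data from the slide.

```python
from itertools import combinations

brands = ["A", "B", "C", "D"]

# winner[(x, y)] is the brand one respondent preferred within that pair
# (hypothetical, transitive preferences: D > A > B > C).
winner = {
    ("A", "B"): "A", ("A", "C"): "A", ("A", "D"): "D",
    ("B", "C"): "B", ("B", "D"): "D", ("C", "D"): "D",
}

# Number of times each brand was preferred across all pairs.
wins = {brand: 0 for brand in brands}
for pair in combinations(brands, 2):
    wins[winner[pair]] += 1

rank_order = sorted(brands, key=lambda brand: wins[brand], reverse=True)
print(wins)        # {'A': 2, 'B': 1, 'C': 0, 'D': 3}
print(rank_order)  # ['D', 'A', 'B', 'C']
```

With n objects there are n(n-1)/2 pairs, so the task grows quickly; if preferences are not transitive, tied win counts can appear and the rank order is no longer unique.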

Rank order scaling: rank objects according to a criterion; ordinal data.

Respondent may dislike brand 1 in absolute sense!

Constant sum scaling: allocate a constant sum of units among the objects; an object that is twice as important receives twice as many points. Difficult for respondents!

Respondent may dislike brand 1 in absolute sense!
Comparative scales
Advantages:
◦ Sensitive to small differences
◦ Same reference points for all respondents
◦ Easy
◦ Less halo effects
Drawbacks:
◦ Ordinal data
◦ Restricted to stimulus objects, not generalizable
Continuous rating scale: the respondent places a mark at the appropriate position on a line that runs from one extreme to the other.
Leads to interval data.

Itemized rating scale: a number or brief description for each category

Likert: agreement or disagreement

Reverse-code negative items

Respondent may dislike brand 1 in absolute sense!
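Reverse-coding a negatively worded item means mirroring its score around the scale midpoint, so that a high number always signals a favorable response. A minimal sketch, assuming a 1 to 7 scale by default (the function name and defaults are illustrative):

```python
def reverse_code(score, scale_min=1, scale_max=7):
    """Mirror a score on a rating scale: min becomes max and vice versa."""
    return scale_min + scale_max - score


# Hypothetical answers to a negatively worded item on a 7-point Likert scale.
raw = [1, 2, 4, 7]
recoded = [reverse_code(x) for x in raw]
print(recoded)  # [7, 6, 4, 1]
```

The midpoint (4 on a 7-point scale) is unchanged, and applying the function twice returns the original score.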

Semantic differential: 7-point rating scale with bipolar labels (positive and negative adjectives), scored -3 to +3 or 1 to 7

Respondent may dislike brand 1 in absolute sense!
A questionnaire
◦ Is a formalized set of questions for obtaining information from respondents.
◦ Translates the information needed into a set of specific questions that respondents can and will answer.
◦ Must motivate the respondent to be involved in, cooperate with, and complete the interview.
◦ Should minimize response error.
Pretesting the questionnaire
Focus on content, wording, order, layout,
difficulty, but also respondent’s reactions to
the survey (via personal interviews).
Important: take respondents from same
population as the final survey for pretesting
the survey instrument.
The Funnel Approach to Ordering Questions
Broad or General Questions
Narrow or Specific Questions
Q: What’s wrong?
“Do you think the distribution of soft drinks is
adequate?”
A: Simplify language:
“Do you think soft drinks are readily available
when you want to buy them?”
Q: What’s wrong?
“Do you think Coca-Cola is a tasty and
refreshing soft drink?”
A: Double-barreled question: two or more
questions are combined into one. Two distinct
questions :
“Do you think Coca-Cola is a tasty soft drink?” and
“Do you think Coca-Cola is a refreshing soft drink?”
Q: What’s wrong?
“How many liters of soft drinks did you consume during the last four weeks?”
A: Does the respondent remember that? Alternative:
How many liters of soft drinks do you consume in a typical week?
1. ___ Less than 1 liter per week
2. ___ 1 to 3 liters per week
3. ___ 4 to 6 liters per week
4. ___ 7 or more liters per week
Q: What’s wrong?
Please consider the last technology innovation
project that you were involved in […]
“At the beginning of this technology innovation
project, how well did you consider alternative
technological options?”
A: Does the respondent remember that?
Hindsight bias colors the responses!
Q: What’s wrong?
“Which brand of shampoo do you use?”
A: Define the issue in terms of who, what, when,
and where:
“Which brand or brands of shampoo have you
personally used at home during the last month?
In case of more than one brand, please list all the
brands that apply”
The W's: Defining the Question
◦ Who (the respondent): It is not clear whether this question relates to the individual respondent or the respondent's total household.
◦ What (the brand of shampoo): It is unclear how the respondent is to answer this question if more than one brand is used.
◦ When (unclear): The time frame is not specified in this question. The respondent could interpret it as meaning the shampoo used this morning, this week, or over the past year.
◦ Where (not specified): At home, at the gym, on the road?
Q: What’s wrong?
In a typical month, how often do you shop in department stores?
_____ Never
_____ Occasionally
_____ Sometimes
_____ Often
_____ Regularly
A: The scale is unnecessarily ambiguous.
Better:
In a typical month, how often do you shop in department stores?
_____ Less than once
_____ 1 or 2 times
_____ 3 or 4 times
_____ More than 4 times
Q: What’s wrong?
“Do you think that patriotic Americans should
buy imported automobiles when that would
put American labor out of work?”
A: The question clues the respondent to what
the answer should be. Better:
“Do you think that Americans should buy
imported automobiles?”
Q: What’s wrong?
“What do you think about the Philips
Streamium?”
A: First need a filter question to measure
familiarity and past experience; or include a
don’t know option.
Q: What’s wrong?
“Describe the atmosphere of a department
store”
A: You will need to
help the respondent,
for example by
showing pictures or by
providing descriptions
to help them articulate
their responses.
Q: What’s wrong?
“Please list all departments from which you
purchased merchandise on your most recent
shopping trip to a department store”
A: too much effort; simplify the task:
Sometimes respondents are not willing to answer because the topic of the question is sensitive, embarrassing, related to prestige…
To overcome unwillingness to answer:
◦ Make the question appropriate given the context and legitimate it: explain why you ask the question;
◦ Move sensitive questions toward the end of the questionnaire;
◦ If the question is about embarrassing behavior, underscore that such behavior is common;
◦ Use the third-person technique.
Q: What’s wrong?
“Do you like to fly when traveling short
distances?”
A: Alternative is not explicitly expressed.
Better:
“Do you like to fly when traveling short
distances, or would you rather drive?”
Q: What’s wrong?
“Are you in favor of a balanced budget?”
A: Questions should not be worded so that the
answer is dependent upon implicit assumptions
about what will happen as a consequence.
Better:
“Are you in favor of a balanced budget if it
would result in an increase in the personal
income tax?”
Q: What’s wrong?
“What is the annual per capita expenditure on
groceries in your household?”
A: Less difficult to assess when translated into
two different questions :
“What is the monthly expenditure on groceries in
your household?”
&
“How many members are there in your
household?”
1. Use ordinary words
2. Are several questions needed?
3. Can the respondent remember?
4. Define the issue
5. Use unambiguous words
6. Avoid leading or biasing questions
7. Is the respondent informed?
8. Can the respondent articulate?
9. Is the respondent willing to answer?
10. Avoid implicit alternatives
11. Avoid implicit assumptions
12. Avoid generalizations and estimates
Define the Population
Determine the Sampling Frame
Select Sampling Technique(s)
Determine the Sample Size
Execute the Sampling Process
◦ Element: 18-year-old females
◦ Sampling unit: households with 18-year-old females
◦ Extent: domestic United States
◦ Time frame: upcoming summer
Sampling frame error: the sampling frame does not match the target population
◦ Target population: single parent households in Chicago
◦ Sampling frame: list supplied by a commercial vendor


Probability samples: members of the population have a known, nonzero probability of being selected into the sample (not necessarily equal for all members)
Non-probability samples: the probability of selecting members from the population into the sample is unknown
Considerations: nature of research, statistical considerations, economic considerations
Probability Sampling Techniques
◦ Simple Random Sampling
◦ Systematic Sampling
◦ Stratified Sampling
◦ Cluster Sampling
Nonprobability Sampling Techniques
◦ Convenience Sampling
◦ Judgmental Sampling
◦ Quota Sampling
◦ Snowball Sampling

With nonprobability sampling methods, selection is not based on a known chance of selection.
◦ Convenience sampling
◦ Judgment sampling
◦ Quota sampling
◦ Snowball sampling
Still used very often. Why?
◦ Decision makers want fast, relatively inexpensive answers: nonprobability samples are faster and less costly than probability samples.
◦ Nature of research, statistical considerations
2.1 Convenience samples: drawn at convenience of interviewer
◦ Error occurs in the form of members of the population who
are infrequent or nonusers of that location
2.2 Judgment samples: require judgment or “educated guess” as
to who should represent the population
◦ Subjectivity enters in here, and certain members will have a
smaller chance of selection than others
2.3 Quota samples: use a specific quota of certain types of
individuals to be interviewed
◦ Often used to ensure that convenience samples will have
desired proportion of different respondent classes
2.4 Snowball samples: require respondents to provide the names
of additional respondents
◦ Members of the population who are less known, disliked, or
whose opinions conflict with the respondent have a low
probability of being selected

Simple random sampling: the probability of being
selected into the sample is “known” and equal for all
members of the population
◦ E.g., Blind Draw Method
◦ Random Numbers Method
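The random numbers method can be sketched with Python's standard `random` module, which plays the role of a random number table. The sampling frame of 200 members and the sample size of 20 are hypothetical.

```python
import random

# Hypothetical sampling frame: every population member has a unique designation.
population = [f"member_{i:03d}" for i in range(1, 201)]

rng = random.Random(42)                 # fixed seed only so the draw can be repeated
sample = rng.sample(population, k=20)   # draw without replacement, equal chance for all

print(len(sample), len(set(sample)))    # 20 20 (no member is drawn twice)
```

Each member has the same selection probability, here 20/200 = 10%, which is the "known and equal" property named above.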

Advantage:
◦ Known and equal chance of selection

Disadvantages:
◦ Complete accounting of population needed
◦ Cumbersome to provide unique designations to
every population member
◦ Sample might not be representative

Systematic sampling: way to select a random sample
from a directory or list that is much more efficient than
simple random sampling
◦ Skip interval=population list size/sample size
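A minimal sketch of systematic sampling with a random starting point, assuming for simplicity that the list size is an exact multiple of the sample size. The frame of 1000 entries and the function name are illustrative.

```python
import random


def systematic_sample(frame, sample_size, rng):
    """Take every k-th entry of the list after a random start."""
    skip = len(frame) // sample_size   # skip interval = list size / sample size
    start = rng.randrange(skip)        # random start within the first interval
    return [frame[start + i * skip] for i in range(sample_size)]


frame = list(range(1, 1001))           # hypothetical directory with 1000 entries
picked = systematic_sample(frame, 50, random.Random(1))

print(len(picked))                     # 50
print(picked[1] - picked[0])           # 20, the skip interval
```

Only one random number is needed (the start), which is the efficiency gain over simple random sampling; the price is the sensitivity to cyclical patterns in the list.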

Advantages:
◦ Approximate known and equal chance of
selection…it is a probability sample plan
◦ Better than SRS when sampling frame is organized
in relevant and systematic way
◦ Efficiency…do not need to designate every
population member (as opposed to SRS)

Disadvantages:
◦ Small loss in sampling precision
◦ Worse than SRS in case sampling frame is cyclical in
nature.

When the researcher knows the answers to the research question are likely to vary by subgroups: identify strata that are internally homogeneous and that differ from other strata on relevant variables.
◦ Question: “To what extent do you value your college degree?”
  ▪ We would expect more agreement (less variance) as classification goes up. That is, seniors should pretty much agree that there is value. Freshmen will have less agreement.
We expect this question to be answered differently depending
on student classification. Not only are the means different,
variance is less as classification goes up. Seniors agree more
than Freshmen.

Stratified sampling: method in which the population is
separated into different strata and a sample is taken
from each stratum
◦ Proportionate stratified sample
◦ Disproportionate stratified sample
◦ Stratified sampling allows the researcher to allocate
more sample size to strata with more variance and
less sample size to strata with less variance. Thus,
for the same sample size, more precision is
achieved.
◦ This is normally accomplished by disproportionate
sampling. Seniors would be sampled LESS than their
proportionate share of the population and freshmen
would be sampled more.
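One standard way to implement this kind of disproportionate allocation is Neyman allocation, in which each stratum's share of the sample is proportional to its size times its standard deviation. The slides do not name a specific rule, so this is an assumption, and the stratum sizes and standard deviations below are invented.

```python
def neyman_allocation(strata, total_n):
    """Allocate total_n across strata in proportion to size * standard deviation.

    strata maps a stratum name to (population size, standard deviation).
    Rounded results may need a manual adjustment to sum exactly to total_n.
    """
    weights = {name: size * sd for name, (size, sd) in strata.items()}
    total_weight = sum(weights.values())
    return {name: round(total_n * w / total_weight) for name, w in weights.items()}


# Hypothetical strata of equal size; freshmen answers vary twice as much.
strata = {"freshmen": (400, 2.0), "seniors": (400, 1.0)}
allocation = neyman_allocation(strata, total_n=120)
print(allocation)  # {'freshmen': 80, 'seniors': 40}
```

A proportionate design would give each stratum 60 respondents; here seniors get fewer than their proportionate share and freshmen more, as the slide describes.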

Advantage:
◦ More accurate overall sample of skewed population

Disadvantage:
◦ More complex sampling plan requiring different
sample size for each stratum

Cluster sampling: method in which the population is
divided into groups, any of which can be considered a
representative sample (so: internally heterogeneous, no
differences between clusters). E.g. area sampling.



In cluster sampling the population is divided into subgroups, called “clusters.”
Each cluster should represent the population.
Area sampling is a form of cluster sampling: the geographic area is divided into clusters.
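A sketch of one-stage area sampling: clusters are drawn at random and every member of a chosen cluster enters the sample. The neighborhood names and household IDs are hypothetical.

```python
import random

# Hypothetical area clusters: neighborhood -> household IDs.
clusters = {
    "North": ["N1", "N2", "N3"],
    "East": ["E1", "E2"],
    "South": ["S1", "S2", "S3", "S4"],
    "West": ["W1", "W2", "W3"],
}

rng = random.Random(7)                               # fixed seed for repeatability
chosen = rng.sample(sorted(clusters), k=2)           # randomly select two clusters
sample = [household for name in chosen for household in clusters[name]]

print(chosen)
print(sample)
```

Only the chosen areas need to be visited, which is where the cost saving comes from; a two-stage variant would draw a random subsample of households within each chosen cluster.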

Advantage:
◦ Economic efficiency…faster and less
expensive than SRS

Disadvantage:
◦ Cluster specification error…the more alike the clusters are to one another (each one internally heterogeneous, like the population), the more precise the sample results
Imagine the following information flows among consumers:
[Table: 20 × 20 matrix of 0/1 entries recording information flows among Kelly, Pete, Sandy, Ian, Sarah, Bas, Lynn, Don, Gary, Ann, Jack, John, Matt, Bill, Jane, Erik, Dawn, Dave, Ruth, and Mark; a 1 marks an information flow between the row and column consumer.]
Who is in a favorable position in this information
exchange network?
[Figures: two network diagrams of the information flows among the same 20 consumers.]
The determination of sample size is a function of both qualitative and quantitative factors.
Qualitative factors in determining sample size:
◦ Related to the analysis:
  ▪ the nature of the research
  ▪ the nature of analysis
  ▪ the number of variables
◦ Related to the firm:
  ▪ the importance of the decision
  ▪ resource constraints

What is the required sample size?
◦ Management wants to know customers’ level
of satisfaction with their service. They propose
conducting a survey and asking for satisfaction
on a scale from 1 to 10.
◦ Management wants to be 99% confident in the
results and they do not want the allowed error
to be more than ±.5 scale points.
◦ What is n?

Quantitative logic relies on three
determinants of sample size:
◦ Variability
◦ Accuracy
◦ Confidence
Larger variability: need for larger sample


Accuracy refers to how close a random sample’s statistic is to the true population value it represents.
Important points:
◦ Sample size is not related to representativeness (the sampling technique, on the other hand, is related to representativeness)
◦ Sample size is related to accuracy
Larger desired level of precision: need for larger sample
[Figure: error (vertical axis) plotted against sample size (horizontal axis).]

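The Confidence Interval Method of Determining Sample Size: a confidence interval represents an area under the normal distribution (e.g., 95% confidence interval)
[Figure: normal curve centered on $\bar{X}$, with an area of 0.475 on each side between the lower limit $\bar{X}_L$ and the upper limit $\bar{X}_U$.]
For higher confidence: need for larger sample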



The Confidence Interval Method of
Determining Sample Size is based upon the
Central Limit Theorem…
Central limit theorem: a theory that holds that
values (such as mean attitude levels) taken
from repeated (large) samples of a population
are distributed according to a normal curve
More formally: as the sample size increases,
the distribution of the sample mean of a
randomly selected sample approaches the
normal distribution
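The theorem can be illustrated with a short simulation: draw many samples from a clearly non-normal population and look at the distribution of their means. The exponential population (mean 1, standard deviation 1), the sample size of 50, and the 2000 repetitions are arbitrary choices made for this sketch.

```python
import random
from statistics import mean, stdev

rng = random.Random(0)     # fixed seed so the simulation can be repeated
n, draws = 50, 2000

# Each entry is the mean of one random sample of size n from a skewed population.
sample_means = [
    mean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(draws)
]

print(round(mean(sample_means), 2))    # close to the population mean of 1
print(round(stdev(sample_means), 2))   # close to sigma / sqrt(n)
print(round(1 / n ** 0.5, 2))          # 0.14
```

Although individual values are strongly skewed, the sample means cluster symmetrically around the population mean, with a spread of about σ/√n.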
Definitions and symbols:



A parameter is a summary description of a fixed
characteristic or measure of the target population.
A parameter denotes the true value which would be obtained
if a census rather than a sample was undertaken.
A statistic is a summary description of a characteristic or
measure of the sample. The sample statistic is used as an
estimate of the population parameter.
Random sampling error: The error when the sample selected
is an imperfect representation of the population of interest.



Accuracy or precision level: When estimating a population
parameter by using a sample statistic, the precision level is
the desired size of the estimating interval. This is the
maximum permissible difference between the sample statistic
and the population parameter.
Confidence interval: The confidence interval is the range into
which the true population parameter will fall, assuming a
given level of confidence.
Confidence level: The confidence level is the probability that a
confidence interval will include the population parameter.
____________________________________________________________
Variable                      Population              Sample
____________________________________________________________
Mean                          $\mu$                   $\bar{X}$
Variance                      $\sigma^2$              $s^2$
Standard deviation            $\sigma$                $s$
Size                          $N$                     $n$
Standard error of the mean    $\sigma_{\bar{x}}$      $S_{\bar{x}}$
Standardized variate (z)      $(X - \mu)/\sigma$      $(X - \bar{X})/s$
____________________________________________________________



◦ The sampling distribution of the mean is a normal distribution;
◦ The mean of the sampling distribution of the mean = population parameter μ;
◦ Standard deviation of the sampling distribution = standard error of the mean:
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$
z-value:
$$z = \frac{\bar{X} - \mu}{\sigma_{\bar{x}}}$$
$$n = \frac{\sigma^2 z^2}{D^2}$$
◦ σ reflects variability
◦ z reflects confidence
◦ D reflects precision
Standard Normal Probabilities
[Figure: standard normal distribution, f(z) plotted for z from -5 to 5, with the area between 0 and 1.56 marked.]
Look in row labeled 1.5 and column labeled .06 to find P(0 ≤ z ≤ 1.56) = .4406

z     .00     .01     .02     .03     .04     .05     .06     .07     .08     .09
0.0   0.0000  0.0040  0.0080  0.0120  0.0160  0.0199  0.0239  0.0279  0.0319  0.0359
0.1   0.0398  0.0438  0.0478  0.0517  0.0557  0.0596  0.0636  0.0675  0.0714  0.0753
0.2   0.0793  0.0832  0.0871  0.0910  0.0948  0.0987  0.1026  0.1064  0.1103  0.1141
0.3   0.1179  0.1217  0.1255  0.1293  0.1331  0.1368  0.1406  0.1443  0.1480  0.1517
0.4   0.1554  0.1591  0.1628  0.1664  0.1700  0.1736  0.1772  0.1808  0.1844  0.1879
0.5   0.1915  0.1950  0.1985  0.2019  0.2054  0.2088  0.2123  0.2157  0.2190  0.2224
0.6   0.2257  0.2291  0.2324  0.2357  0.2389  0.2422  0.2454  0.2486  0.2517  0.2549
0.7   0.2580  0.2611  0.2642  0.2673  0.2704  0.2734  0.2764  0.2794  0.2823  0.2852
0.8   0.2881  0.2910  0.2939  0.2967  0.2995  0.3023  0.3051  0.3078  0.3106  0.3133
0.9   0.3159  0.3186  0.3212  0.3238  0.3264  0.3289  0.3315  0.3340  0.3365  0.3389
1.0   0.3413  0.3438  0.3461  0.3485  0.3508  0.3531  0.3554  0.3577  0.3599  0.3621
1.1   0.3643  0.3665  0.3686  0.3708  0.3729  0.3749  0.3770  0.3790  0.3810  0.3830
1.2   0.3849  0.3869  0.3888  0.3907  0.3925  0.3944  0.3962  0.3980  0.3997  0.4015
1.3   0.4032  0.4049  0.4066  0.4082  0.4099  0.4115  0.4131  0.4147  0.4162  0.4177
1.4   0.4192  0.4207  0.4222  0.4236  0.4251  0.4265  0.4279  0.4292  0.4306  0.4319
1.5   0.4332  0.4345  0.4357  0.4370  0.4382  0.4394  0.4406  0.4418  0.4429  0.4441
1.6   0.4452  0.4463  0.4474  0.4484  0.4495  0.4505  0.4515  0.4525  0.4535  0.4545
1.7   0.4554  0.4564  0.4573  0.4582  0.4591  0.4599  0.4608  0.4616  0.4625  0.4633
1.8   0.4641  0.4649  0.4656  0.4664  0.4671  0.4678  0.4686  0.4693  0.4699  0.4706
1.9   0.4713  0.4719  0.4726  0.4732  0.4738  0.4744  0.4750  0.4756  0.4761  0.4767
2.0   0.4772  0.4778  0.4783  0.4788  0.4793  0.4798  0.4803  0.4808  0.4812  0.4817
2.1   0.4821  0.4826  0.4830  0.4834  0.4838  0.4842  0.4846  0.4850  0.4854  0.4857
2.2   0.4861  0.4864  0.4868  0.4871  0.4875  0.4878  0.4881  0.4884  0.4887  0.4890
2.3   0.4893  0.4896  0.4898  0.4901  0.4904  0.4906  0.4909  0.4911  0.4913  0.4916
2.4   0.4918  0.4920  0.4922  0.4925  0.4927  0.4929  0.4931  0.4932  0.4934  0.4936
2.5   0.4938  0.4940  0.4941  0.4943  0.4945  0.4946  0.4948  0.4949  0.4951  0.4952
2.6   0.4953  0.4955  0.4956  0.4957  0.4959  0.4960  0.4961  0.4962  0.4963  0.4964
2.7   0.4965  0.4966  0.4967  0.4968  0.4969  0.4970  0.4971  0.4972  0.4973  0.4974
2.8   0.4974  0.4975  0.4976  0.4977  0.4977  0.4978  0.4979  0.4979  0.4980  0.4981
2.9   0.4981  0.4982  0.4982  0.4983  0.4984  0.4984  0.4985  0.4985  0.4986  0.4986
3.0   0.4987  0.4987  0.4987  0.4988  0.4988  0.4989  0.4989  0.4989  0.4990  0.4990
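The same probabilities can be computed instead of looked up. This sketch uses `NormalDist` from Python's standard `statistics` module (available from Python 3.8); the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()               # mean 0, standard deviation 1

# Area between 0 and 1.56: cumulative probability minus the lower half (0.5).
area = std_normal.cdf(1.56) - 0.5
print(round(area, 4))                   # 0.4406

# Going the other way: z-values that leave 2.5% and 0.5% in each tail.
z_95 = std_normal.inv_cdf(0.975)
z_99 = std_normal.inv_cdf(0.995)
print(round(z_95, 2), round(z_99, 2))   # 1.96 2.58
```

The table gives the area from 0 to z, so a two-sided 95% confidence level corresponds to an area of 0.475 on each side, which is the z = 1.96 entry.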
Thus:
1. Specify level of precision
2. Specify level of confidence
3. Determine z (1.96 for 95%; 2.58 for
99%, see Table in book appendix)
4. Determine σ
5. Determine n
6. Once sample is drawn, s can be used
to approximate σ, leading to new
confidence interval (or different
precision given particular level of
confidence)

What is the required sample size?
◦ Management wants to know customers’ level
of satisfaction with their service. They propose
conducting a survey and asking for satisfaction
on a scale from 1 to 10. (range = 9)
◦ Management wants to be 99% confident in the
results and they do not want the allowed error
to be more than ±.5 scale points.
◦ What is n?
$$n = \frac{\sigma^2 z^2}{D^2}$$
◦ σ = 9/6 or 1.5 (range divided by 6)
◦ z = 2.58 (99% confidence)
◦ D = .5 scale points
◦ n = 60
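The arithmetic can be checked with a few lines of code. This is a sketch of the formula n = σ²z²/D² with the values from the satisfaction example; the function name is illustrative, and the result is rounded up because a sample cannot contain a fraction of a respondent.

```python
import math


def required_sample_size(sigma, z, d):
    """n = sigma^2 * z^2 / D^2, rounded up to the next whole respondent."""
    return math.ceil((sigma ** 2) * (z ** 2) / (d ** 2))


# sigma estimated as range / 6 = 9 / 6, 99% confidence, precision of 0.5 points.
n = required_sample_size(sigma=9 / 6, z=2.58, d=0.5)
print(n)  # 60 (the unrounded value is about 59.9)
```

Halving the allowed error to 0.25 scale points would quadruple the required sample, since D enters the formula squared.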

What does this mean?
◦ After the survey, management may make the
following statement: (assume satisfaction mean
is 7.3)
◦ “Our most likely estimate of the level of
consumer satisfaction is 7.3 on a 10-point
scale. In addition, if s equals 1.5, we are 99%
confident that the true level of satisfaction in
our consumer population falls between 6.8 and
7.8 on a 10-point scale”
• Note that if s ≠ 1.5 we need to recalculate the precision of the results, using the same formula. For example, with s = 1.6:
$$D = \frac{\sigma z}{\sqrt{n}} = \frac{1.6 \times 2.58}{\sqrt{60}} = 0.533$$
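The recalculation is the sample size formula solved for D. A small sketch with the numbers used here (the function name is illustrative):

```python
import math


def achieved_precision(s, z, n):
    """D = s * z / sqrt(n): half-width of the confidence interval."""
    return s * z / math.sqrt(n)


print(round(achieved_precision(s=1.6, z=2.58, n=60), 3))  # 0.533
print(round(achieved_precision(s=1.5, z=2.58, n=60), 3))  # 0.5
```

With s = 1.6 instead of the assumed 1.5, the 99% confidence interval around a mean of 7.3 widens slightly, to roughly 6.77 through 7.83.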
Make sure you reach the required size:
Incidence rate refers to the rate of occurrence or the percentage of persons eligible to participate in the study.
In general, if there are c qualifying factors with an incidence of Q1, Q2, Q3, ..., Qc, each expressed as a proportion,
$$\text{Incidence rate} = Q_1 \times Q_2 \times Q_3 \times \cdots \times Q_c$$
$$\text{Initial sample size} = \frac{\text{Final sample size}}{\text{Incidence rate} \times \text{Completion rate}}$$
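A short sketch of this adjustment. The two qualifying factors (each with 50% incidence) and the 80% completion rate are invented for illustration; only the final sample size of 60 comes from the earlier example.

```python
import math


def initial_sample_size(final_n, incidences, completion_rate):
    """Number of people to contact so that final_n completed surveys remain."""
    incidence_rate = math.prod(incidences)          # Q1 x Q2 x ... x Qc
    needed = final_n / (incidence_rate * completion_rate)
    # Round to guard against floating-point noise before rounding up.
    return math.ceil(round(needed, 9))


contacts = initial_sample_size(final_n=60, incidences=[0.5, 0.5], completion_rate=0.8)
print(contacts)  # 300
```

With only a quarter of the contacted people qualifying and four in five of those completing, 300 initial contacts are needed to end up with 60 completed surveys.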