
Lecture Notes for
BUSINESS STATISTICS - BMGT 571
Chapters 1 through 6
Professor Ahmadi, Ph.D.
Department of Management
Revised May 2005
Chapter 1
Glossary of Terms:

Statistics

Data

Data Set

Elements

Variable

Observations

Sample and Population

Descriptive Statistics

Statistical Inference

Qualitative and Quantitative Data
Scales of Measurement:

Nominal Scale

Ordinal Scale

Interval Scale

Ratio Scale
Professor Ahmadi’s Lecture Notes
Page 2
Chapter 2
Summarizing Quantitative Data
Daily earnings of a sample of twelve individuals are shown below:
100, 126, 138, 142, 148, 150, 168, 182, 191, 193, 195, 199
Summarize the above data by constructing:
a. a frequency distribution
b. a cumulative frequency distribution
c. a relative frequency distribution
d. a cumulative relative frequency distribution
e. a histogram
f. an ogive
Class        Frequency    Cumulative    Relative     Cumulative
                          Frequency     Frequency    Relative Frequency
100 - 119
120 - 139
140 - 159
160 - 179
180 - 199
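The tallies asked for in parts a through d can be sketched in Python as follows. This is only an illustration, not part of the original notes; the variable names are assumptions, and the class limits are the five classes listed in the table.

```python
from itertools import accumulate

# Daily earnings of the twelve individuals in the sample
earnings = [100, 126, 138, 142, 148, 150, 168, 182, 191, 193, 195, 199]

# Class limits (inclusive) taken from the table
classes = [(100, 119), (120, 139), (140, 159), (160, 179), (180, 199)]

n = len(earnings)
freq = [sum(lo <= x <= hi for x in earnings) for lo, hi in classes]
cum_freq = list(accumulate(freq))          # running total of frequencies
rel_freq = [f / n for f in freq]           # share of the sample in each class
cum_rel_freq = [c / n for c in cum_freq]   # running total of the shares

for (lo, hi), f, cf, rf, crf in zip(classes, freq, cum_freq, rel_freq, cum_rel_freq):
    print(f"{lo} - {hi}  {f:>2}  {cf:>2}  {rf:.4f}  {crf:.4f}")
```

The histogram (part e) plots `freq` against the classes, and the ogive (part f) plots `cum_freq` or `cum_rel_freq` against the upper class limits.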
DOT PLOT
In a recent campaign, many airlines reduced their summer fares in order to gain a larger share of
the market. The following data represent the prices of round-trip tickets from Atlanta to Boston
for a sample of nine airlines:
120, 140, 140, 160, 160, 160, 160, 180, 180
Construct a dot plot for the above data.
STEM-AND-LEAF DISPLAY
The test scores of 14 individuals on their first statistics examination are shown below:
95, 75, 87, 63, 52, 92, 43, 81, 77, 83, 84, 91, 78, 88
a. Construct a stem-and-leaf display for these data.
b. What does the above stem-and-leaf show?
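One possible way to build the display for part a is sketched below. It assumes the tens digit is the stem and the units digit is the leaf; the names `scores` and `stems` are illustrative only.

```python
from collections import defaultdict

# Test scores of the 14 individuals
scores = [95, 75, 87, 63, 52, 92, 43, 81, 77, 83, 84, 91, 78, 88]

# Group the units digits (leaves) under their tens digit (stem)
stems = defaultdict(list)
for score in sorted(scores):
    stems[score // 10].append(score % 10)

for stem in sorted(stems):
    leaves = " ".join(str(leaf) for leaf in stems[stem])
    print(f"{stem} | {leaves}")
```

Reading the rows from top to bottom shows where the scores cluster, which is the kind of observation part b asks for.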
CROSSTABULATION
The following is a crosstabulation of starting salaries (in $1,000's) of a sample of business school
graduates by their gender.
                     Starting Salary
Gender    Less than 30    30 up to 35    35 and more    Total
Female         12              84             24          120
Male           20              48             12           80
Total          32             132             36          200
a. What general comments can be made about the distribution of starting salaries and the gender
of the individuals in the sample?
b. Compute row percentages and comment on the relationship between starting salaries and
gender.
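The row percentages in part b divide each cell by its row total. A minimal sketch, with an assumed dictionary layout for the crosstabulation:

```python
# Counts per salary class: less than 30, 30 up to 35, 35 and more (in $1,000's)
salary_classes = ["Less than 30", "30 up to 35", "35 and more"]
table = {
    "Female": [12, 84, 24],
    "Male": [20, 48, 12],
}

# Row percentage = cell count / row total * 100
row_pct = {
    gender: [100 * count / sum(counts) for count in counts]
    for gender, counts in table.items()
}

for gender, percentages in row_pct.items():
    cells = ", ".join(f"{label}: {pct:.1f}%" for label, pct in zip(salary_classes, percentages))
    print(f"{gender}: {cells}")
```

Comparing the two rows side by side is what supports a comment on the relationship between salary and gender.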
SCATTER DIAGRAM
The average grades of 8 students in Professor Ahmadi’s statistics class and the number of
absences they had during the semester are shown below:
Student    Number of Absences (x)    Average Grade (y)
1          1                         94
2          2                         78
3          2                         70
4          1                         88
5          3                         68
6          4                         40
7          8                         30
8          3                         60
Develop a scatter diagram for the relationship between the number of absences (x) and the
average grade (y).
Chapter 3 Formulas
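Ungrouped Data

Mean
  Sample:      $\bar{X} = \frac{\sum X_i}{n}$
               where n = sample size
  Population:  $\mu = \frac{\sum X_i}{N}$
               where N = size of population

Interquartile Range
  Sample:      IQR = Q3 - Q1
               where: Q3 = third quartile (i.e., 75th percentile)
                      Q1 = first quartile (i.e., 25th percentile)
  Population:  (Same as for sample)

Variance
  Sample:      $S^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1}$
               or: $S^2 = \frac{\sum X_i^2 - n\bar{X}^2}{n - 1}$
  Population:  $\sigma^2 = \frac{\sum (X_i - \mu)^2}{N}$
               or: $\sigma^2 = \frac{\sum X_i^2 - N\mu^2}{N}$

Standard Deviation
  Sample:      $S = \sqrt{S^2}$
  Population:  $\sigma = \sqrt{\sigma^2}$

Coefficient of Variation (C.V.)
  Sample:      $\text{C.V.} = \left(\frac{S}{\bar{X}}\right) \times 100$
  Population:  $\text{C.V.} = \left(\frac{\sigma}{\mu}\right) \times 100$

Covariance
  Sample:      $S_{XY} = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}$
  Population:  $\sigma_{XY} = \frac{\sum (X_i - \mu_X)(Y_i - \mu_Y)}{N}$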
Pearson Product Moment Correlation Coefficient
  Sample:      $r_{XY} = \frac{S_{XY}}{S_X S_Y}$
               where
               $r_{XY}$ = Sample correlation coefficient
               $S_{XY}$ = Sample covariance
               $S_X$ = Sample standard deviation of X
               $S_Y$ = Sample standard deviation of Y
  Population:  $\rho_{XY} = \frac{\sigma_{XY}}{\sigma_X \sigma_Y}$
               where
               $\rho_{XY}$ = Population correlation coefficient
               $\sigma_{XY}$ = Population covariance
               $\sigma_X$ = Population standard deviation of X
               $\sigma_Y$ = Population standard deviation of Y
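Weighted Mean
  Sample:      $\bar{X} = \frac{\sum w_i X_i}{\sum w_i}$
  Population:  $\mu = \frac{\sum w_i X_i}{\sum w_i}$
               where
               $X_i$ = data value i
               $w_i$ = weight for data value i

Grouped Data

Mean
  Sample:      $\bar{X} = \frac{\sum f_i M_i}{n}$
  Population:  $\mu = \frac{\sum f_i M_i}{N}$
               where
               $f_i$ = frequency of class i
               $M_i$ = midpoint of class i

Variance
  Sample:      $S^2 = \frac{\sum f_i (M_i - \bar{X})^2}{n - 1}$
               or: $S^2 = \frac{\sum f_i M_i^2 - n\bar{X}^2}{n - 1}$
  Population:  $\sigma^2 = \frac{\sum f_i (M_i - \mu)^2}{N}$
               or: $\sigma^2 = \frac{\sum f_i M_i^2 - N\mu^2}{N}$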
Chapter 3
Measures of Location & Dispersion (Ungrouped Data)
Hourly earnings (in dollars) of a sample of eight employees of Ahmadi, Inc. are shown below:
Individual    Earning (X)
1             12
2             15
3             15
4             17
5             18
6             19
7             22
8             26
I. Measures of location
a. Compute the mean and explain and show its properties.
b. Determine the median and explain its properties.
c. Determine the 70th percentile.
d. Determine the 25th percentile.
e. Find the mode.
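The measures of location above can be checked with a short sketch. The `percentile` helper is hypothetical and assumes the common textbook position rule i = (p/100) n: when i is not an integer, round up to the next position; when i is an integer, average the values in positions i and i + 1. Other software may use a different interpolation rule and give slightly different percentiles.

```python
from statistics import mean, median, multimode

# Hourly earnings of the eight employees
earnings = [12, 15, 15, 17, 18, 19, 22, 26]

def percentile(data, p):
    """p-th percentile for an integer 0 < p < 100, using the rule i = (p/100) * n."""
    values = sorted(data)
    position, remainder = divmod(p * len(values), 100)
    if remainder:
        # i is not an integer: round up, i.e. take 1-based position (position + 1)
        return values[position]
    # i is an integer: average the values in 1-based positions i and i + 1
    return (values[position - 1] + values[position]) / 2

print("mean:", mean(earnings))
print("median:", median(earnings))
print("70th percentile:", percentile(earnings, 70))
print("25th percentile:", percentile(earnings, 25))
print("mode(s):", multimode(earnings))
```

One property of the mean worth verifying numerically is that the deviations from it sum to zero: `sum(x - mean(earnings) for x in earnings)` evaluates to 0 for these data.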
II. Compute the following measures of dispersion for the above data:
a. Range
b. Interquartile range
c. Variance & the Standard deviation
d. Coefficient of variation
e. A sample of Chatt, Inc. employees had a mean of $21 and a standard deviation of $5. Which
company shows a more dispersed data distribution?
f. Use “Descriptive Statistics” in Excel and determine all the statistical measures.
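As a rough cross-check of parts a, c, d, and e, the sample formulas from the Chapter 3 formula sheet can be applied with Python's `statistics` module, which uses the n - 1 divisor for `variance` and `stdev`. The variable names are assumptions for illustration.

```python
from statistics import mean, stdev, variance

# Hourly earnings of the eight Ahmadi, Inc. employees
earnings = [12, 15, 15, 17, 18, 19, 22, 26]

data_range = max(earnings) - min(earnings)
sample_variance = variance(earnings)   # sum of squared deviations / (n - 1)
sample_sd = stdev(earnings)            # square root of the sample variance

# Coefficient of variation = (standard deviation / mean) * 100
cv_ahmadi = sample_sd / mean(earnings) * 100
cv_chatt = 5 / 21 * 100                # Chatt, Inc.: mean $21, standard deviation $5

print(f"range = {data_range}")
print(f"variance = {sample_variance:.4f}, standard deviation = {sample_sd:.4f}")
print(f"C.V. Ahmadi = {cv_ahmadi:.2f}%, C.V. Chatt = {cv_chatt:.2f}%")
```

The company with the larger coefficient of variation has the more dispersed distribution relative to its mean.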
Chapter 3
Five-Number Summary
The weights of 12 individuals who enrolled in a fitness program are shown below:
Individual    Weight (Pounds)
1             100
2             105
3             110
4             130
5             135
6             138
7             142
8             145
9             150
10            170
11            240
12            300
a. Provide a five-number summary for the data.
b. Show the box plot for the weight data.
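A sketch of the five-number summary follows. It assumes the same textbook percentile rule i = (p/100) n (round up when i is not an integer, otherwise average positions i and i + 1) and the usual 1.5 × IQR limits for flagging outliers on a box plot; both are conventions assumed here, not stated in the notes.

```python
# Weights (pounds) of the 12 individuals
weights = [100, 105, 110, 130, 135, 138, 142, 145, 150, 170, 240, 300]

def percentile(data, p):
    """p-th percentile for an integer 0 < p < 100, using the rule i = (p/100) * n."""
    values = sorted(data)
    position, remainder = divmod(p * len(values), 100)
    if remainder:
        return values[position]                            # round up to next position
    return (values[position - 1] + values[position]) / 2   # average positions i, i + 1

q1, q2, q3 = (percentile(weights, p) for p in (25, 50, 75))
summary = (min(weights), q1, q2, q3, max(weights))

# Box plot limits: observations beyond them are drawn as outliers
iqr = q3 - q1
lower_limit, upper_limit = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [w for w in weights if w < lower_limit or w > upper_limit]

print("five-number summary:", summary)
print("limits:", (lower_limit, upper_limit), "outliers:", outliers)
```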
Chapter 3
Covariance & Coefficient of Correlation
The average grades of a sample of 8 students in Professor Ahmadi’s statistics class and the
number of absences they had during the semester are shown below.
Student    Number of Absences (X_i)    Average Grade (Y_i)
1          1                           94
2          2                           78
3          2                           70
4          1                           88
5          3                           68
6          4                           40
7          8                           30
8          3                           60
TOTAL      24                          528
a. Compute the sample covariance and interpret its meaning.
b. Compute the sample coefficient of correlation and interpret its meaning.
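Both parts follow directly from the sample covariance and Pearson correlation formulas on the formula sheet. The sketch below spells the sums out by hand rather than calling a library routine; the variable names are illustrative.

```python
from math import sqrt

absences = [1, 2, 2, 1, 3, 4, 8, 3]          # X_i
grades = [94, 78, 70, 88, 68, 40, 30, 60]    # Y_i

n = len(absences)
x_bar = sum(absences) / n
y_bar = sum(grades) / n

# Sample covariance: sum of cross-products of deviations divided by n - 1
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(absences, grades)) / (n - 1)

# Sample standard deviations of X and Y
s_x = sqrt(sum((x - x_bar) ** 2 for x in absences) / (n - 1))
s_y = sqrt(sum((y - y_bar) ** 2 for y in grades) / (n - 1))

# Pearson product moment correlation coefficient
r_xy = s_xy / (s_x * s_y)

print(f"covariance = {s_xy:.2f}, correlation = {r_xy:.4f}")
```

A negative covariance indicates that grades tend to fall as absences rise; the correlation coefficient expresses the strength of that linear relationship on a scale from -1 to +1.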
Chapter 3
Weighted Mean
The Michael Ahmadi Oil Company has purchased barrels of oil from several suppliers. The
purchase price per barrel and the number of barrels purchased are shown below.
Supplier    Price Per Barrel ($)    Number of Barrels
A           17                      4,000
B           19                      3,000
C           18                      9,000
D           16                      20,000
Compute the weighted average price per barrel.
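Using the weighted mean formula from the formula sheet, with the number of barrels as the weights, the calculation can be sketched as:

```python
# Price per barrel ($) and number of barrels for suppliers A, B, C, D
prices = [17, 19, 18, 16]
barrels = [4_000, 3_000, 9_000, 20_000]   # weights w_i

# Weighted mean = sum(w_i * X_i) / sum(w_i)
total_cost = sum(w * x for w, x in zip(barrels, prices))
weighted_mean_price = total_cost / sum(barrels)

print(f"weighted average price per barrel = ${weighted_mean_price:.2f}")
```

The simple average of the four prices would ignore that more than half of the barrels came from the cheapest supplier, which is why the weighted figure is lower.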
Chapter 3
Measures of Location & Dispersion (Grouped Data)
The yearly income distribution for a sample of 30 Ahmadi, Inc. employees is shown below.
Yearly Income (In $10,000)    Frequency (f_i)
4 - 6                         2
7 - 9                         6
10 - 12                       7
13 - 15                       10
16 - 18                       5
Totals                        n = 30
a. Compute the mean yearly income.
b. Compute the variance and the standard deviation of the sample.
c. A sample of Chatt, Inc. employees had a mean income of $132,000 with a standard deviation
of $36,000. Which company shows a more dispersed income distribution?
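The grouped-data formulas treat every observation in a class as if it sat at the class midpoint. A minimal sketch under that assumption, with results in units of $10,000 and illustrative variable names:

```python
from math import sqrt

# Class limits (in $10,000) and frequencies
classes = [(4, 6), (7, 9), (10, 12), (13, 15), (16, 18)]
freq = [2, 6, 7, 10, 5]

midpoints = [(lo + hi) / 2 for lo, hi in classes]   # M_i
n = sum(freq)

# Grouped mean: sum(f_i * M_i) / n
mean_income = sum(f * m for f, m in zip(freq, midpoints)) / n

# Grouped sample variance: sum(f_i * (M_i - mean)^2) / (n - 1)
variance = sum(f * (m - mean_income) ** 2 for f, m in zip(freq, midpoints)) / (n - 1)
sd = sqrt(variance)

# Coefficients of variation for part c
cv_ahmadi = sd / mean_income * 100
cv_chatt = 36_000 / 132_000 * 100

print(f"mean = {mean_income} (x $10,000), variance = {variance:.4f}, sd = {sd:.4f}")
print(f"C.V. Ahmadi = {cv_ahmadi:.2f}%, C.V. Chatt = {cv_chatt:.2f}%")
```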
Chapter 4 Formulas
Counting Rule for Multiple-step Experiments:
Total number of outcomes = $(n_1)(n_2) \cdots (n_k)$
The number of Combinations of N objects taken n at a time:
$\binom{N}{n} = \frac{N!}{n!\,(N - n)!}$
Sum of the probability of Event A and its Complement:
$P(A) + P(A^c) = 1.0$
Addition Law (the probability of the union of two events):
$P(A \cup B) = P(A) + P(B) - P(A \cap B)$
Multiplication Law (the probability of the intersection of two events):
$P(A \cap B) = P(A)\,P(B|A)$
or
$P(A \cap B) = P(B)\,P(A|B)$
Two Events A and B are Independent if:
$P(A|B) = P(A)$
or
$P(B|A) = P(B)$
Multiplication Law for Independent Events:
$P(A \cap B) = P(A)\,P(B)$
Conditional Probability:
$P(A|B) = \frac{P(A \cap B)}{P(B)}$
or
$P(B|A) = \frac{P(A \cap B)}{P(A)}$
Bayes' Theorem in General:
$P(A_i|B) = \frac{P(A_i)\,P(B|A_i)}{P(A_1)\,P(B|A_1) + P(A_2)\,P(B|A_2) + \cdots + P(A_n)\,P(B|A_n)}$
Summary of Bayes' Theorem Calculations:
Event    Prior            Conditional      Joint                Posterior
         Probabilities    Probabilities    Probabilities        Probabilities
         $P(A_i)$         $P(B|A_i)$       $P(A_i \cap B)$      $P(A_i|B)$
Chapter 4
Basic Probability Concepts
1. Assume you have applied to two different universities (let's refer to them as universities A
and B) for your graduate work. In the past, 25% of students (with credentials similar to
yours) who applied to university A were accepted, while university B accepted 35% of
the applicants. (Assume the events are independent of each other.)
a. What is the probability that you will be accepted by both universities?
b. What is the probability that you will be accepted to at least one graduate program?
c. What is the probability that one and only one of the universities will accept you?
d. What is the probability that neither university will accept you?
2. In the two upcoming basketball games, the probability that UTC will defeat Marshall is 0.63,
and the probability that UTC will defeat Furman is 0.55. The probability that UTC will
defeat both opponents is 0.3465.
a. What is the probability that UTC defeats Furman given that they defeat Marshall?
b. Are the outcomes of the games independent? Explain and substantiate your answer.
c. What is the probability that UTC wins at least one of the games?
d. What is the probability of UTC winning both games?
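Both problems rely on the addition law, the multiplication law, and the independence condition from the Chapter 4 formula sheet. The following sketch applies them; the variable names are assumptions, and a floating-point tolerance is used for the independence check.

```python
from math import isclose

# Problem 1: acceptance at universities A and B, assumed independent
p_a, p_b = 0.25, 0.35
both = p_a * p_b                                    # multiplication law, independent events
at_least_one = p_a + p_b - both                     # addition law
exactly_one = p_a * (1 - p_b) + (1 - p_a) * p_b     # A only, or B only
neither = (1 - p_a) * (1 - p_b)                     # complement of "at least one"

# Problem 2: UTC versus Marshall (M) and Furman (F)
p_m, p_f, p_m_and_f = 0.63, 0.55, 0.3465
p_f_given_m = p_m_and_f / p_m                       # conditional probability
independent = isclose(p_f_given_m, p_f)             # independent if P(F|M) = P(F)
at_least_one_win = p_m + p_f - p_m_and_f            # addition law

print(both, at_least_one, exactly_one, neither)
print(p_f_given_m, independent, at_least_one_win)
```

In problem 1 the four outcomes "both", "exactly one", and "neither" are mutually exclusive and exhaustive, so their probabilities must sum to 1, which gives a quick sanity check.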
Chapter 4
Conditional Probability
A research study investigating the relationship between smoking and heart disease in a sample of
500 individuals provided the following data:
             Record of        No Record of
             Heart Disease    Heart Disease    Total
Smoker            50               100          150
Nonsmoker         40               310          350
Total             90               410          500
a. Show the joint probability table.
b. What is the probability that an individual is a smoker and has a record of heart disease?
c. Compute and interpret the marginal probabilities.
d. Given that an individual is a smoker, what is the probability that this individual has heart
disease?
e. Given that an individual is a nonsmoker, what is the probability that this individual has heart
disease?
f. Does the research show that heart disease and smoking are independent events? Use
probabilities to justify your answer.
g. What conclusion would you draw about the relationship between smoking and heart disease?
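The joint, marginal, and conditional probabilities in parts a through f all come from dividing the table counts by the appropriate total. A sketch with an assumed dictionary layout:

```python
# Counts from the table: (smoking status, heart disease record)
counts = {
    ("smoker", "disease"): 50,
    ("smoker", "no disease"): 100,
    ("nonsmoker", "disease"): 40,
    ("nonsmoker", "no disease"): 310,
}
n = sum(counts.values())

# Joint probabilities: each cell divided by the sample size
joint = {cell: count / n for cell, count in counts.items()}

# Marginal probabilities: row and column sums of the joint table
p_smoker = joint[("smoker", "disease")] + joint[("smoker", "no disease")]
p_nonsmoker = 1 - p_smoker
p_disease = joint[("smoker", "disease")] + joint[("nonsmoker", "disease")]

# Conditional probabilities: joint probability / marginal probability of the condition
p_disease_given_smoker = joint[("smoker", "disease")] / p_smoker
p_disease_given_nonsmoker = joint[("nonsmoker", "disease")] / p_nonsmoker

print(f"P(disease) = {p_disease:.4f}")
print(f"P(disease | smoker) = {p_disease_given_smoker:.4f}")
print(f"P(disease | nonsmoker) = {p_disease_given_nonsmoker:.4f}")
```

If smoking and heart disease were independent, both conditional probabilities would equal the marginal P(disease); comparing the three printed values addresses part f.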
Chapter 4
BAYES' THEOREM
When Ahmadi, Inc. sets up their drill press machine, 70% of the time it is set up correctly. It is
known that if the machine is set up correctly it produces 90% acceptable parts. On the other
hand, when the machine is set up incorrectly, it produces 20% acceptable parts. One item from
the production is selected and is observed to be acceptable.
a. What is the probability that the machine is set up correctly? That is, we are interested in
computing:
P(Correct set up | Acceptable part).
Let the following symbols represent the various events:
E1 = Correct set up
E2 = Incorrect set up
G = Good part (i.e., Acceptable part)
With the above notation, we want to determine P(E1 | G).
b. Compute all the posterior probabilities.
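The tabular approach to Bayes' theorem (prior, conditional, joint, posterior) can be sketched as below, using the event labels E1, E2, and G defined above. The dictionary names are illustrative.

```python
# Prior probabilities of the two set-up states
priors = {"E1": 0.70, "E2": 0.30}

# Conditional probabilities of a good part given each state, P(G | Ei)
likelihoods = {"E1": 0.90, "E2": 0.20}

# Joint probabilities P(Ei and G) = P(Ei) * P(G | Ei)
joint = {event: priors[event] * likelihoods[event] for event in priors}

# P(G) is the sum of the joint probabilities
p_good = sum(joint.values())

# Posterior probabilities P(Ei | G) = P(Ei and G) / P(G)
posteriors = {event: joint[event] / p_good for event in priors}

for event in priors:
    print(f"{event}: prior={priors[event]:.2f}  joint={joint[event]:.2f}  "
          f"posterior={posteriors[event]:.4f}")
```

Observing an acceptable part raises the probability of a correct set-up above its prior of 0.70, and the posteriors sum to 1.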
Chapter 5 Formulas
Required Conditions for a Discrete Probability Function
$f(x) \geq 0$
$\sum f(x) = 1$
Discrete Uniform Probability Function
$f(x) = 1/n$
where
n = the number of values the random variable may assume
Expected Value of a Discrete Random Variable
$E(x) = \mu = \sum x\,f(x)$
Variance of a Discrete Random Variable
$\text{Var}(x) = \sigma^2 = \sum (x - \mu)^2 f(x)$
Number of Experimental Outcomes Providing Exactly x Successes in n Trials
$\binom{n}{x} = \frac{n!}{x!\,(n - x)!}$
where
n! = n (n - 1) (n - 2) . . . (2)(1)
(Remember: 0! = 1)
Binomial Probability Function
$f(x) = \frac{n!}{x!\,(n - x)!}\, p^x (1 - p)^{n - x}$
where x = 0, 1, 2, ..., n
The Mean of a Binomial Distribution
$\mu = n p$
The Variance of a Binomial Distribution
$\sigma^2 = n p (1 - p)$
Chapter 5
Discrete Probability Distributions
The manager of the university bookstore has kept records of the number of diskettes sold per day.
She provided the following information regarding diskette sales for a period of 60 days:
Number of Diskettes Sold    Number of Days
0                           6
1                           9
2                           12
3                           18
4                           12
5                           3
a. Identify the random variable.
b. Is the random variable discrete or continuous?
c. Develop a probability distribution for the above data.
d. Is the above a proper probability distribution?
e. Develop a cumulative probability distribution.
f. Determine the expected number of daily sales of diskettes.
g. Determine the variance and the standard deviation.
h. If each diskette yields a net profit of 50 cents, what are the expected yearly profits from the
sales of diskettes?
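Parts c through h apply the expected value and variance formulas for a discrete random variable. In the sketch below, the 365 selling days per year used for part h is an assumption (the problem does not say how many days the bookstore is open), and the names are illustrative.

```python
from itertools import accumulate
from math import sqrt

# Number of diskettes sold per day -> number of days observed
days = {0: 6, 1: 9, 2: 12, 3: 18, 4: 12, 5: 3}
total_days = sum(days.values())

# Probability distribution f(x) = relative frequency of each value
f = {x: d / total_days for x, d in days.items()}
cumulative = dict(zip(f, accumulate(f.values())))

# E(x) = sum of x * f(x); Var(x) = sum of (x - mu)^2 * f(x)
expected = sum(x * p for x, p in f.items())
variance = sum((x - expected) ** 2 * p for x, p in f.items())
sd = sqrt(variance)

# Part h, ASSUMING 365 selling days per year
SELLING_DAYS_PER_YEAR = 365
PROFIT_PER_DISKETTE = 0.50
yearly_profit = expected * PROFIT_PER_DISKETTE * SELLING_DAYS_PER_YEAR

print(f"E(x) = {expected:.2f}, Var(x) = {variance:.2f}, sd = {sd:.4f}")
print(f"expected yearly profit = ${yearly_profit:.2f}")
```

The distribution is proper (part d) when every f(x) is non-negative and the probabilities sum to 1, which the last entry of `cumulative` confirms.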
Chapter 5
Introduction to Binomial Distribution
A production process has been producing 10% defective items. A random sample of four items
is selected from the production process.
a. What is the probability that the first 3 selected items are non-defective and the last item is
defective?
b. If a sample of 4 items is selected, how many outcomes contain exactly 3 non-defective items?
c. What is the probability that a random sample of 4 contains exactly 3 non-defective items?
d. Determine the probability distribution for the number of non-defective items in a sample of
four.
e. Determine the expected number (mean) of non-defectives in a sample of four.
f. Find the standard deviation for the number of non-defectives.
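Treating a non-defective item as a "success" with p = 0.90 and n = 4, the binomial formulas from the Chapter 5 formula sheet give each part directly. The helper `binomial_pmf` is a hypothetical name used for this sketch.

```python
from math import comb, sqrt

n = 4        # sample size
p = 0.90     # probability that an item is non-defective

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

# a. One specific sequence: good, good, good, defective
first_three_good = p ** 3 * (1 - p)

# b. Number of sequences with exactly 3 non-defective items
outcomes_with_three = comb(n, 3)

# c. Probability of exactly 3 non-defective items (any order)
p_exactly_three = binomial_pmf(3, n, p)

# d. Full probability distribution for x = 0, 1, ..., 4
distribution = {x: binomial_pmf(x, n, p) for x in range(n + 1)}

# e, f. Mean and standard deviation
mean = n * p
sd = sqrt(n * p * (1 - p))

for x, prob in distribution.items():
    print(f"f({x}) = {prob:.4f}")
print(f"mean = {mean}, sd = {sd:.2f}")
```

Part c equals the number of outcomes from part b times the probability of any one such sequence from part a, which is exactly what the binomial probability function encodes.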
Chapter 5
POISSON PROBABILITY DISTRIBUTION
During the registration period, students consult their advisor for course selection. A particular
advisor noted that during each half hour an average of eight students came to see him for
advising.
a. What is the probability that during a half-hour period exactly four students will consult him?
b. What is the probability that during a half-hour period fewer than three students will consult
him?
c. What is the probability that during a one-hour period ten students will consult him?
d. What is the probability that during a one-hour-and-fifteen-minute period thirty students will
consult him?
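The Poisson probability function is not listed on the Chapter 5 formula sheet above; the sketch below uses its standard form, f(x) = μ^x e^(-μ) / x!, where μ is the expected number of arrivals in the interval. It assumes the mean scales in proportion to the length of the interval (8 per half hour, 16 per hour, 20 per 75 minutes) and reads "ten students" and "thirty students" as exactly that many.

```python
from math import exp, factorial

def poisson_pmf(x, mu):
    """Probability of exactly x occurrences when the interval mean is mu."""
    return mu ** x * exp(-mu) / factorial(x)

MEAN_PER_HALF_HOUR = 8

# a. Exactly four students in a half hour
p_a = poisson_pmf(4, MEAN_PER_HALF_HOUR)

# b. Fewer than three students in a half hour: x = 0, 1, or 2
p_b = sum(poisson_pmf(x, MEAN_PER_HALF_HOUR) for x in range(3))

# c. Exactly ten students in one hour (two half hours, so mu = 16)
p_c = poisson_pmf(10, MEAN_PER_HALF_HOUR * 2)

# d. Exactly thirty students in 75 minutes (2.5 half hours, so mu = 20)
p_d = poisson_pmf(30, MEAN_PER_HALF_HOUR * 2.5)

print(f"a = {p_a:.4f}, b = {p_b:.4f}, c = {p_c:.4f}, d = {p_d:.4f}")
```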
Chapter 6 Formulas
Uniform Probability Density Function for a Random Variable x:
$f(x) = \begin{cases} \frac{1}{b - a} & \text{for } a \leq x \leq b \\ 0 & \text{elsewhere} \end{cases}$
Mean and Variance of a Uniform Continuous Probability Distribution:
$\mu = \frac{a + b}{2}$
$\sigma^2 = \frac{(b - a)^2}{12}$
The Z Transformation Formula:
$z = \frac{x - \mu}{\sigma}$
Solving for x using the Z transformation formula:
$x = \mu + z\sigma$
Chapter 6
Continuous Probability Distributions
I. - The Uniform Distribution
The driving time for an individual from her home to her work is uniformly distributed between
300 and 480 seconds.
a. Give a mathematical expression for the probability density function.
b. Compute the probability that the driving time will be less than or equal to 435 seconds.
c. Determine the probability that the driving time will be exactly 400 seconds.
d. Determine the expected driving time.
e. Determine the standard deviation of the driving time.
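With a = 300 and b = 480, the uniform formulas from the Chapter 6 formula sheet can be applied as in this sketch; the variable names are illustrative.

```python
from math import sqrt

a, b = 300, 480   # lower and upper limits of the driving time, in seconds

# a. Density is constant on [a, b] and zero elsewhere
density = 1 / (b - a)

# b. P(x <= 435) is the area of the rectangle from a to 435
p_at_most_435 = (435 - a) * density

# c. A single point has zero width, so its probability is zero
p_exactly_400 = (400 - 400) * density

# d, e. Mean and standard deviation of the uniform distribution
expected_time = (a + b) / 2
sd_time = sqrt((b - a) ** 2 / 12)

print(f"f(x) = 1/{b - a} for {a} <= x <= {b}")
print(f"P(x <= 435) = {p_at_most_435:.2f}, P(x = 400) = {p_exactly_400}")
print(f"mean = {expected_time}, sd = {sd_time:.2f}")
```

Part c illustrates a general property of continuous distributions: probabilities are areas under the density, so only intervals carry positive probability.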
Chapter 6
II. - The Normal Distribution
1. Given that Z is the standard normal random variable, give the probabilities associated with
the following:
a. P(Z < -2.09) = ?
b. P(Z > -0.95) = ?
c. P(-2.55 < Z < -2.33) = ?
2. Z is a standard normal variable. Find the value of Z in the following:
a. The area between -Z and zero is 0.4929. Z = ?
b. The area to the right of Z is 0.0192. Z = ?
c. The area between -Z and Z is 0.668. Z = ?
3. The weight of certain items produced is normally distributed with a mean weight of 60
ounces and a standard deviation of 8 ounces.
a. What percentage of the items will weigh between 50.4 and 72 ounces?
b. What percentage of the items will weigh between 42 and 52 ounces?
c. What percentage of the items will weigh at least 74.4 ounces?
d. What are the minimum and the maximum weights of the middle 60% of the items?
4. Sun Love grapefruit growers have determined that the diameter of their grapefruits is
normally distributed with a mean of 4.5 inches and a standard deviation of 0.3 inches. (You
can find the step-by-step solution to this problem in my workbook.)
a. What is the probability that a randomly selected grapefruit will have a diameter of at
least 4.14 inches?
b. What percentage of grapefruits have a diameter between 4.8 and 5.04 inches?
c. Sun Love packs their largest grapefruits in a special package called "Super Pack." If
5% of all their grapefruits are packed in "Super Packs," what is the smallest diameter of
the grapefruits that are in the "Super Packs"?
d. In this year's harvest, there were 111,500 grapefruits that had a diameter over 5.01
inches. How many grapefruits has Sun Love harvested this year?
5. In grading eggs, 30% are marked small, 45% are marked medium, 15% are marked large,
and the rest are marked extra-large. If the average weight of the eggs is normally distributed
with a mean of 3.2 ounces and a standard deviation of 0.6 ounces:
a. What are the smallest and the largest weights of the medium size eggs?
b. What is the weight of the smallest egg that will be in the extra-large category?
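The normal-distribution problems above use two operations: finding an area from a value (the Z transformation) and finding a value from an area (solving for x). As a hedged sketch, a few representative parts are worked below with `statistics.NormalDist` from the Python standard library in place of a printed Z table, so results may differ from table answers in the third or fourth decimal. The variable names are illustrative.

```python
from statistics import NormalDist

z = NormalDist()   # standard normal: mean 0, standard deviation 1

# Problem 1a: area to the left of -2.09
p_1a = z.cdf(-2.09)

# Problem 2b: Z value with area 0.0192 to its right
z_2b = z.inv_cdf(1 - 0.0192)

# Problem 3: item weights, mean 60 ounces, standard deviation 8 ounces
weights = NormalDist(mu=60, sigma=8)
p_3a = weights.cdf(72) - weights.cdf(50.4)      # between 50.4 and 72 ounces
low_3d = weights.inv_cdf(0.20)                  # middle 60% leaves 20% in each tail
high_3d = weights.inv_cdf(0.80)

# Problem 4c: smallest diameter among the largest 5% of grapefruits
grapefruit = NormalDist(mu=4.5, sigma=0.3)
super_pack_min = grapefruit.inv_cdf(0.95)

print(f"1a: {p_1a:.4f}   2b: {z_2b:.2f}")
print(f"3a: {p_3a:.4f}   3d: {low_3d:.2f} to {high_3d:.2f} ounces")
print(f"4c: {super_pack_min:.3f} inches")
```

The egg-grading problem follows the same pattern: the category percentages are cumulative areas (30%, then 75%, then 90%), and `inv_cdf` converts each cumulative area into a weight boundary.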