
CHAPTER 5 – Probability
The first restaurant in the world opened in Paris in 1765.
In 1900, 2% of meals were eaten outside the home. In 2012, 45% were eaten away from home.
5.1 - Probability Rules

Probability is the basis for statistical inference. It’s super important!
Probability – A number which describes the likelihood that an event (or series of events) will
occur. A probability is always between 0 and 1, inclusive.
Experiment – An activity (or series of activities) that results in an outcome. So an outcome is the
result of an experiment.
Event – Any collection of possible outcomes of an experiment.
Simple event – An event with one outcome.
Sample space – The collection of all possible simple events of an experiment.
A table of simple events and their corresponding probabilities is called a probability distribution
or a probability model (book).
A probability distribution is exactly like a relative frequency distribution: relative frequencies and probabilities are both numbers between 0 and 1, inclusive, and both always sum to one over their entire distribution.
Rules/properties of probabilities
 0 ≤ 𝑃(𝐸) ≤ 1 for any event E.
 If P(E) = 0, event E is impossible. The closer a probability is to 0, the LESS likely the
event is to occur.
 If P(E) = 1, event E is certain to occur. The closer a probability is to 1, the MORE likely
the event is to occur.
 The sum of the probabilities over a sample space must always equal 1.
An unusual event is an event that has a small probability of occurring. In our text, the author
calls an event unusual if its associated probability is 0.05 or less.

There are two ways to generate probabilities.
The Empirical approach to probability (observational) – Conduct (or observe) an experiment a large number of times and count the times that the event of interest occurs.
The probability of that event = (number of times the event occurs) / (number of times the experiment is repeated)
The Classical approach to probability (probability for equally likely outcomes) – If an
experiment has n equally likely outcomes and an event can occur in m different ways,
The probability of that event = (number of ways the event can occur) / (total number of equally likely outcomes) = m/n
The Law of Large Numbers says that as an experiment is repeated more and more times, the empirical probability of an event approaches its classical probability.
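To see the Law of Large Numbers in action, you can simulate an experiment. Here is a minimal sketch in Python (my illustration, not a course tool) that estimates the empirical probability of rolling a 6 and compares it to the classical probability 1/6:

    import random

    random.seed(1)                      # fixed seed so the run is reproducible
    classical = 1 / 6                   # classical probability of rolling a 6

    for n in (100, 10_000, 1_000_000):  # more and more repetitions
        sixes = sum(1 for _ in range(n) if random.randint(1, 6) == 6)
        empirical = sixes / n           # (times the event occurred) / (times repeated)
        print(f"n = {n:>9,}: empirical = {empirical:.4f}  vs  classical = {classical:.4f}")

As n grows, the empirical values settle in around 0.1667, which is exactly what the Law of Large Numbers promises.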
5.2 - The Addition Rule and Complements
Two events are mutually exclusive or disjoint (book) if they have no outcomes in common, that is, if they cannot both occur when an experiment is performed.
Addition rule (for mutually exclusive/disjoint events) – If events A and B are mutually
exclusive/disjoint,
P(A or B) = P(A) + P(B)
This calculation can be extended for 3 or
more mutually exclusive/disjoint events.
General Addition rule – If A and B are any two events,
P(A or B) = P(A) + P(B) – P(A and B)
The complement of an event E, denoted E^c – Consists of all outcomes in which event E does not occur.
Relationships between an event and its complement:
P(E) + P(E^c) = 1
P(E) = 1 – P(E^c)
P(E^c) = 1 – P(E)
A contingency table is a cross-classification of two variables/categories.
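A quick worked example of the addition rule and complements (my example, not the book's): roll one die, let A = "even number" and B = "greater than 4". The Python sketch below checks the General Addition Rule by brute-force counting over the sample space:

    from fractions import Fraction

    sample_space = [1, 2, 3, 4, 5, 6]           # equally likely outcomes of one die roll
    A = {2, 4, 6}                               # event A: even number
    B = {5, 6}                                  # event B: greater than 4

    def prob(event):
        # classical probability: (ways the event can occur) / (total equally likely outcomes)
        return Fraction(len(event), len(sample_space))

    p_a_or_b = prob(A) + prob(B) - prob(A & B)  # General Addition Rule
    print(p_a_or_b)                             # 2/3
    print(prob(A | B))                          # 2/3 again, by directly counting "A or B"
    print(1 - prob(A))                          # complement rule: P(A^c) = 1/2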
5.3 - Independence and the Multiplication Rule
Two events are independent if the occurrence of one does not affect the probability that the other occurs.
The Multiplication Rule for independent events – If A and B are independent events,
P(A and B) = P(A) × P(B)
This calculation can be extended for 3 or more independent events.
Probability calculations that include the phrase “at least” generally use the complement.
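For instance (my example, not the text's): the probability of getting at least one 6 in four rolls of a die uses both independence and the complement:

    # P(at least one 6 in 4 rolls) = 1 - P(no 6 on any of the 4 rolls)
    # Independence lets us multiply the per-roll probability of "no 6".
    p_no_six_one_roll = 5 / 6
    p_no_six_four_rolls = p_no_six_one_roll ** 4     # multiplication rule
    p_at_least_one_six = 1 - p_no_six_four_rolls     # complement rule
    print(round(p_at_least_one_six, 4))              # about 0.5177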
5.4 – Conditional Probability and the General Multiplication Rule
Conditional probability - The probability of an event occurring given that another event has
already occurred. Denoted P(B|A) for the probability of event B given that event A has occurred.
So knowing that event A has occurred can change the probability that event B occurs.
General Multiplication Rule – The probability of a sequence of two events occurring is
P(E and F) = P(E) × P(F|E)
The right side reads, “the probability that event E occurs, times the probability that event F occurs given that event E has occurred.”

Note that if E and F are independent, then P(E and F) = P(E) × P(F)
You can find a conditional probability (from the Multiplication Rule) by:
P(F|E) = P(E and F) / P(E)
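A small worked example of my own, with counts like you would read off a contingency table: suppose 60 of 200 students are seniors (event E), and 45 of the 200 students are seniors who major in STEM (event E and F, where F = STEM major):

    from fractions import Fraction

    p_e = Fraction(60, 200)          # P(E): student is a senior
    p_e_and_f = Fraction(45, 200)    # P(E and F): senior AND STEM major

    # Conditional probability from the Multiplication Rule: P(F|E) = P(E and F) / P(E)
    p_f_given_e = p_e_and_f / p_e
    print(p_f_given_e)               # 3/4, so 75% of the seniors are STEM majors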
5.5 – Counting Techniques
Factorials and combinations only.
“Fact” and “comb” in StatCrunch
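Outside of StatCrunch, factorials and combinations are also built into Python, if you ever want to double-check an answer (my sketch, not a course requirement):

    import math

    print(math.factorial(5))   # 5! = 120
    print(math.comb(8, 3))     # 8 choose 3 = 8! / (3! * 5!) = 56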
CHAPTER 6 – Discrete Probability Distributions
6.1 – Discrete Random Variables (general)

Recall: discrete versus continuous. (from 1.1)
A random variable – A numeric value for each outcome of an experiment. Its value depends on
chance. Random variables are denoted by capital letters, like X.
A discrete probability distribution – All possible values of a random variable and their
corresponding probabilities of occurrence. Probability distributions (in general) can be in the
form of a table, graph, or formula.
Rules for any probability distribution:
1. The probability of each value of the random variable is a number between 0 and 1, inclusive, or 0 ≤ P(x) ≤ 1.
2. The sum of the probabilities over the entire distribution is always equal to one, or Σ P(x) = 1.
Probability histogram – A graph of a probability distribution that displays the values of the
random variable on the horizontal axis and their corresponding probabilities on the vertical axis.
(Just like a relative frequency histogram.) The bars are displayed touching (although they don't necessarily touch in StatCrunch).

The mean and standard deviation of a random variable are µ and σ, not 𝑥̅ and s!
The mean of a discrete random variable X, denoted μ_x (or just μ), is defined by:
μ_x = Σ [x · P(X = x)]
This means that if an experiment is repeated n independent times and the value of the random
variable X is recorded, as the number of trials increases, the mean of the n trials will approach
𝜇𝑥 .

Because of this, this mean is often called the expected value, especially when money is
involved.
The standard deviation of a discrete random variable X, denoted σ_x (or just σ), is defined by:
σ_x = √( Σ [(x − μ_x)² · P(x)] )
And the variance of a discrete random variable X, denoted 𝜎 2 , is the square of the standard
deviation.
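The same arithmetic StatCrunch's custom calculator does can be sketched straight from the formulas above (my own illustration, with a made-up distribution):

    import math

    # A made-up discrete probability distribution: values of X and their probabilities
    x_values = [0, 1, 2, 3]
    probs    = [0.1, 0.3, 0.4, 0.2]                                  # must sum to 1

    mu = sum(x * p for x, p in zip(x_values, probs))                 # mean: sum of x * P(X = x)
    var = sum((x - mu) ** 2 * p for x, p in zip(x_values, probs))    # variance: sum of (x - mu)^2 * P(x)
    sigma = math.sqrt(var)                                           # standard deviation

    print(round(mu, 4), round(var, 4), round(sigma, 4))              # 1.7 0.81 0.9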
Probability Histogram in StatCrunch:
 Graph, Chart, Columns, “Select col” = p(x), “Row labels in” = x. Plot = “Vertical
bars, stacked”
Mean and standard deviation in StatCrunch:
Stat, calculators, custom
6.2 - The Binomial Probability Distribution

The binomial is one of many specific, discrete probability distributions.
Binomial experiments must have all four of these characteristics:
1. A fixed number of trials.
2. All trials independent of each other.
3. All outcomes of each trial must be classified as either “success” or “failure”.
4. The probability of success must be equal for each trial.
The binomial probability formula – calculates the probability of exactly x successes occurring in n trials of a binomial experiment with success probability p:
P(X = x) = nCx · p^x · (1 − p)^(n−x) = [n! / (x!(n − x)!)] · p^x · (1 − p)^(n−x)
Where:
n = the fixed number of trials
p = the success probability
x = the number of successes in the n trials
The mean and standard deviation of a binomial distribution can be found with these formulas, which are derived from the formulas for a generic discrete random variable (section 6.1) and the characteristics of a binomial:
μ_x = μ_binom = np
σ_x = σ_binom = √(np(1 − p))
Binomial histograms are the same as general probability histograms (relative frequency histograms), except the variable is specifically binomial.
About binomial probability histograms:
 If p = 0.5, the shape of the graph is perfectly symmetric.
 The smaller p is, the more right-skewed the graph.
 The larger p is, the more left-skewed the graph.
 For any fixed p, as the number of trials increases, the distribution of X eventually
becomes approximately bell-shaped. (pg. 337)

Summary/review of inequality symbols and related phrases on pg. 332
StatCrunch:
Stat, Calculators, Binomial
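The binomial formula and the mean/standard deviation formulas above are also easy to check in a few lines of Python (my sketch, with made-up values of n and p):

    import math

    def binomial_pmf(n, p, x):
        # P(X = x) = nCx * p^x * (1 - p)^(n - x)
        return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

    n, p = 10, 0.3                               # 10 trials, success probability 0.3
    print(round(binomial_pmf(n, p, 4), 4))       # P(exactly 4 successes), about 0.2001

    mean = n * p                                 # mu = np
    sd = math.sqrt(n * p * (1 - p))              # sigma = sqrt(np(1 - p))
    print(round(mean, 4), round(sd, 4))          # 3.0 1.4491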
6.3 – The Poisson Probability Distribution
A Poisson experiment computes the probability of a given number of occurrences of an event within a specified interval (usually of time or space).
Poisson experiments must have all three of these characteristics:
1. The probability of two or more successes in any “small” interval is zero.
2. The probability of success is the same for any two intervals of equal length.
3. The number of successes in any one interval is independent of the number of successes in any other interval, provided the intervals do not overlap.
The Poisson probability formula calculates the probability of obtaining x successes in an interval of fixed length t and is given by
P(x) = [(λt)^x / x!] · e^(−λt)
Where
o x = 0, 1, 2, 3,…
o 𝜆 (lambda) = the rate of occurrences in some interval of length 1
o e is an irrational constant, e ≈ 2.71828…
The mean and standard deviation of a Poisson distribution can be found with these formulas:
μ_x = λt
σ_x = √(λt) = √(μ_x)
StatCrunch:
Stat, Calculators, Poisson
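The Poisson formula can be checked the same way (my sketch; the values of lambda and t are made up):

    import math

    def poisson_pmf(lam, t, x):
        # P(x) = ((lambda*t)^x / x!) * e^(-lambda*t)
        mu = lam * t
        return (mu ** x / math.factorial(x)) * math.exp(-mu)

    lam, t = 2, 3                                     # rate 2 per unit length, interval of length 3
    print(round(poisson_pmf(lam, t, 4), 4))           # P(exactly 4 successes), about 0.1339
    print(lam * t, round(math.sqrt(lam * t), 4))      # mean = 6, standard deviation about 2.4495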
CHAPTER 7 – THE NORMAL PROBABILITY DISTRIBUTION
7.1 - Properties of the Normal Distribution

The normal is a specific, continuous probability distribution. It is the most common of all probability distributions.
A probability density function (pdf) - is how we find probabilities of continuous random
variables. These are generally not user-friendly, so we'll let StatCrunch do the work.
All continuous random variables have:
1. A total area of 1 under their graph/curve
2. A vertical height ≥ 0 at every point along the graph/curve
The uniform distribution - A specific, continuous probability distribution where each value of the
random variable has an equally likely chance of occurring.
Recall from chapter 6, AREA = PROBABILITY
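For example (my numbers, not from the text): if X is uniform on the interval from 0 to 10, the curve is a rectangle of height 1/10, so P(2 ≤ X ≤ 5) = (width)(height) = (5 − 2)(1/10) = 0.3. The area of that slice of the rectangle IS the probability.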
About normally distributed variables:
If the population is normally distributed, we say that the variable “has a normal distribution”. If
we determine that the sample comes from a normally distributed population, we say the sample
data is “approximately normally distributed”.
The equation of the normal curve/function is:
f(x, μ, σ) = [1 / (σ√(2π))] · e^(−(x − μ)² / (2σ²))
where μ is the mean and σ is the standard deviation. (These are parameters.)
Each individual normal curve is determined/defined by its mean and standard deviation!
Just like with a discrete variable, area under the curve of a continuous variable corresponds
directly to probability. In the case of the normal distribution, there’s a different normal
curve/graph for every different combination of 𝜇 and 𝜎.
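Just to see the formula in action (my sketch, not something the course requires), here is the normal curve evaluated in Python; the standard library's statistics.NormalDist has the same pdf built in, so we can check the hand-coded version against it:

    import math
    from statistics import NormalDist

    def normal_pdf(x, mu, sigma):
        # f(x, mu, sigma) = [1 / (sigma * sqrt(2*pi))] * e^(-(x - mu)^2 / (2*sigma^2))
        return (1 / (sigma * math.sqrt(2 * math.pi))) * math.exp(-(x - mu) ** 2 / (2 * sigma ** 2))

    mu, sigma = 100, 15                                # made-up parameters (an IQ-style scale)
    print(round(normal_pdf(110, mu, sigma), 6))        # height of the curve at x = 110 (about 0.0213)
    print(round(NormalDist(mu, sigma).pdf(110), 6))    # same value from the standard library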
7.2 – Applications of the Standard Normal Distribution
The standard normal distribution – is a special normal distribution. It has mean 𝜇 = 0 and
standard deviation 𝜎 = 1. A standard normal variable is always denoted by the variable Z so it’s
often referred to as the “Z distribution”. Z is only used to denote a standard normal variable.
IMPORTANT TO NOTE! Because any normal distribution is symmetric and the mean and standard deviation of the standard normal are 0 and 1, 99.7% (ALMOST ALL) of Z-scores are within +/- 3 standard deviations of the mean by the Empirical Rule.
There is special notation for the Z-score that has a given area α under the standard normal curve to its right.
Z_α (said “Z of alpha” or “Z sub alpha”) denotes the Z-score having area α specifically to its right under the standard normal curve. (α is usually a teeny, weeny amount of probability)
We can use the normal calculator in StatCrunch backwards to find Z_α.
In StatCrunch:
Stat, Calculators, Normal
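If you're away from StatCrunch, the same forward and "backwards" lookups can be sketched with Python's statistics.NormalDist (my illustration, not part of the course):

    from statistics import NormalDist

    Z = NormalDist(mu=0, sigma=1)          # the standard normal (Z) distribution

    # Area under the curve to the LEFT of z = 1.96
    print(round(Z.cdf(1.96), 4))           # about 0.975

    # "Backwards": find Z_alpha, the Z-score with area alpha to its RIGHT
    alpha = 0.05
    z_alpha = Z.inv_cdf(1 - alpha)         # area 1 - alpha to the left <=> area alpha to the right
    print(round(z_alpha, 4))               # about 1.6449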
7.3 – Assessing Normality

The distribution of a sample tends to (and should) mimic the distribution of the population it came from. This fact is used to decide whether a random variable is normally distributed.
If you were to construct a Normal Probability Plot by hand: (We’ll do it in StatCrunch)
1. Order the data.
2. Match each data value with a normal score based on the sample size (from a table on the
internet).
3. Plot the data values and normal scores like ordered pairs.
Normal scores are the values you would expect (for a specific sample size) if the data were normally distributed. If the data do come from a normal population, the normal scores and the observed values match closely and the plotted points fall roughly along a straight line.
In StatCrunch:
Graph, QQ Plot
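A rough Python sketch of the by-hand steps above (my illustration; it uses one common formula for expected normal scores, and different tables/software use slightly different formulas):

    from statistics import NormalDist

    data = [12, 15, 9, 14, 11, 13, 10, 16]        # a small made-up sample

    ordered = sorted(data)                        # step 1: order the data
    n = len(ordered)

    # Step 2: expected normal scores. One common choice is the z-score of (i - 0.375)/(n + 0.25)
    # for the i-th ordered value (i = 1, ..., n).
    normal_scores = [NormalDist().inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]

    # Step 3: pair each ordered data value with its normal score; these pairs are what the plot shows
    for z, x in zip(normal_scores, ordered):
        print(round(z, 3), x)                     # roughly linear pairs suggest a normal population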