STAB22 section 5.1

5.1 We go through the 1500 students one at a time and ask each one
“did you use the Internet to find a place to live?”. So n, the
number of trials, is 1500. Each trial gives us a “success” or a
“failure”, which we are free to define however we like: we could
define a success as a Yes answer, or we could define a success as
a No answer. Since we are really interested in the Yes answers
here, it makes more sense to define success as Yes. In that case,
X is the number of Yes answers observed, which, in this sample,
is 525. Then p̂ = 525/1500 = 0.35. (If you chose to define success
as No, your X is the number of No answers, 975, and your p̂ is
975/1500 = 0.65.)

5.2 200 seniors were questioned, so n = 200. p̂ is the fraction in your
sample that were successes (said that they had taken a statistics
course): 40% or 0.40. The number of successes in your sample
must have been 40% of 200, 80, which is your value of X.

5.3 Each coin toss is independent, with the same probability 0.5 of
getting a head each time. So the number of heads in 20 tosses
has a binomial distribution with n = 20 and p = 0.5. When you
actually do it, there’s no knowing what number of heads you’ll
get, but the values near the middle (10) are more likely: you’d
“expect” to get about half heads and half tails. So 11 heads is
more likely than 19, though both are possible.

5.4 According to the genetic theory, each child inherits genes from
its parents independently of other children. So we have n = 4
“trials” (children) who each have probability 0.25 of having type
O blood, so the number of children who actually do end up having
type O blood has a binomial distribution with n = 4 and p = 0.25.
(You say that something has a particular distribution before you
observe any data; once you have the data, you have a value like
“2 children with type O blood”, not a distribution.)

5.5 To solve (a), look at Table C, page T-7. Find the rows of the
table with n = 4 (the third block down), and the column with
p = 0.3. The numbers in that column give the probabilities of
observing each possible value k in a binomial distribution with
that n and that p. Thus for n = 4 and p = 0.3, the probabilities
of observing 0, 1, 2, 3, 4 successes are 0.2401, 0.4116, 0.2646,
0.0756 and 0.0081 respectively. P(X = 0) you just read off the
table as 0.2401, and for P(X ≥ 3) you add up the ones you want:
P(X ≥ 3) = P(X = 3) + P(X = 4) = 0.0756 + 0.0081 = 0.0837.

If you are going to use Table C for (b), you’ll need to arrange
things so that the success probability is 0.5 or less. What you do
is to interchange successes and failures: subtract p from 1 to get
1 − 0.7 = 0.3, and subtract the numbers of successes from n so
that 4 successes in the question becomes 4 − 4 = 0 successes in
your calculation, and X ≤ 1 becomes X ≥ 4 − 1 = 3. (Note that
the ≤ becomes ≥). Since the values for n, p, k are now the same
as in (a), the answers will be the same too.

The thought process in doing part (b) by Table C explains why
the answers are the same: whatever is not a success is a failure,
and you can count either.

You can use Minitab instead of Table C. Select Calc, Probability
Distributions and Binomial. Fill in the Number of Trials (n) and
the Probability of Success (p). At the top you can select either
Individual Probability (for working out P(X = 0)) or Cumulative
Probability (for working out P(X ≤ 1): always ≤). If you are
working out something like ≥, you have to rewrite what you want
as a ≤. For instance, P(X ≥ 3) = 1 − P(X ≤ 2) since X has to
be either 3 or bigger, or 2 or smaller. Finally, at the bottom of
the dialog box click on Input Constant, and put in your value of
k. So for the first part of (a), click on Probability, put in n = 4,
p = 0.3, click on Input Constant and enter 0 in the box. My result,
0.2401, is shown in Figure 1. For the second part, get the dialog
box again, click on Cumulative Probability, make sure n and p
are correct, and enter 2 next to Input Constant (which should
still be selected). This gives 0.9163, so the answer you want is
1 − 0.9163 = 0.0837. For (b), first click on Probability again,
enter n = 4 and p = 0.7 now, and next to Input Constant enter 4.
This again gives 0.2401. Then click on Cumulative Probability,
ensure that n = 4 and p = 0.7 are correct, and enter 1 (for
≤ 1) next to Input Constant. This gives 0.0837, and there’s no
subtracting from 1 this time. These answers are all the same as
you get from Table C. The advantage of using Minitab is that it
will give you answers for all combinations of n and p, not just
the ones that happen to appear in Table C. In fact, Minitab can
handle large values of n as well, so that even if you are using the
normal approximation by hand, you can get Minitab to give you
the exact answer (and you can see how good your approximation
was).

Probability Density Function
Binomial with n = 4 and p = 0.3
x    P( X = x )
0    0.2401
Figure 1: Minitab binomial probability output

5.6 Here n = 100 and p = 0.5 (fair coin). Use the formulas in the
box at the top of page 322 to find that µp̂ = p = 0.5 and
σp̂ = √((0.5)(0.5)/100) = 0.05: that is, the fraction of heads you’ll
get will be close to 0.5 (50%) almost certainly, because the SD of
p̂ is small.

This is not the same as the mean and SD of the count of the
number of heads, because the proportion of heads will be about
0.50 (regardless of the number of times you toss the coin), whereas
the number of heads will be about half the number of tosses:
if you toss the coin 100 times, you’d expect about 50 heads. If
you want the mean and SD of the count, use the formulas in the
middle of page 320: the count of the number of heads has mean
np = (100)(0.5) = 50 and SD √((100)(0.5)(0.5)) = 5. This says
that the number of heads should be relatively close to 50, which
is what you’d expect.

5.7 The first thing we need when using a normal approximation is the
mean and SD of the thing we’re normal-approximating, here the
proportion of heads in 100 tosses. We can use the answers from
5.6 for this: mean 0.5, SD 0.05. Then turn the given values into
z-scores and use Table A as in §1.3.

0.4 has z-score (0.4 − 0.5)/0.05 = −2, and 0.6 has z-score
(0.6 − 0.5)/0.05 = 2. These give 0.0228 and 0.9772 in Table A, so
subtract these to give the probability: 0.9772 − 0.0228 = 0.9544.
There is a better than 95% chance that the proportion of heads
after 100 tosses will be between 0.4 and 0.6, which may strike you
as surprisingly high, but that’s the way it works.

For (b), follow the same steps: 0.45 has z-score (0.45 − 0.5)/0.05 =
−1 and 0.55 has z-score (0.55 − 0.5)/0.05 = 1, so the chance of
ending up between these is 0.8413 − 0.1587 = 0.6826.

You might also notice that “between 0.4 and 0.6” is “within 2 SDs
of the mean”, and “between 0.45 and 0.55” is “within 1 SD of the
mean”, so that the 68-95-99.7 rule gives the answers (without
using a table) as “about 95%” and “about 68%” respectively.

Since we know we’re tossing the coin 100 times, the question could
also have asked “find the probability that the number of heads is
between 40 and 60, between 45 and 55”, which we would have
expected to give the same answers. To do it this way, we use
the mean and SD of the count of heads, 50 and 5, as I found at
the end of 5.6. Using these figures with the proper mean and SD
gives the same z-scores as working with the proportions, and so
the same answers.

You can get the exact answers from Minitab, for comparison with
your normal-approximation answers, but you have to work with
the success counts (you can’t work with the sample proportions).
Use Cumulative Probability with n = 100 and p = 0.5. Before
you fill in the dialog box, though, you can make your life easier by
filling column C1 with the values 40, 60, 45 and 55 (the values you
want the probabilities for). Then you go to the dialog box (Calc,
Probability Distributions, Binomial), select Cumulative Probability,
enter n and p, and at the bottom click on Input Column and
type C1 into the box. This gets you all four probabilities of ≤
all at once. See Figure 2. Then to find the probabilities of
“between”, you subtract: the chance of being between 40 and 60 is
0.982400 − 0.028444 = 0.953956 and the chance of being between
45 and 55 is 0.864373 − 0.184101 = 0.680272. These are close
to what we found by the normal approximation; indeed, they are
close to the 68% and 95% we got without using any tables or
software at all.

Cumulative Distribution Function
Binomial with n = 100 and p = 0.5
 x    P( X <= x )
40    0.028444
60    0.982400
45    0.184101
55    0.864373
Figure 2: Cumulative binomial probabilities for 5.7

(You might be wondering whether we are finding the probability of
“between 40 and 60 inclusive”, or if we are omitting the endpoints.
This is a discrete distribution, so it makes a difference; recall,
for instance, exercise 4.55. Actually, here, we are including 60
and excluding 40; if we were going to include 40 as well, we’d
have to get P(X ≤ 39) and subtract that. Because n is so large,
though, the individual probabilities are very small (you can get
Minitab to show you how small), so including or excluding just
one doesn’t make much difference.)

5.9 (a) A properly tossed fair coin has no memory, so that individual
tosses are independent: what happened in the past (three
consecutive heads) has no influence over what’s going to happen this
time. “Tails are due” is a fallacy. (b) has the same reasoning: the
probability is still exactly 0.5. (c) p̂ is the sample proportion: you
perform the experiment and see how many successes you get, so
that p̂ is a number that you know afterwards. This is unlike p, one
of the parameters of the binomial distribution, the probability of
success, which would be known before you toss any coins, roll any
dice, etc.

5.10 (a) X is the number of successes, a number like 19, and not a
proportion at all (which would be a number like 19/50 = 0.38).
(b) is wrong two ways: the given quantity is an SD not a variance
(because of the square root), and it is the SD of the proportion
and not the count. (c) The accuracy of the normal approximation
depends on p as well as n: if p is very close to 0 or 1, even
n = 10000 might not be large enough. (If p is 0.5, even a small n
like n = 20 would do. Try this n and p, and also n = 10000,
p = 0.0001, in the rule of thumb in the box on page 323.)

5.11 (a) If the poll is a simple random sample, this one will be OK,
with n = 200 and p being some reasonable value for the probability
of a randomly chosen student being “usually irritable in the
morning”. (b) is no good because the number of trials (tosses) is
not fixed: every time you do this experiment, you’ll need a
different number of tosses. (c) is OK, again because it is a random
sample, with n = 500 and p = 1/12.

5.12 (a) There is no notion of “success” here. If a count were made of
the number of students with mean systolic blood pressure greater
(or less) than some target value, that count could be binomial. (b)
looks OK (random sampling) with a fixed sample size (20), and
a clear definition of “success” (defect) and “failure” (no defect).
(c) also looks OK, for the same reason: a student will either
report that they eat the required amount of fruits and vegetables
(success), or report that they don’t (failure).

5.13 The number of errors caught will have a binomial distribution
with n = 20 and p = 0.7. The number of errors missed also has
a binomial distribution with n = 20 and p = 0.3, interchanging
successes and failures.

(You wouldn’t tell the student proofreader that there were 20
errors, because he or she might keep trying until finding all 20,
but from your point of view there are 20 opportunities to catch
an error, and each time the student may or may not succeed.
Some errors might be easier to catch than others, which would
make the probability of success at each trial unequal, but we’re
not worrying about that here.)

In (b), we’re counting the number of errors missed, so p = 1 −
0.7 = 0.3. Table C (starting at page T-6 in the back of the
textbook) has binomial probabilities; the second table on page
T-9, with n = 10 and values of p from 0.10 to 0.50, is the one you
need. Look in the p = 0.30 column, and add up the probabilities
from 4 on (to get “4 or more errors missed”): this is 0.2001 +
0.1029 + 0.0368 + 0.0090 + 0.0014 + 0.0001 = 0.3503. This is quite
high, because 10 errors isn’t very many, and it’s quite likely to
have this poor a performance by chance.

Notice that Table C doesn’t have any probabilities for p > 0.5.
This is because you can always rephrase a problem to use a p less
than 0.5. Another way to ask the question in (b) is: “how likely
is it that the proofreader will catch 6 or fewer errors of the 10?”.
The connection is: interchange successes and failures (6 errors
caught is 10 − 6 = 4 errors missed), and replace p (here 0.7) with
1 − p (1 − 0.7 = 0.3). Either you’ll have a p you can use directly,
or you can get one by this recipe.

If you want to, you can do this question using Minitab. Pages
96–97 of the Minitab manual show you how.

5.14 The number of visitors has a binomial distribution with n = 15
and p = 0.5, approximately.

Consult Table C, page T-9, with n = 15 and use the p = 0.5
column. Take the probabilities for k = 8 onwards, and add them
up. This gives 0.1964 + 0.1527 + 0.0916 + 0.0417 + 0.0139 + 0.0032 +
0.0005 = 0.5000. (Because p = 0.5, the numbers in Table C are
the same ones going up and down, and they add up to 1, and the
numbers you want are just the half that go down. So you could
guess that the answer is 0.5 without adding them up.)

5.15 The mean is np. For the number of errors caught, this is
(20)(0.7) = 14 and for the number of errors missed the mean
is (20)(0.3) = 6. (These add up to 20 as they should.)

The SD of the number of errors caught is √(np(1 − p)) =
√((20)(0.7)(0.3)) = 2.05. (The SD of the number of errors missed
is the same, because the formula has the same numbers multiplied
together in a different order.)

If p goes up to 0.9, the SD becomes √((20)(0.9)(0.1)) = 1.34, which
is smaller. If p goes up to 0.99, the SD decreases further to 0.44.
If the probability of success gets closer and closer to 1, the
proofreader will make fewer and fewer mistakes, so the number of
errors caught will get (almost certainly) closer and closer to 20.
The spread will decrease to nothing, so the SD should (and does)
approach zero.

5.16 The mean of the count is np = (15)(0.5) = 7.5. The mean of the
proportion p̂ is np/n = p = 0.5 no matter what n is.

When n = 150, the mean count of people visiting is np =
(150)(0.5) = 75, and the mean proportion of people visiting is
p = 0.5. When n = 1500, the mean count of people visiting
is np = (1500)(0.5) = 750, and the mean proportion of people
visiting is still p = 0.5.

The mean count of people visiting goes up as n goes up, but the
mean proportion of people visiting is constant at 0.5. (If you work
out the standard deviations using the formulas on pages 320 and
322, you’ll find that the ones for the counts go up, and the ones
for the proportions go down. With a larger n, that is, a larger
sample, the sample proportion becomes more predictably close to
0.5.)

When you take an actual sample of male internet users, the actual
behaviour is not quite as predictable as this as n increases, but
you can say, almost certainly, that the count of successes
will increase as n increases, and the proportion of successes
will head towards 0.5 as n increases.
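
The pattern described in 5.16 (count SDs growing, proportion SDs shrinking) can be checked with a few lines of Python. This is only a sketch to accompany the hand calculation: the function name binomial_summary is made up for illustration, and the formulas are the usual np, √(np(1 − p)), p and √(p(1 − p)/n).

```python
import math

def binomial_summary(n, p):
    """Return (mean of count, SD of count, mean of proportion, SD of proportion)."""
    mean_count = n * p
    sd_count = math.sqrt(n * p * (1 - p))
    mean_prop = p
    sd_prop = math.sqrt(p * (1 - p) / n)
    return mean_count, sd_count, mean_prop, sd_prop

# The three sample sizes discussed in 5.16, all with p = 0.5.
for n in (15, 150, 1500):
    mc, sc, mp, sp = binomial_summary(n, 0.5)
    print(f"n={n:5d}  count: mean={mc:7.1f} SD={sc:7.3f}   "
          f"proportion: mean={mp:.2f} SD={sp:.4f}")
```

Reading down the printed table, the count columns increase with n while the proportion SD falls, which is the point made in the parenthetical remark of 5.16.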
5.19 For “0” in the question, read “any particular digit, such as a
5”. The number of 5’s in a group of 5 digits has a binomial
distribution with n = 5 and p = 0.10. So go into Table C (page
T-7). Probability of at least one five is one minus probability of
no five; this is 1 − 0.5905 = 0.4095. (Or take the probabilities for
k = 1, 2, 3, 4, 5 and add them up. Or, if you really want to, use
the binomial probability formula to get 1 − (0.9)^5 = 0.4095.)

In lines 40 digits long, n = 40, so the mean number of fives is
(40)(0.1) = 4. About one-tenth of all digits are fives, so in a line
of 40 digits, about 4 of them will be fives. (This doesn’t mean
that exactly 4 will be fives; occasionally you’ll get 4, but usually
you’ll get more or fewer.)

5.21 n = 4 and p = 0.25 (one quarter). Copy the five entries from
Table C for n = 4, p = 0.25 (page T-7), and draw yourself the
histogram as shown in Figure 3. The vertical scale (labelled
“frequency”) is actually the probability times 10000; I had to do this
to get Minitab to plot it.

The mean is np = (4)(0.25) = 1, which goes right under the 1 bar
on the histogram; this bar happens to be the tallest.

Figure 3: Probability histogram for blood type data

5.22 n is large and p is near 0.5, so use the normal approximation.
The mean (for the sample proportion) is µp̂ = 0.49 and the SD is
σp̂ = √((0.49)(0.51)/1016) = 0.0157. A sample proportion of 0.46
gives a z of (0.46 − 0.49)/0.0157 = −1.91, and 0.52 gives z = 1.91.
Table A gives the chance of being between these values as 0.9719 −
0.0281 = 0.9438. (To use Minitab, 1016(0.46) = 467 successes,
and 1016(0.52) = 528 successes. In a binomial distribution with
n = 1016 and p = 0.49, the chance of a number of successes
between these values is 0.972824 − 0.028387 = 0.944437, so the
normal approximation is very close.)

The chance of getting a sample proportion here between 0.46 and
0.52, that is, within 0.03 of the true value 0.49 of p, is high, close
to 95%. But it is not a certainty. Some of the possible samples
you could draw will have a sample proportion less than 0.46 or
bigger than 0.52 (that is, outside the poll’s stated margin of error
of 0.03). We will see in §6.1 that it is impossible to be certain,
because anything could happen in a sample, so what we do is to
offer something like a margin of error that is correct “19 times
out of 20”, that is, it has probability 0.95 of being correct, over
all the possible samples that we could take. If we wanted to have
a higher chance of being correct, say 99%, we’d have to accept a
larger margin of error.
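
For the poll in 5.22, the "turn it into z and look it up" step can be reproduced without Table A. The sketch below is an illustration only: normal_cdf and prob_phat_within are hypothetical helper names, and the normal CDF is computed from math.erf rather than read from a table, so the result differs slightly from an answer built on a z rounded to two decimals.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative probability P(Z <= z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_phat_within(p, n, margin):
    """Normal-approximation chance that the sample proportion lies within `margin` of p."""
    sd = math.sqrt(p * (1 - p) / n)   # SD of the sample proportion
    z = margin / sd                   # z-score of the upper limit
    return normal_cdf(z) - normal_cdf(-z)

# Poll from 5.22: p = 0.49, n = 1016, margin of error 0.03.
print(round(prob_phat_within(0.49, 1016, 0.03), 4))
# A wider margin gives a higher chance of being "correct".
print(round(prob_phat_within(0.49, 1016, 0.04), 4))
```

The second call illustrates the closing remark of 5.22: to be right more often, you have to accept a larger margin of error.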
5.23 The calculations here are like those in 5.22: use the normal
approximation to the binomial. Here, n is large but p is not
that close to 0.50, so we should check the rules of thumb: for
p = 0.3, np = 1011(0.3) = 303.3 and n(1 − p) = 1011(0.7) =
707.7, both of which are safely greater than 10, and for p = 0.06,
np = 1011(0.06) = 60.66 and n(1 − p) = 1011(0.94) = 950.34, so
this is also safe.

When n = 1011 and p = 0.30, the mean and SD (for p̂) are 0.30
and √((0.3)(0.7)/1011) = 0.0144. The z values for 0.28 and 0.32
are ±(0.32 − 0.30)/0.0144 = ±1.39, so the probability of being
between is 0.9177 − 0.0823 = 0.8354.

If p = 0.06, the mean and SD are 0.06 and √((0.06)(0.94)/1011) =
0.0075. The z-values for 0.04 and 0.08 are ±(0.08 − 0.06)/0.0075 =
±2.68, so the probability of being between is 0.9963 − 0.0037 =
0.9926.

The probability of being within 0.02 of p appears to be getting
larger as p gets smaller. You can reason this out in a couple of
ways: first, as p gets closer to 0, σp̂ is getting closer to 0 as well,
which means that p̂ is almost certainly close to p, and the chance
of being “within” anything will get closer to 1. Second, as p gets
closer to 0, the chance of observing any successes at all in your
sample gets smaller, and thus the sample proportion p̂ is more
likely to be close to 0 (ie. very close to p) as well.

5.24 The calculations here are the same idea again: use the normal
approximation to the binomial and the mean and SD of p̂ to
get z-values and probabilities for the various values of n. Since
you have to do the calculations several times over, you can use
a spreadsheet to do the repetitive calculations for you. Mine is
shown in Figure 4. If you can follow the calculations in 5.22, you’ll
be able to see what I’m doing here.

Figure 4: Spreadsheet for calculations of 5.24

The probability of getting a sample proportion within 0.03 of
the true p appears to be (and is) heading towards 1. That is,
with a larger sample, the sample proportion is more likely to
be close to the population proportion. If you were able to take
an infinitely large sample (or sample the whole population), you
would be certain to get p̂ = p. In real life, though, you’ll have to
accept that your sample proportion won’t be exactly equal to the
population proportion.

5.25 The sample proportion is 140/200 = 0.7 or 70%. You can use
a normal approximation to find the chance that 140 or more
students in a sample of 200 would support the crackdown, if p = 0.67.
Or you can pull the answer out of Minitab: the probability of 139
or less is 0.7951, so the chance of 140 or more is 0.2049. A normal
approximation gives 1 − 0.8165 = 0.1835, which is not bad.

The upshot of this calculation is that if the proportion of students
favouring the crackdown is really 0.67, it is quite likely (the
probability is about 0.20) that you will get as many as 140 = 70% in
favour in your sample, just by chance. So this, by itself, is not
evidence that the proportion of students in favour at “your
college” is higher than 0.67. (Your letter needs to make this point:
because of random sampling, the result that was observed could
easily have happened by chance.)

If you really wanted evidence that “your college” was different,
you would have to either (a) get a sample proportion quite a bit
bigger, or (b) get the same result (70%) as here with a bigger
sample. With a bigger sample, a sample proportion as high as
0.70 becomes progressively less likely if p = 0.67, and so with a
bigger sample you would be more entitled to conclude that p is not
equal to 0.67 after all. (This is the logic of a test of significance,
which we’ll see a lot more of in §6.2).
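
The comparison in 5.25 between the exact binomial answer and the normal approximation can be reproduced outside Minitab. The following is a sketch under stated assumptions: the helper names are invented for illustration, the exact tail is a direct sum of binomial probabilities using math.comb, and the approximation uses no continuity correction.

```python
import math

def binom_pmf(k, n, p):
    """P(X = k) for a binomial count with n trials and success probability p."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def binom_upper_tail(k, n, p):
    """Exact P(X >= k), summing individual binomial probabilities."""
    return sum(binom_pmf(j, n, p) for j in range(k, n + 1))

def normal_upper_tail(k, n, p):
    """Normal-approximation P(X >= k), without a continuity correction."""
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    z = (k - mean) / sd
    return 1.0 - 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# 140 or more supporters in a sample of 200 when p = 0.67, as in 5.25.
print("exact: ", round(binom_upper_tail(140, 200, 0.67), 4))
print("normal:", round(normal_upper_tail(140, 200, 0.67), 4))
```

Both figures come out near 0.2, which supports the conclusion that a 70% sample result is unremarkable when p = 0.67. Rerunning with a larger n and the same 70% (say 700 out of 1000) shows the tail probability dropping, which is option (b) in the last paragraph of 5.25.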
5.27 (a) There are four shapes, of which the subject guesses one,
so p = 1/4. (b) The number of shapes guessed correctly out of 20
has a binomial distribution with n = 20 and p = 1/4 = 0.25, so from
Table C (page T-10), the probability of 10 or more correct guesses
is 0.0099 + 0.0030 + 0.0008 + 0.0002 = 0.0139. (c) This is just
the mean and SD of the binomial distribution here, ie. mean is
np = 20(0.25) = 5, variance is np(1 − p) = 20(0.25)(0.75) = 3.75,
and SD is √3.75 = 1.94 guesses. (d) Knowing that the deck has
exactly 5 of each card might change things: for instance, if the
subject hasn’t seen a star in the first 10 cards, he/she knows that
5 of the last 15 cards are stars and may start guessing stars, with
a higher chance of being correct (that is, the chance of guessing
a card correctly isn’t constant all the way through, and therefore
a binomial distribution is no good.) This is the same strategy as
“counting face cards” if you are playing blackjack; counting cards
is a winning strategy in a casino if you are discreet enough not to
get thrown out!

5.28 The mean is np = 1200(0.75) = 900, variance is
1200(0.75)(0.25) = 225, so SD is √225 = 15.

Here, n = 1200 is larger than Table C has, so we need to use
the normal approximation. The idea is to find the mean and
SD (as we just did) and “pretend” that the count has a normal
distribution with this mean and SD. (It actually has a binomial
distribution, of course, but with a large n we can often get away
with it. See the “rule of thumb” calculations below.)

950 is a “value”, so turn it into a z and then look it up in Table
A, using the mean and SD you just found. This gives:
z = (950 − 900)/15 = 3.33.

From Table A, the probability of “less than” is 0.9996, so the
probability of “more than” is 1 − 0.9996 = 0.0004. So the college
will rarely get caught out: most of the time, they won’t end up
with more than 950 students following this strategy.

We’re using the normal approximation to the binomial here because
n is large (1200) and p is not too far from 0.5. The rule of
thumb on page 323 says this will be OK if np ≥ 10 and n(1 − p) ≥
10. Here np = 1200(0.75) = 900 and n(1 − p) = 1200(0.25) = 300,
so we are safe.

(You might be concerned that the binomial distribution deals with
whole numbers, whereas the normal distribution deals with fractional
numbers. Or you might be thinking that “at least 950” and
“more than 950” are different: in the first you include 950, and
in the second you don’t. But the normal approximation above
treats them the same way. You can get a more accurate answer
using a “continuity correction”: ask yourself “what decimal number
would round off to the whole number I want?” In this case,
“more than 950” means “bigger than 950.5”, so use 950.5 instead
of 950 to get z = (950.5 − 900)/15 = 3.37. With large n, this
often doesn’t make much difference; here the probability is a little
smaller, but is the same to 4 decimals. If you are not concerned
about this, don’t worry; you don’t need to know continuity
corrections in this course.)

Change n to 1300, so the mean changes to np = 1300(0.75) = 975,
variance becomes np(1 − p) = 1300(0.75)(0.25) = 243.75 and SD
is √243.75 = 15.6125. Now z = (950 − 975)/15.6125 = −1.60, so
prob. is 1 − 0.0548 = 0.9452. The college is now very likely to end
up with too many students.

Or use the continuity correction and start from 950.5, so get z =
(950.5 − 975)/15.6125 = −1.57 and a prob. of 1 − 0.0582 = 0.9418.
This time the continuity correction makes more of a difference
(though still not much).
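
The z-score calculations in 5.28, with and without the continuity correction, follow the same recipe each time, so they can be wrapped in one small function. This is a sketch for checking the hand calculations: prob_more_than is a hypothetical name, and the normal probability comes from math.erf rather than Table A.

```python
import math

def prob_more_than(threshold, n, p, continuity=False):
    """Normal-approximation P(X > threshold) for a binomial count X.

    Returns (z, probability). With continuity=True the cut-off is moved
    up by 0.5, so "more than 950" is treated as "bigger than 950.5".
    """
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    cutoff = threshold + 0.5 if continuity else threshold
    z = (cutoff - mean) / sd
    prob = 1.0 - 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return z, prob

# The four cases worked by hand in 5.28.
for n in (1200, 1300):
    for cc in (False, True):
        z, prob = prob_more_than(950, n, 0.75, continuity=cc)
        print(f"n={n} continuity={cc!s:5}  z={z:6.2f}  P(X > 950)={prob:.4f}")
```

The printed z values should match the ones found by hand; the probabilities may differ in the last decimal place from Table A answers because the table works with z rounded to two decimals.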
5.29 The success prob. is 1/5 = 0.2, so the mean is 900(0.2) = 180, the
variance is 900(0.2)(0.8) = 144, and the SD is √144 = 12. For
the proportion, which you get by dividing the count by n, divide
the mean and SD by n as well to get a mean of 180/900 = 0.2
and an SD of 12/900 = 0.0133. (Or you can use the formulas for
p̂ in the box on page 323).

For (c), use the mean and SD you got in (b) along with the normal
approximation, so z = (0.24 − 0.2)/0.0133 = 3.00 and the prob.
is 1 − 0.9987 = 0.0013. You may not think that 24% is a very
impressive performance, but see the discussion below.

The last part has you working backwards, from the table to a
z to a proportion. The probability to be looked up backwards
in the table is 1 − 0.01 = 0.99 (we want “this well or better”),
which goes with z = 2.33 (the closest value). Turn this back into
a value by multiplying by the SD and adding the mean, to get
(2.33)(0.0133) + 0.2 = 0.2310; that is, a subject must get about 23%
or more successes, or 208 out of 900, to have evidence of ESP. (You
might think that this is not much more than the 20% a subject
could get by guessing, but with so many attempts (trials), it’s
very unlikely that someone could do this well by guessing alone.)

Sanity-checking: the chance of at least 24% successes is 0.0013,
which is less than 0.01, the chance of at least 23% successes.

5.33 You can use the normal approximation to the binomial for this
one; we are dealing with proportions, so be sure to use the right
formulas for the mean and SD.

For Jodi, n = 100 and p = 0.85, so for the proportion she gets
correct, the mean is p = 0.85, and the variance is p(1 − p)/n =
(0.85)(0.15)/100 = 0.001275, so the SD is √0.001275 = 0.0357.
To get 80% or lower, her z is z = (0.80 − 0.85)/0.0357 = −1.40,
giving a probability from the table of 0.0808.

(b) is the same thing, but with n = 250, so mean is 0.85, variance
(0.85)(0.15)/250 = 0.00051. z = (0.80 − 0.85)/√0.00051 = −2.21,
so probability is now 0.0136. With more questions, Jodi’s
proportion of correct answers should be closer to her p of 0.85, so the
chance of her getting 80% or lower, which is less than she would
expect, goes down compared with n = 100.

To cut the SD for the proportion in half, n has to be multiplied
by 2² = 4 (because of the √n on the bottom of the formula). Try
it by calculation or algebra if you don’t believe me. You’d need
to solve
√((0.85)(0.15)/n) = 0.0357/2
for n. That means that 400 questions would be needed (but there
is the little matter of how long a 400-question exam would take!)
This holds true for any p, including the p = 0.75 of Laura in part
(d). To do that one by calculation, figure out Laura’s SD for 100
questions, divide that in half, and put it equal to √((0.75)(0.25)/n),
solving for n.

5.34 For this question, the binomial distribution no longer applies
because we no longer have a fixed number of trials: the number
of rolls of the die is whatever it needs to be to get a 1. Thus for
(a), 5/6 × 1/6 = 5/36; (b) is 5/6 × 5/6 × 1/6 = 25/216; and for (c)
the answers are 5/6 × 5/6 × 5/6 × 1/6 and
5/6 × 5/6 × 5/6 × 5/6 × 1/6. To get the first 1 being on
the k-th roll, you need k − 1 non-1’s followed by a 1, and this has
probability (5/6)^(k−1) (1/6). This distribution for the number of
rolls to get the 1st 1 is called a geometric distribution; see exercise
5.35 for more.

5.35 Y could be 1, 2, 3, 4 and so on (it could be very large because
you might wait a long time for the first success). Using the same
ideas as in 5.34, Y = 1 if you get a success on the first trial, which
happens with probability p; Y = 2 if you get a failure followed by
a success, which happens with probability (1 − p)p, and Y = k in
general if you get k − 1 failures followed by a success in that order
(because if you get your success any earlier, Y is not equal to k
any more), which has probability (1 − p)^(k−1) p. These probabilities
form an “infinite series”, because Y could be as big as you like;
all the infinite number of probabilities add up to 1, as they should
(the rationale being that you can get the total as close to 1 as you
like by taking enough probabilities and adding them up).
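
The geometric probabilities of 5.34 and 5.35, and the claim that they add up to 1, can be checked numerically. The sketch below uses exact fractions so that the die probabilities appear as 5/36, 25/216 and so on; geometric_pmf is a name chosen for this illustration, not something from the textbook or from Minitab.

```python
from fractions import Fraction

def geometric_pmf(k, p):
    """P(Y = k): k - 1 failures followed by a success on trial k."""
    return (1 - p) ** (k - 1) * p

p = Fraction(1, 6)  # chance of rolling a 1 with a fair die, as in 5.34

# First 1 on the 2nd, 3rd, 4th and 5th roll.
for k in (2, 3, 4, 5):
    print(k, geometric_pmf(k, p))

# Partial sums creep up towards 1 as more terms are included.
for terms in (5, 20, 50):
    total = sum(geometric_pmf(k, p) for k in range(1, terms + 1))
    print(terms, float(total))
```

The partial sum of the first m probabilities equals 1 − (1 − p)^m, so the shortfall from 1 is the chance of m failures in a row, which shrinks towards zero as m grows.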