Chapter 6: Modeling Random Events: Normal and Binomial Models

In Chapters 2 and 3, we explored the idea of a distribution from a sample of data, which
tells us the values in the sample and their frequency. For numerical variables, we used
histograms, dotplots or stemplots to visualize the distribution, and we learned how to
describe the distribution in terms of shape, center, and spread. Now we turn our attention
to distributions of random variables, which tell us the possible values (outcomes) of an
experiment and their probability.

6.1: Probability Distributions are Models of Random Experiments

A random variable is a numerical variable that assumes values associated with the
random outcomes of an experiment. Random variables may be either discrete or
continuous. Discrete random variables can assume values that you can list or count.
Continuous random variables may take on any real number in some range or interval.

The difference between continuous and discrete is analogous to the difference between
buying coffee by the cup or by the pound. For example, you might ask a barista for two cups
of coffee (one for you and one for a friend). The number of cups of coffee is a discrete
random variable. It would not make sense to ask for 1.64 cups of coffee. But if you are
buying coffee beans, the coffee is weighed and you might end up with a number like 1.64
pounds. The number of pounds of coffee is a continuous random variable.
Ex1: Determine whether each of the following random variables is discrete or
continuous.
a. Shoe size of a person
b. Weight of a person
c. Length of your last phone call
d. Number of texts you sent today
e. Percent of yesterday spent watching TV
A probability distribution, also known as a probability distribution function, tells us all
the possible outcomes for a random variable, and the probabilities associated with those
outcomes.

All probability distributions must meet two requirements. We will assume x is a random
variable with probability distribution function p(x).
1. For any x, p(x) ≥ 0.
2. The total of the probabilities for all possible outcomes is equal to 1.
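The two requirements can be checked mechanically. The following is a minimal Python sketch, not part of the original notes: the function name `is_valid_distribution` and the fair-die table are illustrative assumptions, and a small tolerance is allowed because decimal probabilities rarely add to exactly 1 in floating point.

```python
import math

def is_valid_distribution(table):
    """Return True if a {outcome: probability} table meets both requirements."""
    probs = list(table.values())
    # Requirement 1: every probability is at least 0.
    nonnegative = all(p >= 0 for p in probs)
    # Requirement 2: the probabilities total 1 (allowing for rounding error).
    totals_one = math.isclose(sum(probs), 1.0, abs_tol=1e-9)
    return nonnegative and totals_one

# Hypothetical example (not from the notes): one roll of a fair six-sided die.
fair_die = {face: 1 / 6 for face in range(1, 7)}
print(is_valid_distribution(fair_die))

# A table whose probabilities total 0.9 fails the second requirement.
bad_table = {0: 0.5, 1: 0.4}
print(is_valid_distribution(bad_table))
```

The same check could be applied to any table you build by hand, such as the ones requested in the examples that follow.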

Discrete Probability Distributions:

The probability distribution of a discrete random variable can be given as a graph, table, or
formula.

Ex2: Consider the experiment of flipping a coin three times and let the random variable x
represent the number of heads recorded.
a. Find a probability distribution for this experiment in table form. Then graph the
distribution as a stick graph and as a histogram.
Sample space:
x
p(x)
b. Find P(x ≥ 1), that is, the probability of getting at least one heads.
c. Find and interpret p(2).
Ex3: A microchip manufacturer knows that, based on quality control studies, two
microchips out of every lot of 100 chips produced will be defective. Suppose that
one microchip is randomly selected from each of two random lots of 100. Let x be
the total number of defective chips selected.
a. Find a probability distribution for this experiment in table form.
x
p(x)
b. Verify that the table meets both requirements for a probability distribution.

Continuous Probability Distributions:

A continuous random variable x has a probability distribution that is a smooth curve (as
opposed to a stick graph or histogram with bars). The curve is called a probability density
function. Probabilities are represented as areas under this curve. This means that the total
area under the curve must be equal to 1.

There is no area under the curve at a single point, say x = a. So, P(x = a) = 0. This means
that there is no probability of obtaining a specific value x in a continuous probability
distribution; we have to talk about the probability of obtaining some x in a range of values.

The area A under the curve between the two points a and b is the probability that x
assumes a value between a and b, P(a < x < b). Also, since P(x = a) = 0, this
means that P(a ≤ x ≤ b) is the same as P(a < x < b) (it makes no difference if you see ≤ or <).

The Uniform Distribution:

Continuous random variables that have equally likely outcomes over their range of possible
values possess a uniform probability distribution. The function that represents the
probability distribution is a constant function.
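Because a uniform density is a horizontal line, every probability is the area of a rectangle. Here is a minimal Python sketch of that idea. The function name `uniform_probability` and the interval from 0 to 10 are illustrative assumptions, not taken from the notes.

```python
def uniform_probability(low, high, a, b):
    """P(a < x < b) for a uniform density on [low, high], computed as an area."""
    height = 1 / (high - low)          # constant density, so total area is 1
    # Clip the requested interval to the range where the density is nonzero.
    left = max(a, low)
    right = min(b, high)
    width = max(0.0, right - left)
    return width * height              # area of a rectangle

# Hypothetical example: x is equally likely to fall anywhere between 0 and 10.
print(uniform_probability(0, 10, 2, 5))   # area of a 3-by-(1/10) rectangle
print(uniform_probability(0, 10, 4, 4))   # a single point has no area
```

The second call illustrates why the probability of any single value is 0 for a continuous random variable: the rectangle has no width.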
Ex4: Consider the spinning wheel from the game show Wheel of Fortune. We can
simulate spinning the wheel by imagining that the wheel itself is fixed and the small
arrow indicating the final wheel position spins around the wheel and has an equally
likely chance of landing in any position between 0° and 360°. (Assume 0° is at the
top of the wheel and x is measured clockwise.)
[Figure: the wheel next to its density curve. Vertical axis: Density (constant at 1/360); horizontal axis: x, Arrow Position (Degrees)]
The probability density function is shown next to the wheel. The probability density
curve is a constant function (horizontal line) at 1/360.
a. Find the total area under the density curve.
b. Find the probability that x = 90°.
c. Suppose the Bankrupt section occupies the portion of the wheel between 285°
and 300°. Find the probability of spinning Bankrupt. Show this probability as an
area under the density curve.
[Figure: blank density plot for shading. Vertical axis: Density (1/360 marked); horizontal axis: x, Arrow Position (Degrees)]
d. The four segments between and including the two $900 slots correspond to
values of x between 225° and 285°. The $5000 sector occupies the portion of the
wheel between 90° and 105°. Find the probability that a spin lands in either one
of these regions. Show this probability as an area under the density curve.
[Figure: blank density plot for shading. Vertical axis: Density (1/360 marked); horizontal axis: x, Arrow Position (Degrees)]

6.2: The Normal Model

Now we will investigate the most widely used probability model for continuous random
variables. Many numerical variables in real experiments have distributions for which this
particular density curve provides a very close fit.

Ex5: From the Links page, go to the Rossman/Chance site and load the One Proportion
simulation. Make sure probability of heads is 0.5, animation is off, and select
proportion of heads.
a. Run the simulation 100 times with 10 tosses. Describe the distribution.
b. Run the simulation 10,000 times with 50 tosses. Describe the distribution.
c. Turn on Summary Stats. Run the simulation 100,000 times with 100 tosses.
Describe the distribution.
We observe in Example 5 that as we increase the number of tosses, the distribution starts
to look less like a series of separate bars and fills in, more like a solid area. Also, the shape
of the distribution looks less jagged and more like a smooth, unimodal symmetric curve.
What we are seeing is an approximation of the Normal distribution, sometimes referred
to as the bell curve.

Our experiment shows a mean of x̄ = 0.5 and standard deviation of s = 0.05. For the
corresponding Normal distribution, we would use the Greek letters μ and σ. The Normal
distribution with a mean of μ = 0.5 and standard deviation σ = 0.05 is denoted as
N(0.5, 0.05). Here is a graph of the density curve.
[Figure: Normal density curve N(0.5, 0.05). Vertical axis: Density; horizontal axis: Proportion of heads (100 flips)]
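The coin-toss experiment can also be reproduced without the applet. The sketch below is a rough stand-in for the One Proportion simulation, not the applet's own code: the function name `simulate_proportions`, the fixed seed, and the choice of 2,000 repetitions are all assumptions made to keep the example short and repeatable.

```python
import random
import statistics

def simulate_proportions(repetitions, tosses, p_heads=0.5, seed=1):
    """Return the proportion of heads from each simulated run of coin tosses."""
    rng = random.Random(seed)  # fixed seed (an assumption) so the run is repeatable
    proportions = []
    for _ in range(repetitions):
        heads = sum(1 for _ in range(tosses) if rng.random() < p_heads)
        proportions.append(heads / tosses)
    return proportions

# 2,000 repetitions of 100 tosses; fewer than Example 5(c), to keep the run short.
props = simulate_proportions(repetitions=2000, tosses=100)
print(round(statistics.mean(props), 3))   # should land near 0.5
print(round(statistics.stdev(props), 3))  # should land near 0.05
```

Because the tosses are random, the printed summary values will only be close to 0.5 and 0.05, not exactly equal to them.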
Ex6: Suppose we flip a coin 100 times. Use the N(0.5, 0.05) Normal distribution to
represent the probability that:
a. the proportion of heads would be between 0.45 and 0.5.
b. the proportion of heads would be greater than 0.55.
Ex7: A 1992 study (https://www.ncbi.nlm.nih.gov/pubmed/1302471) showed that
normal body temperatures (in degrees Fahrenheit) for people aged 18-40 are
approximately Normally distributed with mean μ = 98.2 and standard deviation
σ = 0.73. Suppose we select a random person in this age bracket. Use Minitab to find
the probability that their body temperature will be between 98.2 and 98.6.
Open Minitab. Type 98.2 and 98.6 into the first two cells in column C1.
Select Calc, Probability Distributions, Normal. Choose Cumulative probability, and
enter the mean and standard deviation. Select Input column and enter C1, then C2 for
Optional storage. Click OK.
Subtract the values now shown in C2.
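For readers without Minitab, the same three steps (two cumulative probabilities, then a subtraction) can be sketched in Python using `statistics.NormalDist` from the standard library. This is an alternative tool, not the method the notes prescribe, and the variable names are illustrative.

```python
from statistics import NormalDist

# Body temperature model from Example 7: mean 98.2, standard deviation 0.73.
body_temp = NormalDist(mu=98.2, sigma=0.73)

# Cumulative probabilities P(x < 98.2) and P(x < 98.6), like the two values in C2.
below_low = body_temp.cdf(98.2)
below_high = body_temp.cdf(98.6)

# Subtracting leaves the area between the two temperatures.
between = below_high - below_low
print(round(below_low, 4), round(below_high, 4), round(between, 4))
```

Since 98.2 is the mean, the first cumulative probability is one half, which is a quick sanity check on the setup.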
The exact shape of the Normal distribution is determined by the mean μ (where it is
centered) and standard deviation σ (how wide and tall it is), and is given by the following
equation (which you do not need to memorize):
N(μ, σ): $p(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-(x-\mu)^2/(2\sigma^2)}$
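To see how μ and σ enter the equation, here is a short Python sketch that evaluates the density directly. The function name `normal_density` is an assumption. The last line cross-checks the hand-written formula against `statistics.NormalDist.pdf` from the standard library.

```python
import math
from statistics import NormalDist

def normal_density(x, mu, sigma):
    """Height of the N(mu, sigma) density curve at x, straight from the formula."""
    coefficient = 1 / (sigma * math.sqrt(2 * math.pi))
    exponent = -((x - mu) ** 2) / (2 * sigma ** 2)
    return coefficient * math.exp(exponent)

# The curve peaks at the mean; a smaller sigma gives a taller, narrower curve.
print(normal_density(0.5, mu=0.5, sigma=0.05))   # peak of N(0.5, 0.05)
print(normal_density(0.0, mu=0.0, sigma=1.0))    # peak of N(0, 1)

# Cross-check against the standard library's own density function.
print(math.isclose(normal_density(0.55, 0.5, 0.05), NormalDist(0.5, 0.05).pdf(0.55)))
```

Note that a density height is not a probability. Heights can exceed 1, as the first printed value shows; only areas under the curve are probabilities.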
The Normal distribution having mean μ = 0 and standard deviation σ = 1, N(0, 1), is called
the standard Normal distribution. We can convert any Normal distribution to the standard
Normal distribution by computing z-scores:
$z = \frac{x - \mu}{\sigma}$
The standard Normal distribution has a random variable which is typically denoted z (think
z-scores). Here is a graph of the standard Normal distribution:
[Figure: Distribution Plot, Normal, Mean=0, StDev=1. Vertical axis: Density (0.0 to 0.4); horizontal axis: z (-3 to 3)]
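The point of converting to z-scores is that areas are preserved: the area to the left of x under N(μ, σ) equals the area to the left of z under N(0, 1). The following Python sketch illustrates this with a hypothetical measurement (a proportion of 0.6 heads). The function name `z_score` is an assumption.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

# Hypothetical measurement: a proportion of 0.6 heads under the N(0.5, 0.05) model.
z = z_score(0.6, mu=0.5, sigma=0.05)
print(round(z, 2))

# The area to the left of x under N(0.5, 0.05) matches the area to the left of z
# under the standard Normal curve N(0, 1).
original_area = NormalDist(0.5, 0.05).cdf(0.6)
standard_area = NormalDist(0, 1).cdf(z)
print(round(original_area, 4), round(standard_area, 4))
```

This is why a single table for the standard Normal curve is enough to answer questions about any Normal distribution.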

Finding Probabilities for the Standard Normal Distribution:

Finding areas under the standard Normal curve requires calculus or technology. However,
we will use a table to find areas (probabilities) under this curve. Table 2 in Appendix A of the
text is one such table. The one that I will use is found on my website (Handouts page), and
is also located in my folder in the Q-drive. The first column gives you the first 2 digits of the
variable z, and the top row gives you the third digit (z is only accurate to the
hundredths place).
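To make the table layout concrete, the sketch below rebuilds one row of a cumulative z-table in Python. It assumes the table lists areas to the left of z rounded to four decimal places, which is a common layout but is an assumption about the handout. The function name `table_entry` is also illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def table_entry(z):
    """Area to the left of z under the standard Normal curve, to 4 decimals."""
    return round(standard_normal.cdf(round(z, 2)), 4)

# Rebuild the row for z = 1.0x: the first column holds "1.0" and the
# column headings .00 through .09 supply the hundredths digit.
row = [table_entry(1.0 + hundredths / 100) for hundredths in range(10)]
print(row)

# Looking up z = 1.04 means row "1.0", column ".04".
print(table_entry(1.04))
```

If your printed table shades a different region (for example, the area between 0 and z), its entries will differ from these by a constant, so check the picture at the top of the table before using it.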
Ex8: In Example 7 we were told that normal body temperatures (in degrees Fahrenheit)
for people aged 18-40 are approximately Normally distributed with mean μ = 98.2
and standard deviation σ = 0.73. Suppose we select a random person in this age
bracket.
a. What is the probability the person will have a body temperature less than
98.2°F? Sketch a Normal curve, label the horizontal axis with z-scores and
temperatures, and then shade the area representing this value.
b. Use the table to find the probability the person will have a body temperature less
than 98.6°F. Shade the area representing this value.
c. Find the probability the person will have a body temperature more than 98.6°F.
Shade the area representing this value.
d. What is the probability the person will have a body temperature between 98.2°F
and 98.6°F? Shade the area representing this value.
e. What is the probability the person will have a body temperature between 97.5°F
and 98.9°F? Shade the area representing this value. Is this number familiar?
f. What is the probability the person will have a body temperature below 96.7°F or
above 99.7°F? Shade the area representing this value.
In Chapter 3 we discussed the Empirical Rule, which can be used with any distribution that
is unimodal and symmetric. This certainly applies to the Normal distribution, on which the
rule is actually based.
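The Empirical Rule's familiar percentages (about 68%, 95%, and 99.7%) are rounded areas under the standard Normal curve. The Python sketch below recovers them. The function name `area_within` is an assumption.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def area_within(k):
    """Area under the standard Normal curve between z = -k and z = +k."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

# Areas within 1, 2, and 3 standard deviations of the mean.
for k in (1, 2, 3):
    print(k, round(area_within(k), 4))
```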

Finding Measurements from Percentiles (Inverse Normal):

Sometimes we may need to use the Normal density curve in reverse; that is, to find a
measurement (or z-score) corresponding to a given cumulative probability. When that
cumulative probability is written as a percent, the measurement is called a percentile.
Ex9: What z-score corresponds to the 85th percentile? In other words, what z-score
captures 85% of the area under the standard Normal curve (to its left)?
Show this area on the standard Normal curve.
Ex10: The quantitative portion of the 2016 SAT exam has a mean of 510 and a standard
deviation of 103. Assume SAT scores are Normally distributed.
a. Would you rather have a score at the 10th percentile or 90th percentile? Why?
b. Use the table to find the SAT score corresponding to the 90th percentile.
c. Use Minitab to find this score.
Open Minitab. Type 0.90 into the first cell in column C1.
Select Calc, Probability Distributions, Normal. Choose Inverse cumulative
probability, and enter the mean and standard deviation. Select Input column and
enter C1, then C2 for Optional storage. Click OK.
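The inverse cumulative probability step can also be sketched in Python with `statistics.NormalDist.inv_cdf`. This is an alternative to the table and Minitab, not the notes' own method. It shows two equivalent routes: looking up the z-score and then undoing the standardization with x = μ + zσ, or asking the N(510, 103) model directly.

```python
from statistics import NormalDist

# z-score with 90% of the standard Normal curve's area to its left.
z_90 = NormalDist(mu=0, sigma=1).inv_cdf(0.90)

# SAT model from Example 10: mean 510, standard deviation 103.
sat = NormalDist(mu=510, sigma=103)
score_90 = sat.inv_cdf(0.90)

# Both routes agree: undo the z-score with x = mu + z * sigma.
print(round(z_90, 2), round(score_90, 1), round(510 + z_90 * 103, 1))
```

A table only lists z to two decimal places, so an answer found by table may differ slightly from the software value.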

6.3: The Binomial Model

The Normal distribution is a good model for many situations involving a continuous
random variable. For experiments involving a discrete random variable, where the
outcome of the experiment involves a count, the binomial probability distribution is
often a better model.

The binomial distribution can be used when the experiment meets all of the following
conditions:
1. There is a fixed, predetermined number of identical trials n.
2. Only two outcomes are possible in each trial, which are success (S) and failure (F).
3. The probability of success, p, is the same in each trial.
4. The trials are independent (the outcome of one has no effect on the others).
5. The random variable x is the number of successes in n trials.
The shape of the binomial distribution is determined by the values of n and p. We will use
the notation b(n, p, x) to represent the value of the binomial distribution with n trials,
probability of success p, and x successes. The values are given by the formula:
$b(n, p, x) = \frac{n!}{x!\,(n-x)!} \, p^x (1-p)^{n-x}$ for x = 0, 1, 2, …, n
The distribution will be symmetric if p = 0.5, and skewed when p ≠ 0.5.
However, even if p ≠ 0.5 the distribution becomes approximately symmetric for large
values of n.
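The formula translates almost directly into code. The following Python sketch uses `math.comb` for the factorial ratio n!/(x!(n−x)!). The function name `binomial_probability` and the choice of n = 4 are illustrative assumptions.

```python
from math import comb

def binomial_probability(n, p, x):
    """b(n, p, x): probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Hypothetical example: n = 4 trials with p = 0.5. The values mirror each other.
symmetric = [binomial_probability(4, 0.5, x) for x in range(5)]
print(symmetric)

# With p = 0.2 the same n gives a skewed distribution.
skewed = [round(binomial_probability(4, 0.2, x), 4) for x in range(5)]
print(skewed)

# Including x = 0, the probabilities total 1.
print(sum(symmetric))
```

The list runs over x = 0 through n. Leaving out x = 0 (no successes) would make the probabilities total less than 1.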
Ex11: Which of the following define x as a binomial random variable?
a. I flip a coin 20 times, and x is the number of heads observed.
b. I flip a coin until a tails is flipped, and x is the number of heads observed.
Ex12: Which of the following define x as a binomial random variable?
a. I draw 10 cards at random from a deck of cards (without replacement), and x is
the number of face cards drawn.
b. I draw 13 cards at random from a deck of cards (with replacement), and x is the
number of hearts drawn.
NOTE: Replacement is an important issue with the binomial distribution, because in many
experiments it allows us to meet Condition 3. However, if the population size is at least 10
times bigger than the sample size, then the difference between replacement and
non-replacement is small enough to be practically insignificant. In Example 12a, the population
size (52 cards) is not large enough compared to the sample size (10 cards) for this to be
true, so replacement is necessary.
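To get a feel for how much replacement matters, the sketch below compares the probability of exactly 2 face cards in 10 draws computed two ways: with replacement (the binomial formula) and without replacement (direct counting, known as the hypergeometric model, which these notes do not cover). The helper names and the enlarged "deck" of 52,000 cards are illustrative assumptions.

```python
from math import comb

def with_replacement(n, successes_in_pop, pop_size, x):
    """Binomial probability: p stays fixed because each card is put back."""
    p = successes_in_pop / pop_size
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def without_replacement(n, successes_in_pop, pop_size, x):
    """Counting-based (hypergeometric) probability: cards are not put back."""
    failures_in_pop = pop_size - successes_in_pop
    return comb(successes_in_pop, x) * comb(failures_in_pop, n - x) / comb(pop_size, n)

# Example 12a setting: 10 cards from a 52-card deck with 12 face cards.
gap_small_population = abs(with_replacement(10, 12, 52, 2)
                           - without_replacement(10, 12, 52, 2))

# Hypothetical population 1,000 times larger, with the same share of face cards.
gap_large_population = abs(with_replacement(10, 12000, 52000, 2)
                           - without_replacement(10, 12000, 52000, 2))

print(round(gap_small_population, 4), round(gap_large_population, 6))
```

With only 52 cards the two answers differ noticeably, while with the much larger population the gap nearly vanishes, which is the situation the note describes.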

Finding Probabilities for the Binomial Distribution:

Although it is not terribly difficult to calculate b(n, p, x) directly from the formula, we will
mostly rely on tables or technology to determine values for the binomial distribution. Table
3 in Appendix A of the text is one such table. The one that I will use is found on my website
(Handouts page), and is also located in my folder in the Q-drive. In the first column, locate
the appropriate value of n and x. Then locate the correct column to the right based on the
value of p.
Ex13: Find the probability of observing 7 heads if a fair coin is flipped 10 times.
Ex14: Find the probability of observing at least 7 heads if a fair coin is flipped 10 times.
Ex15: A Pew Research study in 2015 found that about 60% of all adults in the U.S. own a
smartphone.
a. If 100 American adults were randomly selected, how many of them would you
expect to own a smartphone?
b. If 12 American adults are randomly selected, what is the probability that exactly
5 own a smartphone?
c. If 12 American adults are randomly selected, what is the probability that no
more than 10 own a smartphone?
Ex16: Use Minitab to determine the probability for part (b) of Example 15.
Open Minitab. Type 5 into the first cell in column C1.
Select Calc, Probability Distributions, Binomial. Choose Probability, and enter 12 for
the Number of Trials and 0.60 for Event probability. Select Input column and enter C1,
then C2 for Optional storage. Click OK.
Ex17: Use Minitab to determine the probability for part (c) of Example 15.
Open Minitab. Type 10 into the first cell in column C1.
Select Calc, Probability Distributions, Binomial. Choose Cumulative probability, and
enter 12 for the Number of Trials and 0.60 for Event probability. Select Input column
and enter C1, then C2 for Optional storage. Click OK.
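The difference between Minitab's Probability and Cumulative probability options can be sketched in Python: the first is a single value of the binomial formula, and the second adds up the values from 0 through x. This is an illustrative alternative to Minitab, and the function names are assumptions.

```python
from math import comb

def binomial_probability(n, p, x):
    """Probability of exactly x successes (Minitab's "Probability" option)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def binomial_cumulative(n, p, x):
    """Probability of x or fewer successes (Minitab's "Cumulative probability")."""
    return sum(binomial_probability(n, p, k) for k in range(x + 1))

# Settings from Examples 16 and 17: 12 trials, event probability 0.60.
exactly_5 = binomial_probability(12, 0.60, 5)
at_most_10 = binomial_cumulative(12, 0.60, 10)
print(round(exactly_5, 4), round(at_most_10, 4))
```

An "at least" question, such as Example 14, can be handled with the same cumulative function by subtracting from 1: P(x ≥ 7) = 1 − P(x ≤ 6).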
Ex18: In Example 15, the actual percentage was 64%. Use the formula to find the probability
that exactly 5 out of 12 randomly selected American adults own a smartphone.

Center and Spread for the Binomial Distribution:

The mean and standard deviation for a binomial distribution can be calculated using some
simple formulas:
Mean: $\mu = np$    Standard deviation: $\sigma = \sqrt{np(1-p)} = \sqrt{\mu(1-p)}$
The mean of any probability distribution is referred to as the expected value. Because
most outcomes will lie within one standard deviation of the mean, we often say we expect a
value of μ ± σ.
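As a quick illustration of these formulas, the Python sketch below applies them to 100 tosses of a fair coin, the setting of Examples 5 and 6. The function names are assumptions. Dividing the results by n converts counts of heads into proportions of heads.

```python
import math

def binomial_mean(n, p):
    """Expected number of successes in n trials."""
    return n * p

def binomial_sd(n, p):
    """Standard deviation of the number of successes in n trials."""
    return math.sqrt(n * p * (1 - p))

# 100 tosses of a fair coin, the setting of Examples 5 and 6.
mu = binomial_mean(100, 0.5)
sigma = binomial_sd(100, 0.5)
print(mu, sigma)                      # center and spread for the count of heads
print(mu - sigma, mu + sigma)         # the range mu - sigma to mu + sigma

# Dividing by n converts counts to proportions, matching the N(0.5, 0.05) model.
print(mu / 100, sigma / 100)
```

The last line connects the two models in this chapter: the binomial count of heads has mean 50 and standard deviation 5, which corresponds to a proportion with mean 0.5 and standard deviation 0.05.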
Ex19: The odds of winning a cash prize in Diamonds Scratchers, a California lottery game,
are 1 in 8.51. Suppose you buy 100 tickets (at $1 each). Consider success to mean that
a ticket is a winner (cash prize).
a. Does this experiment meet the conditions for the binomial distribution? Why?
b. Find the mean number of winning tickets.
c. Find the standard deviation for the number of winning tickets.
d. How many tickets out of 100 should you expect to be winners?