Making Sense of Randomness: Probability Distributions

Many things that we observe in nature or in human affairs seem to be random. How can we understand things that occur randomly? We look for patterns in how frequently we observe certain values (or values within a specific interval). This leads to a probability distribution.

Probability Distributions

Various kinds of random events follow different probability distributions. For example, the time between eruptions of Old Faithful geyser in Yellowstone National Park varies from just a few minutes to a couple of hours. It is the subject of continuous monitoring by the National Park Service. For the year 2011, the times between eruptions were grouped into 12-second intervals, and the number of times falling in each interval was counted and then converted to a percentage. Here is a histogram of the percentages:

[Figure: histogram of the percentages.]

Data collected by Ralph Taylor.

Other Probability Distributions

Other kinds of random events follow different probability distributions. Some probability distributions are similar to each other, and we may even give them names, for example the binomial distribution, as we discussed last time. We flip a coin 10 times and count how many heads we get, somewhere between 0 and 10.

The binomial distribution is actually a family of distributions that depend on the number of “trials” and on the probability of a “success”. If, instead of 10 trials, we flip the coin 20 times, we would get a similar pattern of numbers between 0 and 20. The probability of a success on each trial is still 1/2.

The Binomial Probability Distributions

What if we roll a die several times and count how many times we get a 3? The probability of a success on each trial is now 1/6.
If we roll the die 10 times and count the 3’s, then roll it 10 times again, and repeat this over and over, each time counting the number of 3’s, we might get a histogram that looks like this:

[Figure: histogram. Title: Number of Times a "Three" Appeared in 10 Trials Repeated 1,000 Times. x-axis: number of times a "three" appeared in ten trials (0 to 6). y-axis: frequency (0 to 300).]

The experiment was run 1,000 times.

We see that in about 175 of the 1,000 repetitions, none of the 10 rolls came up “three”. We also see that a “three” never appeared more than 6 times in any set of 10 trials.

According to the binomial probability function, the probability of getting a “three” 7 times out of 10 trials is

\[ \binom{10}{7} \left(\frac{1}{6}\right)^{7} \left(\frac{5}{6}\right)^{3} \]

This is approximately 0.00024807. (Check it out in Matlab. Use binopdf(7,10,1/6); see lecture notes of March 24.)

Other Probability Distributions

Other kinds of random events follow “symmetric” probability distributions that are concentrated near the center and fall off toward the ends. For example, if we have a hundred or so female hamsters (or some other small animal) of about the same age, and we weigh each one and count how many are less than 18 grams, how many are between 18 and 20 grams, and so on, we might get counts that look something like this:

[Figure: histogram. Title: Histogram of Weights. x-axis: weights (18 to 26). y-axis: frequency (0 to 20).]

Normal Probability Distribution

The frequencies of a normal probability distribution have the general shape of a “bell curve”.

[Figure: Bell-Shaped Curve. y-axis: frequency (0.0 to 0.4).]

Lots of random things follow a frequency distribution that is similar to a normal probability distribution. We study random processes in nature by choosing a probability distribution as a model, and then we study that probability model. There are various parameters that describe a particular probability distribution.
For example, normal probability distributions have two properties that distinguish one from another:
• mean: the “average value”
• variance: how spread out the values are

Theoretical Probability Distributions and Data

We make mathematical models of probability distributions. These probability models describe idealized frequencies of “populations”. These probability models have parameters that distinguish one population from another.

Parameters

The meaning of the parameters depends on the type of probability distribution. As I mentioned, the normal probability distribution has two parameters that distinguish one from another, the mean and the variance. A uniform distribution also has two parameters, but these are the minimum and maximum limits of the distribution. Other distributions may have a parameter that tells how skewed they are.

Studying Random Events and the Probability Distributions that Govern Them

In science, we often study random phenomena by use of samples of data. The science of statistics provides us with tools for estimating parameters of distributions. To estimate the mean of a population, for example, we may just use the mean of a sample of data. This is pretty obvious, but there are many more interesting questions that arise when we use a sample to make inferences about a population. Statisticians address these questions.

The Variance and the Standard Deviation

The variance is the average of the squared differences between each point and the mean value. The square root of the variance is called the standard deviation.

[Figure: Normal Probability Distributions. Four curves: M=0, V=1; M=5, V=1; M=0, V=9; M=5, V=9. x-axis from −10 to 15. y-axis: frequency (0.0 to 0.4).]

Generating Random Numbers

There is a wealth of statistical theory that guides the scientist in collecting data and in using that data for making inferences about the probability distribution of a full population. In this course, we will consider some simple aspects of this process, but with a twist.
Instead of collecting “real” data, we will simulate random data on the computer. How do we do this?

The computer performs computations (almost) exactly, and deterministically. That is, it computes the same thing every time. There are, however, ways of generating “random numbers” on the computer. Because they are not really random, we call them “pseudorandom” numbers. These methods have been studied and evaluated by mathematicians and statisticians for decades. We will not go into the details of how this is done, but we want to learn to use some of them in Matlab. By default, Matlab produces the same sequence each time it starts; to get different numbers each time, the generator can be seeded from the system clock.

Generating Random Numbers in Matlab

There are four simple functions in Matlab to generate random numbers:
• rand - “pseudorandom” uniform values between 0 and 1
• randi - “pseudorandom” integers
• randn - “pseudorandom” normal values
• randperm - “pseudorandom” permutations

There is another, more general one called random, which can generate random numbers from several specific probability distributions. It is in the Statistics and Machine Learning Toolbox.

X = rand(n) returns an n-by-n matrix of pseudorandom uniform values.
X = rand(n,1) returns a column vector with n elements.
X = rand(1,n) returns a row vector with n elements.
X = randn(n) returns an n-by-n matrix of pseudorandom normal values.
X = randn(n,1) returns a column vector with n elements.
X = randn(1,n) returns a row vector with n elements.
X = randi([imin imax],m,n) returns an m-by-n matrix of pseudorandom integers from imin to imax inclusive.
p = randperm(n) returns a row vector containing a random permutation of the integers from 1 to n inclusive.
p = randperm(n,k) returns a row vector containing k unique integers selected randomly from 1 to n inclusive.

There is also a Matlab function that controls how the random number generators work: rng

rng(seed)                  % seed the generator with a nonnegative integer
rng('shuffle')             % seed the generator from the current time
rng(seed, generator)       % also choose the generator type
rng('shuffle', generator)
rng('default')             % restore the settings Matlab uses at startup
scurr = rng                % return the current generator settings
rng(s)                     % restore settings previously saved in s
sprev = rng(...)           % return the settings in effect before the change
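The functions listed above can be tried together in a few lines. What follows is only a sketch, not part of the lecture: the seed 42 and all variable names are arbitrary choices made for illustration. It shows that reseeding reproduces the same "random" values, and it reruns the die experiment from earlier (10 rolls, repeated 1,000 times, counting the threes) with randi, using base Matlab only.

```matlab
% Sketch only: base Matlab, no toolboxes. The seed 42 and all variable
% names are arbitrary choices for illustration, not from the lecture.

rng(42);                      % seed the generator
u1 = rand(1,5);               % row vector of 5 uniform values in (0,1)
rng(42);                      % reseed with the same value ...
u2 = rand(1,5);               % ... and the same 5 values come back
sameValues = isequal(u1, u2); % true: the stream is deterministic

z = randn(1000,1);            % 1000 standard normal values (M=0, V=1)
x = 5 + 3*z;                  % rescaled to mean 5, variance 9 (std. dev. 3)
p = randperm(10);             % the integers 1..10 in random order

% Die experiment: 1,000 repetitions of 10 rolls, counting the threes.
rolls  = randi([1 6], 1000, 10);            % each row is one set of 10 rolls
threes = sum(rolls == 3, 2);                % number of threes in each row
counts = accumarray(threes + 1, 1, [11 1]); % counts(k+1) = rows with k threes

disp([(0:10)' counts]);       % table: number of threes, frequency
fprintf('sample mean of x = %.2f, sample variance = %.2f\n', mean(x), var(x));
```

Replacing rng(42) with rng('shuffle') should give a different table on each run. In either case the first frequency should land somewhere near 160, since the chance of no threes in 10 rolls is (5/6)^10, roughly 0.16.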
Simulation

In applied mathematics, science, and engineering, we often study random processes by simulating the random events. For example, we might simulate a population of some type of wild animal by beginning with a population of some given size, and then assuming that some random proportion of the population dies each year and that there is some random number of births proportional to the number of females in the population each year.

Obviously, such a simple simulation model would not be very good. (In the context of a simulation model, “very good” means corresponding closely to reality.)

Simulation Models

We might add components to the model that help to account for the random differences in weather and/or food supply. We might add components to the model that represent populations of some other types of wild animals that live in the same area (predators or prey). ... and so on and on ...

Monte Carlo Simulation

“Monte Carlo” methods are simple techniques that use simulation to answer difficult questions. For example, what’s the area under the curve shown here:

[Figure: a curve plotted over x from 0 to 4 and y from 0.0 to 0.4.]

We can easily figure out the area of the rectangle that bounds the plot (x from 0 to 4, y from 0 to 0.4): 4.0 × 0.4, or 1.6 square units. Suppose we could simulate a bunch of random points that are uniformly distributed over the rectangle. The ratio of the area under the curve to the area of the rectangle should be about the same as the fraction of the random points that fall under the curve.
[Figure: Monte Carlo Simulation. Scatter plot of the random points over the rectangle, plotted with + and * markers; x from 0 to 4, y from 0.0 to 0.4.]

I generated a total of 200 points, and 66 of them were below the curve. We therefore estimate the area as (4.0 × 0.4) × 66/200 = 0.528 square units.
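The procedure just described can be sketched in Matlab as follows. The slides do not say which curve is plotted, so this sketch assumes, purely for illustration, that it is the standard normal density (whose peak is just under 0.4, so it fits inside the rectangle). The function handle f, the seed, and the variable names are likewise assumptions, not taken from the lecture.

```matlab
% Monte Carlo area estimate (sketch). ASSUMPTION: the curve is the standard
% normal density; the original does not say which curve is plotted.
f = @(x) exp(-x.^2/2) / sqrt(2*pi);   % hypothetical curve, peak just under 0.4

rng(1);                    % arbitrary seed, only so the run can be repeated
n = 200;                   % number of random points, as in the text
xmax = 4.0;  ymax = 0.4;   % the rectangle is [0,4] by [0,0.4]

x = xmax * rand(n,1);      % uniform x-coordinates in (0,4)
y = ymax * rand(n,1);      % uniform y-coordinates in (0,0.4)

below    = y < f(x);       % logical vector: true for points under the curve
rectArea = xmax * ymax;    % 1.6 square units
areaEst  = rectArea * sum(below) / n;   % rectangle area times hit fraction

fprintf('%d of %d points below the curve; estimated area %.3f\n', ...
        sum(below), n, areaEst);
```

Under the assumed curve the exact area over 0 to 4 is very close to 0.5, so estimates from 200 points should scatter around that value. Increasing n should tighten the scatter, at the cost of more random numbers.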