• EDSA: Lecture 2
• Further info on probability distributions
• We typically use probability distributions because they work (i.e., they fit lots of data in the real world)
• Remember: discrete vs. continuous random variables
• Bernoulli random variables
– A variable with only two possible outcomes
– Success (S) or Failure (F); e.g., dead (D) or alive (A)
• Examples
– Heads or tails
– Newborn baby sex (male or female)…ok, not a good example
– Live or die (although funny story about this…)
• Let's define the binomial random variable X to be the # of successes in n independent Bernoulli trials (i.e., replicates)
– Let p = probability of success
– Let q = 1 − p = probability of failure
– If n = 1, then the binomial random variable X reduces to a single Bernoulli trial
• What is the probability of obtaining x successes in n trials?
• Example
– What is the probability of obtaining 2 tails from a coin that was tossed 5 times?
– P(TTHHH) = (1/2)^5 = 1/32
• But there are more possibilities:
TTHHH THTHH THHTH THHHT HTTHH HTHTH HTHHT HHTTH HHTHT HHHTT
P(2 tails) = 10 × 1/32 = 10/32
• In general, if trials result in a series of successes and failures, e.g.
FFSFFFFSFSFSSFFFFFSF…
then the probability of x successes in that particular order is P(x) = p^x q^(n−x)
• However, if order is not important, then
P(x) = [n!/(x!(n−x)!)] p^x q^(n−x)
where n!/(x!(n−x)!) is the number of ways to obtain x successes in n trials, and i! = i (i − 1)(i − 2) … 2 · 1
• Where X ~ Bin(n, p):
– P(X = x) = [n!/(x!(n−x)!)] p^x (1−p)^(n−x)
– where P(X = x) is the probability of exactly x successes, n is the # of trials, and p is the probability of success in any one trial
• The probability of obtaining x successes and (n−x) failures in a given order is p^x (1−p)^(n−x)
• n!/(x!(n−x)!) is the number of ways to obtain x successes in n trials; it controls for double counting, e.g., (1,0) = (0,1)
• Example
– We shall use the tails example
– We will generate X ~ Bin(11, 0.5)
• Now let's play in Excel
– Vary the sample size and n
– What happens to the distribution when p ≠ 0.5?
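The coin example above can also be checked numerically. This is a minimal Python sketch (standard library only, offered as an alternative to the Excel exercise); the function name `binomial_pmf` is ours, not from the slides.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p): C(n, x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Probability of exactly 2 tails in 5 fair coin tosses:
# one particular ordering (e.g., TTHHH) has probability (1/2)^5 = 1/32,
# and there are C(5, 2) = 10 such orderings.
print(comb(5, 2))                # 10
print(binomial_pmf(2, 5, 0.5))   # 10/32 = 0.3125
```

This reproduces the slide's count of orderings and the final probability 10/32.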
– (8, 11) (1, 11) (9, 11) (3, 11) (5, 11)
• We have now purposefully skewed the data
• Skew: describes distribution asymmetry
• Bimodal: two peaks
• Leptokurtic: more observations (scores) in the tails than predicted
• Platykurtic: fewer observations (scores) in the tails than predicted
• Biology deals with tail probabilities
• What is the probability of obtaining exactly 9 tails out of 19 coin flips?
– Answer ≈ 17.6%
– All P-values for statistical tests are tail probabilities
• Other types of distribution
• Poisson: # of instances of an event recorded in a sample of fixed intervals or areas
– Usually the distribution of rare events or counts
• The Poisson distribution is applied where random events in space or time are expected to occur
• Deviation from a Poisson distribution may indicate some degree of non-randomness in the events under study
– Investigation of the cause may be of interest
• X ~ Poisson(λ):
P(x) = (λ^x e^(−λ))/x!
– where λ is the average # of occurrences of the event in each sample
– e is a constant, the base of the natural logarithm (≈2.71828)
• If the average # of seedlings found in a quadrat is 0.75, what are the chances a quadrat will have 4 seedlings?
– P(4 seedlings) = (0.75^4 × e^(−0.75))/4! ≈ 0.0062
• Poisson distributions
• Example: emission of α-particles
• Observed vs. expected Poisson
• An unweighted arithmetic mean can be misleading
– e.g., a binomial variable that can take on values of 0 and 50 with probabilities of 0.1 and 0.9, respectively
• Arithmetic mean = (0 + 50)/2 = 25
• But the most probable value is 50
• The arithmetic mean is not useful for skewed distributions
• The expected value of a discrete random variable X that can take on values a1…an with probabilities p1…pn, respectively, accurately describes the average or central tendency of the distribution:
E(X) = ∑ ai pi = a1p1 + a2p2 + … + anpn
• Limits of central tendencies
• Averages, or central tendencies, give no insight into spread, or variation. Need both.
– e.g.
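The tail probability, the Poisson seedling example, and the expected-value formula above can all be verified with a short Python sketch (standard library only; the helper `poisson_pmf` is ours, not from the slides):

```python
from math import exp, factorial, comb

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Poisson(lam): lam^x * e^(-lam) / x!."""
    return lam**x * exp(-lam) / factorial(x)

# Seedling example: lambda = 0.75, probability of 4 seedlings in a quadrat
print(round(poisson_pmf(4, 0.75), 4))        # 0.0062

# Binomial tail example: probability of exactly 9 tails in 19 fair flips
print(round(comb(19, 9) * 0.5**19, 3))       # 0.176

# Expected value of the skewed 0/50 variable: E(X) = 0*0.1 + 50*0.9
values, probs = [0, 50], [0.1, 0.9]
print(sum(a * p for a, p in zip(values, probs)))  # 45.0
```

Note how E(X) = 45 sits near the most probable value (50), unlike the unweighted arithmetic mean of 25.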
A binomial variable that can take on −10 or +10 vs. one that can take on −1000 or +1000, each with p = 0.5
• Both have E(X) = 0, but neither distribution will ever generate 0
• The observed values in the first case are closer to E(X) than in the second
• Variance of a random variable:
– σ²(X) = E[(X − E(X))²] = ∑ pi (ai − ∑ aj pj)²
– E(X) is the expected value of X; the ai's are the different possible values of X, each of which occurs with probability pi
• Interpretation
– We calculate E(X), subtract it from X, and square the difference
– Because there are many possible values of X, we repeat this process for each value
– Each squared deviate is weighted by its probability of occurrence, pi, and then they are all summed
• Binomial vs. Poisson
– The binomial depends on both n and p, whereas the Poisson depends only on λ
– The binomial is always bounded by 0 and n, whereas the right-hand tail of the Poisson is not bounded
– E(X) = σ²(X) for the Poisson, whereas E(X) typically does not equal σ²(X) for a binomial
• Normal distribution: length of fish
• A sample of rock cod in Monterey Bay suggests that the mean length of these fish is μ = 40 in. and σ² = 4 in.²
• Assume that the length of rock cod is a normal random variable
• If we catch one of these fish in Monterey Bay,
– What is the probability that it will be at least 41 in. long?
– That it will be no more than 42 in. long?
– That its length will be between 36 and 39 inches?
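The variance definition and the rock-cod questions above can be worked through in a Python sketch (standard library only; `statistics.NormalDist` requires Python 3.8+, and `var_discrete` is our own helper, not from the slides):

```python
from statistics import NormalDist

def var_discrete(values, probs):
    """sigma^2(X) = sum_i p_i * (a_i - E(X))^2 for a discrete random variable."""
    ex = sum(a * p for a, p in zip(values, probs))
    return sum(p * (a - ex)**2 for a, p in zip(values, probs))

# The +/-10 vs. +/-1000 comparison: both have E(X) = 0, very different spread
print(var_discrete([-10, 10], [0.5, 0.5]))       # 100.0
print(var_discrete([-1000, 1000], [0.5, 0.5]))   # 1000000.0

# Rock cod: mean 40 in., variance 4 in.^2, so sigma = 2 in.
fish = NormalDist(mu=40, sigma=2)
print(round(1 - fish.cdf(41), 4))             # P(X >= 41) ~ 0.3085
print(round(fish.cdf(42), 4))                 # P(X <= 42) ~ 0.8413
print(round(fish.cdf(39) - fish.cdf(36), 4))  # P(36 <= X <= 39) ~ 0.2858
```

Standardizing by hand gives the same answers: e.g., z = (41 − 40)/2 = 0.5, and 1 − Φ(0.5) ≈ 0.3085.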