10 Equations in Biology Series
Seminar #11: Hidden Markov Models
Nov. 28, 2012

1. Basics of Probability

Probability measures the chance that a specific event will occur. By definition, an event's probability must lie between 0 (no chance of occurring) and 1 (100% chance of occurring). The probability of an event E is usually written as P(E). For example, if R represents the occurrence of rain on a particular day, then P(R) = 0.4 means a 40% chance of rain on that day. The alternative notations p(E), Pr(E), and Prob(E) are also common: all of these mean exactly the same as P(E).

A complex event may consist of a combination of several simpler events. Table 1 summarizes useful rules for relating the complex event's probability to that of the simpler events.

Table 1: Rules for working with probabilities

    Event      Probability      Assumptions
    NOT A      1 - P(A)         none
    A OR B     P(A) + P(B)      A and B are mutually exclusive
    A AND B    P(A) · P(B)      A and B are independent

For example, if there is a 40% chance of rain today and a 30% chance of rain tomorrow, and if today's weather has no effect on tomorrow's, then the probability of rain on both days is:

    P(R today AND R tomorrow) = P(R today) · P(R tomorrow) = 0.40 · 0.30 = 0.12.

2. Conditional Probability and Bayes' Theorem

Two events A and B are said to be independent if the outcome of one event has no effect on the probability of the other. For example, consider two flips of a fair coin. Let event F be "The 1st flip comes up heads" and S be "The 2nd flip comes up heads." Define P(S|F) as the conditional probability of event S given event F; in other words, the probability that the 2nd flip comes up heads once the first flip has already come up heads. Because each flip is independent of every other flip, P(S|F) = P(S) = 0.50. In general, two events A and B are independent if and only if P(A|B) = P(A) and P(B|A) = P(B).

Many important events are NOT independent of each other. For example, imagine choosing one student at random from a biology class (see Table 2). Let event T be "The chosen student is at least 6' tall" and W be "The chosen student is a woman."

Table 2: Height distribution in a biology class

               < 6'    ≥ 6'    Total
    # Women     110      10      120
    # Men        60      20       80
    Total       170      30      200

Because women are shorter than men on average, P(T|W) < P(T). The probability that a randomly chosen student is a woman who is at least 6' tall is then given by:

    P(W AND T) = P(W) · P(T|W) = (120/200) · (10/120) = 0.05.

Moreover, two events A and B can be considered in either order, so P(A AND B) = P(B AND A). Therefore,

    P(A) · P(B|A) = P(B) · P(A|B),

which can be rewritten as

    P(A|B) = P(A) · P(B|A) / P(B).    (Bayes' Theorem)

For example, if a particular student in the biology class is known to be at least 6' tall, the likelihood of that student's being a woman is given by:

    P(W|T) = P(W) · P(T|W) / P(T) = (120/200) · (10/120) / (30/200) = 0.33.
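To see the Table 1 rules in action, here is a minimal Python sketch of the rain example above. The 40% and 30% chances come from the text; the function names (p_not, p_or, p_and) are purely illustrative, not part of any standard library.

```python
# A minimal sketch of the Table 1 rules, using the rain example above.

def p_not(p_a):
    """NOT A: requires nothing beyond knowing P(A)."""
    return 1 - p_a

def p_or(p_a, p_b):
    """A OR B, assuming A and B are mutually exclusive."""
    return p_a + p_b

def p_and(p_a, p_b):
    """A AND B, assuming A and B are independent."""
    return p_a * p_b

p_rain_today = 0.40
p_rain_tomorrow = 0.30

# Rain on both days (independence assumed, as in the text):
print(p_and(p_rain_today, p_rain_tomorrow))   # 0.12

# No rain today:
print(p_not(p_rain_today))                    # 0.60
```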
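The conditional-probability and Bayes' Theorem calculations can be checked the same way. The sketch below uses the counts from Table 2; the variable names are again purely illustrative.

```python
# A minimal sketch of the conditional-probability calculations above,
# using the counts from Table 2.

total = 200
women, women_tall = 120, 10
tall = 30

p_w = women / total               # P(W)   = 120/200
p_t = tall / total                # P(T)   =  30/200
p_t_given_w = women_tall / women  # P(T|W) =  10/120

# P(W AND T) = P(W) * P(T|W)
p_w_and_t = p_w * p_t_given_w
print(round(p_w_and_t, 2))        # 0.05

# Bayes' Theorem: P(W|T) = P(W) * P(T|W) / P(T)
p_w_given_t = p_w * p_t_given_w / p_t
print(round(p_w_given_t, 2))      # 0.33
```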
3. Probability vs. Likelihood

The words probability and likelihood may seem synonymous, but there is actually a key distinction between them: probability refers to potential outcomes of a future event, whereas likelihood applies to a present state. For example, there is a 50% probability that flipping a fair coin will yield heads. Once the coin has been flipped and the result noted, the flip would be described as having had a 50% likelihood of yielding that result.

Similarly, the previous page gave the example of randomly choosing a student from a biology class. Once a student has been chosen, the experiment no longer involves any degree of randomness: the student is either male or female, and either short or tall. Therefore, as soon as any definite information is available about the particular student, any subsequent inferences about that student are statements about likelihood.

Hidden Markov Models are typically applied to cases in which researchers are attempting to infer the specific process that produced a given set of observations (e.g., the position of introns and exons within a DNA sequence). Each possible process usually has a known probability of yielding a particular observation. The researchers then use these probabilities to calculate the likelihood that a specific set of observations (e.g., a known DNA sequence) resulted from a particular process.

Logarithms

Many likelihood models, including HMMs, employ logarithms as a convenient way of comparing very small numbers. When combining large numbers of probabilities via the AND rule (previous page), the additive property of logarithms is also very useful:

    log(ab) = log(a) + log(b).

Logarithms are the inverse function of exponentiation: if 10^a = b, then log_10(b) = a. Although biologists may be most familiar with base-10 logarithms, any base can be used; in mathematics, the most common base is e (2.718…). Logarithms can be converted from one base to another using the formula

    log_b(x) = log_a(x) / log_a(b).

Unfortunately, authors do not always specify which logarithmic base they are using. When in doubt, it is often best to redo an author's calculations using several different bases to determine which one he or she is using.
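As a quick check of these logarithm rules, the following sketch verifies the additive property and the base-conversion formula with Python's math module. The numbers are arbitrary small probabilities chosen only for illustration.

```python
import math

# Additive property: log(ab) = log(a) + log(b)
a, b = 1e-30, 2e-25
print(math.log10(a * b))               # about -54.7
print(math.log10(a) + math.log10(b))   # same value

# Base conversion: log_b(x) = log_a(x) / log_a(b)
x = 1000.0
print(math.log(x) / math.log(10))      # natural logs converted to base 10: 3.0
print(math.log10(x))                   # 3.0, for comparison
```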
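Finally, to connect the likelihood and logarithm ideas, the sketch below computes, in log space, how likely a short DNA sequence is under each of two hypothetical base-composition models (an AT-rich one and a GC-rich one). The emission probabilities and the sequence are invented purely for illustration, and the sketch ignores the hidden states and transition probabilities that a full HMM would include; it shows only the log-likelihood comparison described above.

```python
import math

# Hypothetical per-base probabilities for two candidate processes.
# These numbers are invented for illustration only.
at_rich = {'A': 0.35, 'T': 0.35, 'G': 0.15, 'C': 0.15}
gc_rich = {'A': 0.15, 'T': 0.15, 'G': 0.35, 'C': 0.35}

def log_likelihood(sequence, model):
    """log P(sequence | model) = sum of log P(base | model).
    Applying the AND rule in log space turns a product of many small
    probabilities into a sum, as described in the Logarithms section."""
    return sum(math.log(model[base]) for base in sequence)

seq = "ATATATGCAT"

ll_at = log_likelihood(seq, at_rich)
ll_gc = log_likelihood(seq, gc_rich)
print(ll_at, ll_gc)

# The model with the larger (less negative) log-likelihood is the one more
# likely to have produced the observed sequence.
print("AT-rich" if ll_at > ll_gc else "GC-rich")   # AT-rich for this sequence
```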