To answer this type of question: You can use the following probability distribution(s): Remarks / assumptions / restrictions

What is the probability of getting a particular result x times in a series of n consecutive trials*?
Binomial
-Each trial has only two possible outcomes
-The intrinsic probabilities p and q of each result are known
-Use the Binomial when the probability of each new result is unaffected by the results of previous trials (p and q do not vary: the population sampled is very large relative to n)
or Hypergeometric
-Each trial has only two possible outcomes
-The initial probabilities p and q of each result are known
-Use the Hypergeometric when the probability of each new result is influenced by the results of previous trials (p and q change: the population sampled is limited relative to n)

What is the probability of getting a particular result for the xth time on the nth consecutive trial?
Negative binomial
-The intrinsic probability p of that specific result is known

What is the probability that a certain event (or result) will occur x times in a certain interval t (of time or of space)?
Poisson
-The average rate of occurrence of that particular result (or event) is known and is assumed to be constant in time or space

What is the probability that a certain interval t of time (or space) will pass between two consecutive events (or results)?
Exponential
-The average rate of occurrence of that particular result (or event) is known and is assumed to be constant in time or space

* A 'trial' can mean a test (example: does this object have a certain quality?), a measurement, an observation, etc.

What is the probability that a trial will produce a particular result, if the result of this trial is subject to random (stochastic) variations (or errors) that are equally likely to be positive or negative?
Normal (Gaussian)
-Use the Normal (Gaussian) when the estimate of the mean result is based on a large number of trials (sample size is large)
or Student t
-Use the Student t when the estimate of the mean result is based on a limited number of trials (sample size is limited)
or Lognormal
-Use the Lognormal if it is the log of the result that is subject to random variations

Probability distribution(s): Typical examples of usage in the Earth and environmental sciences

Normal (Gaussian)
-Experimental (instrumental) measurements are often subject to random, independent errors. The probability distribution of measured quantities can therefore often be assumed to be Normal (Gaussian), with the mean of the distribution representing the most likely result.
-Many meteorological and hydrological phenomena are subject to random variations. For example, the air temperature on a certain day of the year can vary from year to year around a certain mean value. The probability distribution of such variations is very often Normal.

Lognormal
-This is often used to describe the probability distribution of physical quantities that are subject to cumulative (dependent) random variations. Very often with such quantities, the higher the value, the less frequent it is. A typical example is the particle size distribution of sediments such as colluvium (produced by mass wasting), in which there are relatively few large blocks and comparatively much more fine material. The relative concentration of analytes (ions, elements, etc.) in natural media such as soils or water is very often lognormally distributed.
-The Normal and Lognormal distributions are closely related: if the probability distribution of a variable x is Lognormal, then the probability distribution of the log of that variable (i.e., log(x)) is Normal.
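The relation between the Lognormal and the Normal can be checked numerically. The following is a minimal sketch using only the Python standard library; the parameter values MU and SIGMA, the sample count, and the seed are arbitrary assumptions chosen for illustration, not values from the text.

```python
import math
import random
import statistics

# Hypothetical parameters of the Normal distribution followed by ln(x).
MU, SIGMA = 1.0, 0.5

rng = random.Random(42)  # fixed seed so that the run is repeatable

# Draw Lognormal values x; by construction ln(x) follows Normal(MU, SIGMA).
samples = [rng.lognormvariate(MU, SIGMA) for _ in range(20_000)]
logs = [math.log(x) for x in samples]

# The logs should recover the Normal parameters.
log_mean = statistics.fmean(logs)
log_sd = statistics.stdev(logs)

# The raw values are skewed: a few large values pull the mean above the median.
raw_mean = statistics.fmean(samples)
raw_median = statistics.median(samples)

print(f"ln(x): mean = {log_mean:.3f}, standard deviation = {log_sd:.3f}")
print(f"x:     mean = {raw_mean:.3f}, median = {raw_median:.3f}")
```

The mean of the raw values exceeding their median reflects the "few large blocks, much more fine material" pattern described above, while the logged values are symmetric around MU.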
Binomial, Hypergeometric
-These probability distributions are used to estimate the probability of getting a specific result (example: finding a gold nugget) in a succession of n trials (example: sieving river sediment samples). They have obvious, important applications in geological exploration and mining, among other fields. The main difference between the two is that the Binomial is used when the probability of a certain result is the same for each separate trial, whereas the Hypergeometric is used when the probability of a certain result in a trial is influenced by the previous results.
-Whether successive trials are independent or not depends on the size of the sample examined relative to the population. If the sample size (n) is small relative to the total population size (N), the most appropriate probability distribution to use is the Binomial. If n is large relative to N, the Hypergeometric is the more appropriate distribution to use*.
-The Negative Binomial distribution is related to the Binomial, but it is used instead to estimate the probability that a certain result will be obtained for the xth time on the nth trial. This can be used, for example, to work out how many tests are needed to ensure a certain success rate of getting a certain result (an important thing to know if performing many tests is expensive!)

*For a good clarification, see: Wroughton & Cole (2013) Journal of Statistics Education 21 (1), pp. 1-16. URL: http://www.amstat.org/publications/jse/v21n1/wroughton.pdf.

Student t, Fisher F
-These two distributions are used in many statistical tests. The Student t probability distribution is often used when the means of groups of observations are compared. The Fisher F probability distribution is used when it is the variances of the groups of observations that are compared.
-In both cases, there is an assumption that the observations, taken from a limited sample, are a subset of a larger, Normally-distributed population.
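The Binomial, Hypergeometric and Negative Binomial probabilities discussed earlier can be computed directly from their standard formulas. The sketch below is illustrative only: the function names are hypothetical, and the numbers (10% of sediment pans containing a nugget, 10 pans examined, populations of 50 and 50,000 pans) are assumed example values, not data from the text.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(exactly x successes in n independent trials, each with probability p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def hypergeometric_pmf(x, n, K, N):
    """P(exactly x successes in n draws without replacement from a
    population of N items, K of which are successes)."""
    return comb(K, x) * comb(N - K, n - x) / comb(N, n)


def negative_binomial_pmf(x, n, p):
    """P(the x-th success occurs exactly on trial n)."""
    return comb(n - 1, x - 1) * p**x * (1 - p) ** (n - x)


# Assumed example: 10% of pans hold a nugget; we examine n = 10 pans
# and ask about x = 2 nuggets.
p, n, x = 0.10, 10, 2

binom = binomial_pmf(x, n, p)
# Small population: n is large relative to N, so p changes as pans are removed.
hyper_small = hypergeometric_pmf(x, n, K=5, N=50)
# Large population: n is small relative to N, so p barely changes.
hyper_large = hypergeometric_pmf(x, n, K=5_000, N=50_000)
# Probability that the 2nd nugget turns up exactly in the 10th pan.
negbin = negative_binomial_pmf(x, n, p)

print(f"Binomial:                   {binom:.5f}")
print(f"Hypergeometric (N = 50):    {hyper_small:.5f}")
print(f"Hypergeometric (N = 50000): {hyper_large:.5f}")
print(f"Negative binomial:          {negbin:.5f}")
```

With the large population the Hypergeometric result is nearly indistinguishable from the Binomial one, whereas with the small population the two differ noticeably, which is the n versus N criterion stated above.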