An extremely brief and specious history of Bayesian thinking

Part 1: Bayes to 1900
Michael Friendly
SCS Study Group
Oct. 8, 2015
Why consider Bayes?
• In past seminars, we’ve read and discussed a number of Bayesian books (e.g., Gill; Gelman et al.)
• I could understand the general idea, posterior ∝ prior × likelihood
• But Bayesian methods always seemed so daunting. What was the payoff?
• McGrayne writes as if Bayes was the Theory of Everything, so maybe worth another look.
• Where did it start? How did it develop? Why should I be interested?
Bayes’ Q: probabilities of causes
• We know how to solve Pr(effect | cause)
ƒ $\Pr(x \mid \theta)$ for observation of $x$
ƒ E.g., $\Pr(HHHH \mid \theta = \tfrac{1}{2}) = \left(\tfrac{1}{2}\right)^4 = \tfrac{1}{16}$
• How to solve for Pr(cause | effect)?
ƒ Called “inverse probability”, $\Pr(\theta \mid x)$
ƒ E.g., Toss 4 coins and get HHHH. What is the probability that the coins are biased, e.g., $\theta = 0.4$?
• $P(x \mid \theta)$ = likelihood; $P(\theta \mid x)$ = posterior
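As a worked illustration of the four-coin question (HHHH), suppose, purely as an assumption for this example, that only two values are entertained: $\theta = 0.5$ (fair) and $\theta = 0.4$ (biased), each with prior probability 1/2. Here $\theta$ is the probability of heads. Under those assumptions the inverse probability works out as:

```latex
\begin{align*}
\Pr(HHHH \mid \theta = 0.5) &= 0.5^4 = 0.0625 \\
\Pr(HHHH \mid \theta = 0.4) &= 0.4^4 = 0.0256 \\
\Pr(\theta = 0.4 \mid HHHH)
  &= \frac{0.0256 \times 0.5}{0.0256 \times 0.5 + 0.0625 \times 0.5}
   = \frac{0.0256}{0.0881} \approx 0.29
\end{align*}
```

Four heads in a row lowers the belief in a tails-leaning coin from 0.5 to about 0.29. A different prior would give a different answer.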
Bayes’ balls
Bayes imagined an experiment:
• A blue cue ball is tossed on a billiard table, unseen by him.
• How to estimate the position (0-1)?
• First guess: $\theta \sim U[0,1]$ (prior)
• Colleague throws a red ball, reports whether it is Left or Right of the cue
• If to the right, Bayes realizes the cue ball is more likely toward the left side of the table.
• Update belief: $\theta < 0.5$
[Figure: the table as a line from 0 to 1, with the cue ball at position p]
Bayes’ balls: animation
Toss more balls:
• As more and more balls were thrown, each new piece of information made his imaginary cue ball wobble back and forth within a more limited area.
• Each new red ball gave more info on $\theta$
• Could use this to narrow the range of possible values for $\theta$
• Basic idea:
Initial Belief + New Data -> Improved Belief.
• Now say:
Prior + likelihood -> Posterior
[Figure: animation of the table, position axis from 0 to 1]
Bayes’ balls in modern terms
• N Observations X (“L”, “R”) have conditional distribution $P(X \mid \theta) = \mathrm{Bin}(N, \theta)$, where unknown $\theta$ is in $[0,1]$
• Want to calculate $P(\theta \mid X)$
• Note that $P(X, \theta) = P(X \mid \theta)\,P(\theta) = P(\theta \mid X)\,P(X)$
• Solve: $P(\theta \mid X) = \frac{P(X, \theta)}{P(X)} = \frac{P(X \mid \theta)\,P(\theta)}{P(X)}$
• Bingo: Bayes rule!
ƒ $P(X \mid \theta)$ is easy: $P(X \mid \theta) = \mathrm{Bin}(N, \theta)$ (likelihood)
ƒ $P(\theta)$ is uniform density, $U[0,1]$ (prior)
ƒ $P(X)$: normalizing constant $= \int P(X \mid \theta)\,P(\theta)\,d\theta$
ƒ → $P(\theta \mid X)$ = posterior density of $\theta \sim \mathrm{Beta}(X+1,\ N-X+1)$
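A small simulation, sketched here in Python, may make the updating concrete. It assumes X counts the red balls that land to the left of the cue ball, so each toss lands left with probability θ (the cue position), and it summarizes the Beta(X+1, N−X+1) posterior by its mean and standard deviation. The function names, the seed, and the ball counts are illustrative choices, not part of the slides.

```python
import random
from math import sqrt


def toss_red_balls(cue_position, n_balls, rng):
    """Count red balls landing to the left of the (hidden) cue ball."""
    return sum(rng.random() < cue_position for _ in range(n_balls))


def beta_posterior_summary(x, n):
    """Mean and standard deviation of Beta(x + 1, n - x + 1).

    This is the posterior for theta after x 'left' results in n tosses,
    starting from a uniform U[0, 1] prior.
    """
    a, b = x + 1, n - x + 1
    mean = a / (a + b)
    sd = sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))
    return mean, sd


rng = random.Random(2015)      # fixed seed so the run is repeatable
cue_position = rng.random()    # the unknown theta, hidden from "Bayes"

summaries = []
for n_balls in (1, 10, 100, 1000):
    # Each row is a fresh batch of tosses, not a running total.
    x = toss_red_balls(cue_position, n_balls, rng)
    mean, sd = beta_posterior_summary(x, n_balls)
    summaries.append((n_balls, mean, sd))
    print(f"N={n_balls:4d}  X={x:4d}  posterior mean={mean:.3f}  sd={sd:.3f}")

print(f"true cue position: {cue_position:.3f}")
```

With no data at all (N = 0) the summary is that of the uniform prior, with mean 0.5. As N grows, the posterior standard deviation typically shrinks, which is the "more limited area" in which the imaginary cue ball wobbles.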
Who discovered Bayes Theorem?
• Bayes died in 1761, left his manuscript
unpublished
• His friend Richard Price discovered the manuscript, edited it, and submitted it to the Royal Society; it was published in the Philosophical Transactions in 1763 as “An Essay towards solving a Problem in the Doctrine of Chances”
• Virtually no one seemed to read or notice it
until years later.
• Stephen Stigler (1983) has another answer: The
same idea was first stated by Nicholas
Saunderson, a Cambridge mathematician ~
1749
Stigler, S. (1983). Who discovered Bayes’s Theorem? The American Statistician, 37(4), 290-296.
Criticisms to last 200 years
• There were two enduring criticisms:
ƒ Mathematicians were horrified to see something
as whimsical as a guess play any role in rigorous
mathematics.
ƒ Bayes said that if he didn't know what guess to
make, he'd just assign all possibilities
equal probability to start.
ƒ For most mathematicians, this problem of
priors was insurmountable, and would remain so.
Enter: Laplace (1749-1825)
Pierre-Simon Laplace
• Big problems:
ƒ “Shape” of the earth
ƒ Orbits of planets
ƒ Distance of earth from sun
ƒ Making Newton precise!
• Messy data:
ƒ Lots of data: 1000s of years
ƒ Unknown errors, uncertainty
• How to resolve uncertainty:
ƒ Reads de Moivre: The Doctrine of Chances
ƒ Maybe Probability is the answer?
ƒ Re-discovers Bayes
Laplace: Memoir
• 1774: Memoir on the Probability of Causes of Events
ƒ If an event can be produced by a number n of different causes, the probabilities of these causes given the event are to each other as the probabilities of the event given the causes
ƒ The probability of each of these is equal to the probability of the event given the cause, divided by the sum of all the probabilities of the event given each of these causes.
ƒ $P(C_i \mid E) = \frac{P(E \mid C_i)}{P(E \mid C_1) + \cdots + P(E \mid C_n)}$
(The general formula was only written in 1810-1814)
ƒ Gives a way to evaluate the relative strength of “causes” (hypotheses), given some data
[Diagram: causes C1, C2, …, Cn linked to events E1, E2, …, Ek, with links labeled Pr(E1|C1), …, Pr(Ek|C1)]
Now, we observe E2
[Diagram: event E2 linked back to causes C1, C2, …, Cn, with links labeled Pr(C1|E2), …, Pr(Cn|E2)]
Given an event (data), we can now calculate the probabilities of one or more causes (hypotheses)
Memoir: Applications
• Finding the mean of 3 astronomical observations (e.g., transit times of Venus)
• Problem III: Find the point V on the line AB to fix the “mean” of observations a, b, and c
ƒ What is the law of likelihood?
ƒ What is the location of highest probability?
ƒ How to characterize uncertainty?
Wagers: Quantifying uncertainty
• Laplace had discovered a method to estimate an unknown (“cause”) from data, and also to quantify his degree of belief, in the form of a wager on error bounds
• Motions and masses of Jupiter & Saturn: a big problem for Kepler (tables), Euler (theory), Bouvard (better tables)
ƒ In Mécanique Céleste, Laplace offers a famous bet: 11,000 to 1 odds that Bouvard's results for Saturn were within 1% of the correct answer, and a million to one odds for Jupiter.
ƒ Nobody took Laplace's bet, but today's technology confirms that Laplace would have won both bets.
• This is the early root of what Bayesians call “highest density intervals.”
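Laplace's 1774 principle, in which every cause starts out equally likely, amounts to normalizing the probabilities of the event given each cause. The sketch below is one way to write that down in Python; the function name and the three likelihood values are hypothetical and chosen only for illustration.

```python
def laplace_cause_probabilities(event_given_cause):
    """Return Pr(C_i | E) for causes that are equally likely a priori.

    event_given_cause: list of Pr(E | C_i), one entry per cause.
    Each result is Pr(E | C_i) divided by the sum over all causes.
    """
    total = sum(event_given_cause)
    if total <= 0:
        raise ValueError("at least one cause must make the event possible")
    return [p / total for p in event_given_cause]


# Hypothetical values of Pr(E | C1), Pr(E | C2), Pr(E | C3); not from the source.
likelihoods = [0.8, 0.3, 0.1]
posteriors = laplace_cause_probabilities(likelihoods)

for i, p in enumerate(posteriors, start=1):
    print(f"Pr(C{i} | E) = {p:.3f}")
```

The ratios between the results match the ratios between the inputs, which is the "are to each other as" statement in the Memoir. Unequal prior weights would need the general formula that Laplace wrote down later.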
Bayes Rule is buried
• Laplace (1812) generalized the central limit theorem
(from de Moivre, on the binomial)
ƒ Now, there is an objective rationale for the mean
ƒ Turns to “big data” problems, e.g., estimating the population of
France, the human sex ratio (M/F births)
ƒ Adopts a frequentist approach (judging an event's probability by
frequency among many observations)
• Laplace’s faith in pure mathematics:
ƒ Napoleon: “Newton spoke of God in his book. I have perused yours but failed
to find his name even once. Why?”
ƒ Laplace: “Sire, I have no need of that hypothesis.”
Bayes Rule is buried
• The Age of Data (1800 – 1850)
ƒ Widespread collection of official statistical data
ƒ Population distributions, occurrence of crime, suicide, literacy, poverty, prostitution, …
• “Statistics” becomes the collection of objective facts
ƒ John Stuart Mill denounced probability as “ignorance... coined into science.”
ƒ Florence Nightingale was chided: “The statistician should have nothing to do with causation”
ƒ Bayes theorem particularly BAD: “subjectivity” (prior) became a naughty word.
ƒ Idea of a uniform prior VERY VERY BAD
Who is buried in Bayes’ Tomb?
• Bayes remained largely ignored
• Laplace’s foray into subjective probability was denigrated
• Even in 1891, the Scottish mathematician George Chrystal urged: “[Inverse probability] being dead, [it] should be decently buried out of sight, and not embalmed in text-books and examination papers... The indiscretions of great men should be quietly allowed to be forgotten.”
• But, Bayes would rise again!
• Later, people would confess to be “born-again Bayesians”
Bunhill Fields, London, traditional burial site of “nonconformists”. (Restored by International Bayesian Society)
Bayes slightly rises: The Dreyfus affair
• Alfred Dreyfus, a Jewish French officer, was convicted of treason in 1894
• There were several trials and re-trials, centering on evidence of whether Dreyfus had written a damning letter, the “bordereau”, giving military secrets to Germany
Favorite popular history: Robert Harris, An Officer and a Spy
Bayes slightly rises: The Dreyfus affair
• Henri Poincaré was called to the stand.
ƒ Poincaré was a frequentist
ƒ But, when asked whether Dreyfus had written the
letter, Poincaré invoked Bayes' Theorem as the only
sensible way for a court of law to update a hypothesis with
new evidence
ƒ He proclaimed that the prosecution's discussion of
probability was nonsense.
ƒ Bayes is only now beginning to be argued in courts of law:
DNA evidence
Pr(DNA | guilty)
vs.
Pr(guilty | DNA)
See: Skorupski & Wainer (2015) The Bayesian flip: Correcting the prosecutor’s fallacy, Significance.
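The gap between Pr(DNA | guilty) and Pr(guilty | DNA) can be illustrated with made-up numbers. This Python sketch assumes a hypothetical random-match probability of 1 in a million and a hypothetical pool of a million equally plausible suspects; none of these figures, nor the function name, come from the slides or from Skorupski & Wainer.

```python
def posterior_guilt(prior_guilty, p_match_given_guilty, p_match_given_innocent):
    """Pr(guilty | DNA match) from Bayes' rule with two hypotheses."""
    numerator = p_match_given_guilty * prior_guilty
    denominator = numerator + p_match_given_innocent * (1.0 - prior_guilty)
    return numerator / denominator


# Hypothetical inputs (assumptions for illustration only).
pool_size = 1_000_000               # people who could have left the sample
prior_guilty = 1 / pool_size        # no other evidence: each equally likely
p_match_given_guilty = 1.0          # the guilty person's DNA always matches
p_match_given_innocent = 1e-6       # random-match probability

p = posterior_guilt(prior_guilty, p_match_given_guilty, p_match_given_innocent)
print(f"Pr(DNA | innocent) = {p_match_given_innocent:.6f}")
print(f"Pr(guilty | DNA)   = {p:.3f}")
```

Under these assumptions a match leaves the probability of guilt near one half, not near certainty. Reading the tiny Pr(DNA | innocent) as if it were Pr(innocent | DNA) is the prosecutor's fallacy.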
Dreyfus affair: Resolution
• In the end, politics was more
persuasive than Bayes
• Émile Zola wrote J’Accuse (1898), accusing the military general staff of a massive cover-up
• Dreyfus was convicted again at a retrial in 1899 and pardoned days later; he was exonerated in 1906
Guns and roses
• Bayes finds a haven in the French military (1870-1915)
• Into the breech: Joseph Louis François Bertrand
ƒ Resurrected Bayes for artillery field officers dealing with many
uncertainties: the enemy’s precise location; air density; wind
direction; variations among their hand-forged cannons; and the range,
direction, and initial speed of projectiles.
• General Jean Baptiste Eugène Estienne
ƒ developed elaborate Bayesian tables telling field officers how to aim
and fire.
ƒ Also developed a Bayesian method for testing ammunition:
ƒ Instead of testing a fixed number of shells, could stop testing as soon as evidence indicated the lot was OK [now called sequential testing]