Bayes for Days - Mawer Investment Management

Bayes for Days
What to do with Signal
On the morning of September 11, 2001, Christian Siva-Jothy and his colleagues at Goldman
Sachs heard that a plane had crashed into the World Trade Center. Siva-Jothy, a pilot as well
as a trader, turned to his colleagues and told them this was no accident; the sky was clear, so
the crash must have been deliberate. He immediately began closing out positions that would
likely fall after a terrorist attack while purchasing options that would give him the right to buy
government bonds in the future.¹ In the days and weeks that followed, investors flocked to
government bonds, as Siva-Jothy had expected.
The story above touches on an important consideration for investors: how we incorporate
signal. Investors are constantly receiving new information, updating their beliefs based
on this information, and then reflecting their views in the positions they take. However,
investors often err in how they weigh new information and, consequently, make sub-optimal
decisions. This may sound like no big deal until you add up the cumulative consequences of
many poor micro decisions over time. We need to ask: what is the best way to deal with new
information?
Fortunately, there is a tool that can help us with this problem: Bayes’ Theorem (Bayes). An
equation invented in the 18th century, Bayes is a formula that provides a rational way to
update beliefs based on new evidence. Unfortunately, this tool is not widely used; even for
those who are aware of it, Bayes can seem unwieldy and unintuitive.
$$P(H \mid E) = P(H) \times \frac{1}{P(E)} \times P(E \mid H)$$
This is the equation. Don’t worry if the math looks a bit daunting; it’s far more
straightforward than it seems.
This needn’t be the case. Even when it is not used in a strict mathematical sense, Bayes can
be a powerful mental model for checking assumptions. On our team, we use Bayes to help
us distinguish signal vs. noise and update our beliefs once new information comes to light.
In fact, when you deconstruct the equation, you realize the whole thing is basically asking
three simple questions—ones that investors should always be asking themselves as they are
excellent gut checks.
While deconstructing a math equation may seem dry at first (“boring,” if you will), it is actually
fascinating and—as you will soon see—fun. Moreover, Bayes provides a framework for clear,
level-headed thinking (a rarity these days, it seems).
Enter Bayes
Imagine that your friend has been dating a girl he met online. He describes his new girlfriend
as nice, pretty and introverted. Your friend is very excited, as he is quiet and loves to spend
Sunday morning reading. Based on this information, do you believe your friend’s new girlfriend
is most likely:
a) a bartender,
b) a librarian, or
c) a sales person?
If you answered librarian, you would be among the majority. You would also most likely be
wrong. When most people read this question and see the girl’s description, “librarian” seems
to be the logical choice. However, the actual probability that she is a librarian is quite low
as there simply aren’t that many in the economy (it’s a small pool vs. bartenders and sales
people).* This example demonstrates how we might benefit from an understanding of Bayes.
Before we examine the equation in depth, we need to take a 10,000-foot view of the theory.
Remember: ultimately, Bayes is a tool that helps us update our beliefs (expressed as
probabilities) when given new pieces of information. Using Bayes, one might answer the
following: What is the chance that Donald Trump will win Michigan and Pennsylvania, given
that he just won Virginia?
Here is the beauty of Bayes; the equation just asks three questions:
1) What is the base rate?
2) How rare is the evidence?
3) How relevant is the evidence to the hypothesis?
That’s it.
* Psychologists Kahneman and Tversky called this the representativeness heuristic (i.e., people tend to
judge the likelihood of an event by how closely it matches their assumptions or previous experiences,
rather than by its base rate).
Question 1: What is the base rate?
A base rate is the prior probability of something happening (often referred to simply as
“priors”). This is an important number to understand, as it has a significant impact on the end
result.
In the librarian example, we would ask: what is the probability that any given worker is employed as a
librarian? In this case, the probability that your friend’s girlfriend is a librarian would be 0.11%. This
is because there are very few librarian jobs (166,164)² in the overall job market (146 million).³
Now you have discovered the base rate: 0.11%.
The equation looks like this:
$$P(H) = \frac{166{,}164 \text{ librarians}}{146{,}135{,}000 \text{ jobs}} \approx 0.11\%$$
Another example could be trying to determine the chance your child has done her homework
at the end of the night. Absent any other information, you would reference her historical
pattern to arrive at your conclusion. So, if she finished her homework 95 times in the last 100
days, the base rate would be 95% that she completed her homework this evening.
It’s critical when using Bayes to start with these historical figures. Only after establishing
the base rate is additional evidence incorporated.
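The two base rates above come down to one division each. The sketch below is a minimal illustration using the figures quoted in the text; the helper name `base_rate` is an assumption introduced here, not something from the original.

```python
def base_rate(occurrences: int, total: int) -> float:
    """Return the share of `total` cases in which the outcome occurred."""
    if total <= 0:
        raise ValueError("total must be positive")
    return occurrences / total


# Figures quoted in the text: librarian jobs vs. all U.S. jobs,
# and homework completed on 95 of the last 100 days.
librarian_rate = base_rate(166_164, 146_135_000)
homework_rate = base_rate(95, 100)

print(f"P(librarian) = {librarian_rate:.2%}")     # prints 0.11%
print(f"P(homework done) = {homework_rate:.0%}")  # prints 95%
```

Either number is only as good as the counts behind it, which is why the choice of reference population matters.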
Question 2: How rare is this evidence?
When signals are rare, it is far more likely that the evidence will be strong.
For example, in the question about your friend’s new girlfriend, we offered several pieces of
evidence about the girl. We said that she was (1) a girl, (2) nice, (3) someone he met online,
(4) pretty, and (5) introverted. In sum, we gave five pieces of evidence.
But was any of this evidence predictive of her being a librarian? Of course not; it’s all far too
common. So what if the girl is introverted? Psychologists estimate that as many as 50% of the
population is introverted.⁴ Moreover, it is unremarkable for someone to be a girl, to be described as
nice and pretty, or to have met a partner online.
We can easily see how this is applicable to investing. Imagine you want to invest in a
wealth-creating business and are looking for management teams that will increase the odds of this.
What kind of signals could you look for?
According to Bayes, you want to look for signals that are rare. If a CEO is under investigation
by the authorities for something as significant as fraud, this could be important—and it’s rare.
In comparison, knowing whether or not a CEO drinks water is superfluous.
Returning to the example of your child’s homework, you know that your child’s base rate for
completing her homework is 95%. However, what if you learned that she forgot her backpack
on the school bus? This would be an abnormal event—rare enough that it probably is worth
paying attention to.
P (E) = "Is the evidence rare?"
When evidence is rare, P(E) is small. Because P(E) sits in the denominator, a smaller value
makes 1/P(E) larger, which drives up the chance of your hypothesis being true.
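To make the arithmetic concrete, the rarity term can be read as a multiplier on the base rate. The two values of P(E) below are illustrative assumptions rather than figures from the text, and the comparison holds P(H) and P(E | H) fixed.

$$\frac{1}{P(E)} = \frac{1}{0.50} = 2 \qquad \text{versus} \qquad \frac{1}{P(E)} = \frac{1}{0.05} = 20$$

All else equal, evidence that shows up in 5% of cases scales the base rate ten times more than evidence that shows up in 50% of cases.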
Question 3: How relevant is this evidence?
Clearly, strong signals are going to be relevant to the hypothesis in question.
Returning to the librarian example, it’s obvious that our pieces of evidence are not really
relevant to the hypothesis. What does a subjective assessment that his girlfriend is “nice”
have to do with the probability that she is also a librarian? Very little.
Likewise, in the example of the search for management teams that would best create
wealth, it’s easy to see how some signals could be relevant versus others. A CEO that is
being investigated for fraud is going to be an important (relevant) signal, whereas it would be
immaterial to learn they are vegan and despise steak. Even if you are passionate about red
meat, it’s pretty hard to draw the conclusion that a CEO is going to run a company poorly
because they don’t eat meat or dairy (unless, perhaps, they are running McDonald’s).
And if your child forgot her backpack on the school bus? This development seems likely to be
relevant, given that the homework is presumably inside said backpack.
P (E|H) = "Is the evidence relevant to the hypothesis?"
How do you actually figure this number out? You need to examine cases where the outcome/
hypothesis (H) is true and see how many times the signal pops up.
For example, in the question about your friend’s new girlfriend, we would examine whether
introversion is related to being a librarian. Thus, we’d ask: in all the times when the hypothesis
was true (in the population of librarians), how many are introverted?
In the example of your child and her homework, we would ask: of all the times when the
hypothesis was true (the homework was done), how many were times when her backpack
WASN’T present?
This last example is interesting because it shows just how powerful the signal portion of
the equation can be. If your child has never completed her homework on a night when her
backpack was missing, then P(E|H) in the numerator would be zero, and the
entire Bayes equation would collapse to zero. Thus, the probability of your kid finishing her
homework tonight, given that she left her backpack behind, would be zilch.
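A small sketch shows the collapse. The 95% base rate and the zero value for P(E|H) come from the example; the 3% chance of a forgotten backpack is a hypothetical number chosen only so the formula has a valid denominator, and the function name `posterior` is likewise an assumption.

```python
def posterior(prior: float, likelihood: float, evidence: float) -> float:
    """Bayes: P(H|E) = P(H) * P(E|H) / P(E)."""
    if not 0 < evidence <= 1:
        raise ValueError("P(E) must be in (0, 1]")
    return prior * likelihood / evidence


p_homework_done = 0.95           # base rate from the text
p_no_backpack_given_done = 0.0   # homework was never finished without the backpack
p_no_backpack = 0.03             # hypothetical: backpack forgotten on 3% of days

result = posterior(p_homework_done, p_no_backpack_given_done, p_no_backpack)
print(result)  # prints 0.0, however high the base rate is
```

In practice, a likelihood of exactly zero is a strong claim; a handful of past observations may simply not have included the rare case yet.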
The Equation
It’s time to put it all together.
Bayes equation:
$$P(H \mid E) = P(H) \times \frac{1}{P(E)} \times P(E \mid H)$$
And here it is, deconstructed:
P(H | E): Probability of hypothesis given evidence
    “Probability that she’s a librarian, given that she’s an introvert”
P(H): What is the base rate?
    “Probability that a person is a librarian”
1 / P(E): How rare is the evidence?
    “Probability of a person being an introvert”
P(E | H): How relevant is the evidence?
    “Probability that someone is an introvert, given that she’s a librarian”
What is the probability that your friend’s new girlfriend is a librarian? Below are the steps
using Bayes.
Step 1: Come up with our best estimate of the probabilities.
We need:
• The base rate of the hypothesis, P(H). The % of librarians in the population.
• P(E). The % of introverts in the general population.
• P(E|H). The % of librarians who are introverts.
To arrive at these probabilities, we’d need to do some research.
For example, to find the base rate—the % of librarians in the population—we could
go to the American Library Association. As of April 2015, the ALA put the number
of librarians in the U.S. at 166,164. This compares to the 146 million jobs in the U.S.
This would make the base rate 0.11%.
To find how rare our evidence is, P(E), we’d look at the % of introverts in the
population. According to the 1998 MBTI Manual, which surveyed Myers-Briggs
personalities in the U.S., nearly half the population is introverted. Thus, we take a
value of 50% for P(E).⁵
Finally, to find the relevancy of the evidence, P(E|H), we’d want to look at the % of
librarians who are introverts. To find this number, we might review the 1992 study by
Mary Jane Scherdin, which surveyed 1,600 librarians to determine their Myers-Briggs
Type Indicator types. This survey found that 63% of librarians are introverts.⁶
Step 2: Run the math.
Next, we’d plug these numbers into the equation. We’d get:
$$P(H \mid E) = 0.11\% \times \frac{1}{50\%} \times 63\%$$
$$P(H \mid E) \approx 0.14\%$$
And now we have an answer. The probability that your friend’s girlfriend is a librarian
given that she is an introvert would be less than 1%!
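The same calculation can be checked in a few lines of code. This is a minimal sketch using the three estimates from Step 1; the function and variable names are assumptions introduced for illustration.

```python
def posterior(prior: float, likelihood: float, evidence: float) -> float:
    """Bayes: P(H|E) = P(H) * P(E|H) / P(E)."""
    if not 0 < evidence <= 1:
        raise ValueError("P(E) must be in (0, 1]")
    return prior * likelihood / evidence


p_librarian = 0.0011                # P(H): base rate of librarians
p_introvert = 0.50                  # P(E): share of introverts overall
p_introvert_given_librarian = 0.63  # P(E|H): share of librarians who are introverts

p_librarian_given_introvert = posterior(
    p_librarian, p_introvert_given_librarian, p_introvert
)
print(f"{p_librarian_given_introvert:.2%}")  # prints 0.14%
```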
What useful insights can we draw from this example? First and foremost, notice
how the base rate simply dominates the answer. The very low base rate—nearly
zero—drives the result. This is one reason why it’s so important to pay attention to
base rates instead of getting swept away in a narrative about a particular “signal.”
Second, in this instance, we saw that the evidence slightly increased the chances
that she was a librarian (although not by much). This actually tells us quite a bit
about how the math works for the right hand side of the equation. The more
a piece of evidence is related to the hypothesis being true, and the rarer the
evidence, the more likely the hypothesis is true.
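One way to see this is to divide both sides of the equation by the base rate, which isolates how much the evidence moves the estimate. The worked line below uses only the librarian figures already given.

$$\frac{P(H \mid E)}{P(H)} = \frac{P(E \mid H)}{P(E)} = \frac{63\%}{50\%} = 1.26$$

Introversion therefore lifts the 0.11% base rate by about 26%, to roughly 0.14%. A ratio below 1 would lower the estimate instead.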
An investing case study: U.S. yield curves
How applicable is this to investing? Very. Every single day we receive new data and must
integrate this information into our beliefs. Bayes is an integral tool in our arsenal for doing so
in a rational manner.
For example, let’s say that you are evaluating the probability of a U.S. recession. With no other
information, what would be your best estimate of the probability of recession next year? The
base rate, of course! In this case, we might say the base rate is 13% (since 1962, the time
period we will use in this example, there have been 7 instances of recessions in the U.S. over
55 years).*
* 7 ÷ 55 ≈ 12.7%, rounded to 13%.
But what if I told you the yield curve was inverted and it had been for months? (FYI reader—
this is not actually the case!) How would you update your beliefs?
If you were using Bayes to answer this question, you would then evaluate the evidence
component and use it to adjust your base rate.
First, you might look at P(E): the probability of the evidence. How often has the yield curve
inverted in the U.S. since 1962? In this case, you’d find that the yield curve has inverted
meaningfully only 12 times in that period, or 22% of the 55 years. Among market signals, this
would be uncommon.
Second, you might look at P(E|H): the relationship between the hypothesis and the evidence.
In how many U.S. recessions did the yield curve invert beforehand? Here, we would find that
a relationship seems to exist: in the 7 recessions since 1962, the yield curve was inverted
every time. This would give us a P(E|H) of 100%.
Therefore, the probability of a recession in the next year, given the yield curve is inverted,
would be 59%. This would be an important signal.
Solving for probability of recession, given yield curve inversion:
$$P(H \mid E) = 13\% \times \frac{1}{22\%} \times 100\%$$
$$P(H \mid E) \approx 59\%$$
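As a cross-check, the sketch below runs the calculation twice: once with the rounded percentages used above, and once with the underlying counts. It assumes the 12 inversions are counted against the same 55 years, which is what the 22% figure implies; the function name is hypothetical.

```python
from fractions import Fraction


def posterior(prior, likelihood, evidence):
    """Bayes: P(H|E) = P(H) * P(E|H) / P(E)."""
    return prior * likelihood / evidence


# Rounded inputs, as in the text: 13% base rate, 22% inversion rate.
rounded = posterior(0.13, 1.00, 0.22)

# Exact counts: 7 recessions and 12 inversions over 55 years.
exact = posterior(Fraction(7, 55), Fraction(1), Fraction(12, 55))

print(f"rounded inputs: {rounded:.0%}")       # prints 59%
print(f"exact counts:   {float(exact):.0%}")  # prints 58%
```

The one-point gap comes purely from rounding the inputs; with P(E|H) at 100%, the exact answer is simply 7 recessions out of 12 inversions.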
Lessons, Caveats and Final Thoughts
Bayes is an extremely useful mental model for investors. Not only does it provide a
theoretically sound process for updating beliefs, it helps us better understand signal vs. noise
and what kind of evidence to look for. Moreover, any investor can gain from systematically
looking for base rates and questioning the worth of the evidence before them. It is significant
that the best forecasters and machine learning systems in the world are fundamentally Bayesian.
As with almost everything in life, there are challenges with using Bayes. One of the greatest
is that the probabilities we seek are often not easy, or possible, to observe. In these cases,
Bayes can quickly become too speculative. The way we counteract this challenge is by going
through many rounds of looking for signal and updating our beliefs. (There is a mathematical
process for doing this, but it is beyond the scope of this discussion.) Another challenge is
that Bayes is based on historical patterns and relationships. These can and do change over
time.
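For readers curious what repeated updating can look like, here is one common approach, offered as a sketch rather than as the process the authors have in mind: apply Bayes once per signal and carry each result forward as the next prior. It assumes the signals are independent of one another once you know whether a recession occurs. The first signal reuses the yield-curve figures from the case study, with P(E | no recession) backed out of the 22% overall rate; the second signal and its probabilities are entirely hypothetical.

```python
def update(prior: float, p_e_given_h: float, p_e_given_not_h: float) -> float:
    """One Bayesian update; P(E) comes from the law of total probability."""
    p_e = p_e_given_h * prior + p_e_given_not_h * (1 - prior)
    if p_e == 0:
        raise ValueError("evidence has zero probability under both hypotheses")
    return p_e_given_h * prior / p_e


belief = 0.13  # recession base rate from the case study

signals = [
    # Inverted yield curve: P(E|H) = 100% from the text;
    # P(E|not H) chosen so that overall P(E) equals the text's 22%.
    (1.00, (0.22 - 0.13) / (1 - 0.13)),
    # Hypothetical second signal, for illustration only.
    (0.80, 0.40),
]

history = []
for p_e_given_h, p_e_given_not_h in signals:
    belief = update(belief, p_e_given_h, p_e_given_not_h)
    history.append(belief)
    print(f"updated belief: {belief:.1%}")
```

Each pass treats the previous answer as the new base rate, so a weak second signal nudges the estimate while a strong one moves it substantially.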
Nonetheless, Bayes is an excellent tool even if it’s imperfect. Clearly, Bayesian analysis (in its
mathematical form) has a place in investing, where it can inform the user on the probability
of an outcome once a new card turns up—particularly since people are generally pretty bad
at coming to statistical answers intuitively.
But even when we aren’t running through the math and just asking ourselves those three
questions, Bayes offers a good gut check and defense against sloppy thinking, which is
essential in a world of rampant storytelling.
Kara Lilly, CFA
Investment Strategist
June 2017
1. Drobny, Steven. Inside the House of Money: Top Hedge Fund Traders on Profiting in the Global Markets. John Wiley & Sons, Inc. Hoboken, New Jersey. 2006.
2. American Library Association - http://www.ala.org/tools/libfactsheets/alalibraryfactsheet02
3. Bureau of Labor Statistics - https://data.bls.gov/timeseries/CES0000000001
4. Myers, I.B., McCaulley, M.H., Quenk, N.L., & Hammer, A.L. (1998). MBTI Manual: A guide to the development and use of the Myers-Briggs Type Indicator (3rd ed.) Palo Alto, CA: Consulting Psychologists Press
5. http://introvertzone.com/ratio-of-introverts
6. Mary Jane Scherdin and Anne K. Beaubien. “Shattering Our Stereotype: Librarians’ New Image,” Library Journal 120 no. 12 (1995): 35-38.