Whitepaper
Kessler Investment Advisors, Inc.
Investment performance in relationship to holding period length
An examination of what isn’t described with common investment performance metrics and a
better way to present performance that helps close the gap
between investor expectations and reality.
July 10, 2012
Eric Hickman
President
[email protected]
+1.303.291.8441
Introduction & Summary
Investment return statistics have a primary purpose of reporting to investors
what their performance has been for a particular period. For this purpose
alone, common return statistics are sufficient; however, despite regulatory
warnings, these same historical returns are used just as often in providing
guidance for the future. Common return statistics are largely inadequate for
this second purpose because they provide only a superficial analysis,
ignoring the relevance of how returns were sequenced through time. This
article intends to show that further analysis, more suitable for this second
purpose, can be applied to an ordinary set of monthly returns.
More specifically, measured nominal or relative returns are quietly
implied to continue, yet, because of inconsistency, they rarely exhibit past
patterns within expected time frames. In practice, returns are often shown
for periods longer than an investor is willing to wait for progress, and
measurements for shorter periods can be wildly different from a
measurement for the entire period. While an asset class or strategy may well
approach an average return over time, there is no description in performance
presentations today of how long one should expect to wait for it. This
perpetuates a wide gap between investor expectations and reality, which,
until closed, makes the suggestion of high returns over short periods seem
plausible and leaves investors, regulators, and even investment
professionals unsure of what is and isn’t possible within an investment.
The annualized return, standard deviation, and their combination in the
Sharpe ratio (and related risk/return ratios such as the Information and
Sortino) are commonly calculated with a set of monthly returns. Perhaps
surprisingly, any sequencing of these returns will give the same results. Take,
as an example, 12 monthly returns, of which 6 are positive and 6 are
negative. If the 6 positive returns occurred in the first six months and the 6
negative returns occurred over the most recent six months, the annualized standard
deviation of monthly returns, annualized return, and Sharpe ratio would be
identical to those obtained if the returns alternated between positive and negative each
month. These two sequences are vastly different experiences: the former
has been losing for the last six months, while the latter has shown progress every
other month. Metrics are needed to distinguish the first from the second:
metrics describing consistency.
Investment performance in relationship to holding period length v1.03 | Kessler Investment Advisors, Inc. | 7/10/2012 | Page 1 of 17
Very little has been written on the topic of consistency; in fact, the seminal article
was written just 2 years ago (Villaverde 2010)¹. Michael Villaverde’s article introduced
the topic and its importance, examined the drawdown measure as a way of describing
consistency, and proposed a novel performance ranking metric.
This article takes a somewhat different approach to the concept, but it also endeavors to
expand on specific concepts raised in the Villaverde article. It attempts to show that,
using commonly available monthly performance data, performance analysis can be greatly
improved to offer a fair, self-explanatory metric that is more closely aligned with
“what good performance should be”. It does so by describing returns in relationship to the
holding period necessary to have some certainty of making them, given a random entry
date. In simpler terms, the question this article seeks to answer is, “how long does it
take for average returns to form?”
This work naturally extends into an empirical dataset of this measure for familiar funds,
asset classes, and indices which establish a calibration to what length of time investors
should expect to wait for returns in different asset classes as well as show how much
consistency investment managers have been able to achieve (either as absolute or
relative performance).
Findings of this article produce broadly relevant, yet often unexpected results such as
given a random investment point, it requires a holding period of 8 years to have a good
expectation (95% probability) to make better than 0% (not lose) in the stock market
(S&P 500 total return), or that it takes 24 years to have a good expectation to make
better than 7% annualized in the same market. The article also tries to find from whom
and how much consistency has been produced amongst well known managers with
published returns and significant assets. In the Winton Futures Fund for instance, it
takes 11 months to have a good expectation to not lose, and about 3 years (3 years, 1
month) to have an expectation to make better than 7%. For Bridgewater’s Pure Alpha
Fund I, it takes a holding period of 2.7 years for an expectation to not lose and 5.7 years
to make better than 7% annualized. These statistics for the ‘best’ are far from what is
often casually implied in the investment industry…some variation of “I can make you
money now”.
This same analysis can be applied to the relative return space with the finding that in
Pimco’s Total Return Bond Fund, it takes about 4 years (3yrs, 11mo.) to have a good
expectation for the fund to beat its benchmark (much more than one year as sometimes
is implied), or 8.4 years to expect for Berkshire Hathaway stock to beat the S&P 500.
Knowing these statistics for the most consistent managers also sets a soft threshold
beyond which fraud or hidden risks should be suspected. Bernard Madoff’s counterfeit
returns are a notable example of returns that exhibit consistency far beyond that of other
known managers. Analyzing returns in this way provides a ready test for this.
Important considerations to this article
While this article strives to advance performance measurement beyond current
methods, the metrics presented herein have the same limitations inherent to any
¹ Villaverde, Michael (2010) 'Measuring investment performance consistency', Quantitative Finance, 10: 6, 565-574, http://dx.doi.org/10.1080/14697688.2010.489683
historical study. Methods in this article are still bound by the regulator-prescribed
phrase “past performance is no guarantee of future results”, and they rely on the
properties of a normal (Gaussian) probability distribution. Historical return analyses
generally assume that returns are random, and to the extent that an underlying
condition of a return series has materially changed, any historical performance analysis
is greatly limited in its utility. However, it should be mentioned that the often-stated
objection to the normal distribution, that it largely underestimates the frequency of
extreme events (tails), weakens significantly as holding periods increase (i.e. excess
kurtosis falls as holding period increases). History is certainly not the future; however,
to the extent that historical performance is only one of a handful of tools to evaluate an
asset class or strategy, this article suggests that it can be better analyzed to improve its
utility in providing guidance.
A note about the performance used and sampled
Investment performance used in this article is total return for each sub-period, and net-of-fees if available. A total return means that the entire investment is available as cash
at that performance level at that point in time. This is specifically defined to exclude
fixed cash-flow streams from bonds or stocks that can be highly consistent, but without
consideration of the price of the security, do not represent what the total investment is
worth as cash. Total return performance is crucial to represent levels at which an
investor could enter or exit the investment.
Also, total returns are the measure in which any investment can be fairly compared to
another, independent of asset class, management style (active vs. passive, growth vs.
value), asset size, etc. No matter what special methods, insights, or promises are
made by an investment professional to make an investment sound appealing, there
should be a set of historical total returns to justify them. Investments compete on merit
alone with a set of total returns.
The main sub-period referred to and used in this article is monthly. There is nothing
mathematically important about monthly returns, but they are used because of the
wealth of empirical data that exists with them. Monthly valuations are now the industry
standard.
The term ‘return series’ as used in this article is defined to be a time-wise contiguous
set of percentage returns over sub-periods of roughly equal length (not strictly equal,
because months have different numbers of days), such that the geometrically linked return
of the sub-period set is equal to the cumulative return of the investment for the entire period.
Date ranges in this article are inclusive for the start and end unless otherwise stated.
The article is divided into five sections:
1. Common investment performance statistics miss vital information
2. What is consistency and why is it important?
3. Limitations to other metrics describing consistency
4. A new metric
5. Studying the dataset
1. Common investment performance statistics miss vital information
For the purposes of this article, ‘common performance statistics’ are defined to be the
annualized return, the annualized standard deviation of monthly returns, and their
combination in various risk-adjusted return ratios: information, Sharpe, and Sortino.
Notably, the maximum drawdown and related Calmar ratio are left out here because
they do lend information about consistency and will be covered in section 3.
A hallmark of these measures is that regardless of how the monthly returns are ordered,
they will produce the same result. By examining different orderings of the same set of
sub-period returns, it becomes evident there is something missing in what they
describe. First shown formulaically:
The annualized return for a set of n monthly sub-period returns {rn} is given by:
$$\text{annualized return} = \left( \prod_{i=1}^{n} (1 + r_i) \right)^{12/n} - 1$$
And because multiplication is commutative, any ordering of {rn} will give the same
result.
Likewise, the annualized standard deviation of n monthly returns {rn} is given by:
$$\text{annualized standard deviation} = \sqrt{\frac{\sum_{i=1}^{n} (r_i - \bar{r})^2}{n}} \times \sqrt{12}$$
And because addition is commutative, any ordering of {rn} will give the same result.
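The order-independence of these statistics can be checked numerically. The following is a minimal sketch rather than the author's own tooling: the function names and the two 12-month sequences are hypothetical, and the Sharpe ratio is assumed to be the annualized return less a cash rate, divided by the annualized standard deviation.

```python
import math


def annualized_return(monthly):
    """Geometric annualized return of a list of monthly returns."""
    growth = math.prod(1.0 + r for r in monthly)
    return growth ** (12.0 / len(monthly)) - 1.0


def annualized_std(monthly):
    """Population standard deviation of monthly returns, scaled by sqrt(12)."""
    n = len(monthly)
    mean = sum(monthly) / n
    variance = sum((r - mean) ** 2 for r in monthly) / n
    return math.sqrt(variance) * math.sqrt(12.0)


def sharpe_ratio(monthly, cash_rate=0.0):
    """Assumed form: (annualized return - cash rate) / annualized std."""
    return (annualized_return(monthly) - cash_rate) / annualized_std(monthly)


# Hypothetical data: the same 12 returns in two different orders.
front_loaded = [0.02] * 6 + [-0.01] * 6   # six gains, then six losses
alternating = [0.02, -0.01] * 6           # gain, loss, gain, loss, ...

for name, series in (("front-loaded", front_loaded), ("alternating", alternating)):
    print(name,
          round(annualized_return(series), 6),
          round(annualized_std(series), 6),
          round(sharpe_ratio(series), 6))
```

Both orderings print the same three numbers (up to floating-point rounding), even though the first has been losing for six straight months.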
Second, visually:
It is illustrative to examine different sub-period orderings of a single set of monthly
returns on a cumulative line chart/VAMI. First, 36 monthly returns were deliberately
generated to have plausible return and volatility characteristics, but with a distinct two-phase consistency behavior. Returns are high in the first couple years but then
generally negative in the last year (fig. 1).
fig.1 Example using randomly generated returns (not from any real historical series)
[Chart: VAMI, or growth of $1,000, of the randomly generated returns over 3 years]

Month   Year 1     Month   Year 2     Month   Year 3
M1       2.16%     M13      3.67%     M25     -3.16%
M2      -1.27%     M14     -1.06%     M26     -3.21%
M3       2.61%     M15      2.23%     M27      0.10%
M4       1.41%     M16     -1.44%     M28      1.21%
M5       4.22%     M17     -1.84%     M29     -0.37%
M6      -0.24%     M18      1.34%     M30     -0.75%
M7       3.07%     M19      4.06%     M31      3.58%
M8       2.81%     M20      1.50%     M32     -2.54%
M9       0.14%     M21     -1.27%     M33     -0.05%
M10      3.71%     M22      3.32%     M34     -2.38%
M11      0.79%     M23      2.83%     M35     -2.30%
M12     -2.41%     M24     -0.36%     M36     -2.65%
Annual  18.12%     Annual  13.47%     Annual -12.02%

Annualized Return                     5.65%
Annualized Standard Deviation         7.92%
Sharpe Ratio (assume 0% cash rate)    0.713
Data Source: Kessler
This shows a rough example of inconsistent returns. Next, by sorting those returns
largest to smallest (or vice-versa) and applying a shuffling algorithm², they can be
ordered such that the cumulative return line is visually much more consistent (fig. 2).
fig.2 Same set of sub-period returns in fig. 1, but deliberately re-ordered
[Chart: VAMI, or growth of $1,000, over 3 years; lines: from fig. 1, and re-ordered to be more consistent]

Re-ordered returns (month labels give each return's original position in fig. 1):

Month   Year 1     Month   Year 2     Month   Year 3
M9       0.14%     M18      1.34%     M1       2.16%
M26     -3.21%     M32     -2.54%     M35     -2.30%
M27      0.10%     M24     -0.36%     M14     -1.06%
M5       4.22%     M13      3.67%     M7       3.07%
M11      0.79%     M4       1.41%     M15      2.23%
M25     -3.16%     M12     -2.41%     M17     -1.84%
M33     -0.05%     M29     -0.37%     M2      -1.27%
M19      4.06%     M31      3.58%     M23      2.83%
M28      1.21%     M20      1.50%     M3       2.61%
M36     -2.65%     M34     -2.38%     M16     -1.44%
M6      -0.24%     M30     -0.75%     M21     -1.27%
M10      3.71%     M22      3.32%     M8       2.81%
Annual   4.64%     Annual   5.87%     Annual   6.44%

                                      From fig. 1    Re-ordered to be more consistent
Annualized Return                     5.65%          5.65%
Annualized Standard Deviation         7.92%          7.92%
Sharpe Ratio (assume 0% cash rate)    0.713          0.713
Data Source: Kessler
Take note that the annualized return, standard deviation, and Sharpe ratio are identical
between the two lines. The point of this example is to show that a single set of
monthly returns, ordered in two different ways and sharing the same annualized return,
annualized standard deviation, and Sharpe ratio, can exhibit starkly different interim
experiences based on how the sub-period returns are sequenced through time.
This concept can be applied to a real-world example by looking at a long-term history of
the S&P 500 total return (dividends re-invested into index), 1926 through April, 2012
(inclusive). The blue line below shows the actual cumulative return experience as
sequenced in time. The green line shows a deliberate re-ordering of the same sub-period returns to maximize consistency (fig. 3).
fig.3 S&P 500 total return index, as experienced, and shuffled for consistency
[Chart: VAMI, or growth of $1,000, 1926 to 04/2012; lines: Actual, and Re-ordered for consistency. Annotation: Same common statistics but vastly different experiences!]

                                                    Actual    Re-ordered for consistency
Annualized Return                                   9.77%     9.77%
Annualized Standard Deviation                       19.17%    19.17%
Sharpe ratio (using 3.58% cash return for period)   0.323     0.323
Data Sources: Global Financial Data, Bloomberg, and Kessler
² A shuffling algorithm has been developed just to show that a drastically more consistent ordering exists. Its details are not
essential to advancing the thesis of this article and thus are not described here.
The green and blue lines in fig. 3 are two entirely different experiences. For instance,
the worst rolling 12-month period in the actual S&P 500 experience is -67.9%, while the worst
rolling 12-month period in the re-ordered series is -25.7%. The worst annualized return
over 10 years in the actual S&P 500 is -5.4%, but in the re-ordered series, the worst is
+5.4% annualized.
In general, the flaw with using one annualized return (the numerator of most risk-adjusted return measures) is that it describes the return for only one entry and one exit
date, when many more are measurable, different, and relevant. Likewise, volatility
measured for monthly periods is just too unstable in practice to have meaning. A rule of
thumb in statistics is that a standard deviation greater than the average indicates a series too
unstable to carry much meaning. The long-term S&P 500 monthly return standard deviation (un-annualized) is
close to 6 times the average monthly return!³
A proposed solution to these problems comes first from considering all possible annualized
returns (varying the holding period length and entry dates for all possibilities) and
second from measuring holding periods long enough that the volatility of a set of
them is stable enough to be meaningful.
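To make "all possible annualized returns" concrete, the sketch below enumerates every entry date for every holding period length in a monthly return series. It is an illustrative assumption about how such a set could be built, not the author's implementation; the function names and the sample series are hypothetical.

```python
def rolling_annualized_returns(monthly, months):
    """Annualized return for every entry date, holding for `months` months."""
    # Cumulative index: index[k] is the growth of 1.0 after the first k months.
    index = [1.0]
    for r in monthly:
        index.append(index[-1] * (1.0 + r))
    return [
        (index[start + months] / index[start]) ** (12.0 / months) - 1.0
        for start in range(len(monthly) - months + 1)
    ]


def all_holding_period_returns(monthly):
    """Map each holding period length (in months) to its set of annualized returns."""
    return {
        months: rolling_annualized_returns(monthly, months)
        for months in range(1, len(monthly) + 1)
    }


# Hypothetical 6-month series, for illustration only.
sample = [0.02, -0.01, 0.03, -0.02, 0.01, 0.02]
for months, returns in all_holding_period_returns(sample).items():
    print(f"{months:>2} months: {len(returns)} entry dates, "
          f"worst {min(returns):+.2%}, best {max(returns):+.2%}")
```

A series of n months yields n(n+1)/2 annualized returns in total, of which the conventional annualized return is only the single full-period value.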
But first, a closer look at the consistency concept.
2. What is consistency and why is it important?
In general, a metric of investment performance consistency is one that measures how
evenly returns have accreted through time. It is useful to consider that perfect
consistency would be identical returns for each sub-period forming a straight-line
cumulative return line/VAMI (viewed on an exponential scale) accreting at the average
annualized return rate. Returns in the real world always fall far short of this. The
question becomes, how short?
It would seem that investment return volatility (standard deviation) would capture
consistency; however, it does not, because it does not describe how the sub-period returns are
distributed through time. Monthly returns are also highly volatile, so the compounding of
successive negative or positive returns can pull the return far away from its long-term
growth trend.
Consistency is of paramount importance in evaluating historical investment
performance, for four reasons:
a. Timing in and out of the investment
The more inconsistent a return series is, the more burden falls on an investor to
time an entry or exit point. A perfectly consistent series with identical sub-period returns would make it irrelevant when an investor entered or exited it.
³ The standard deviation (un-annualized) of the monthly returns from the S&P 500 total return index from 1926 to 04/2012 (inclusive) is
5.53%, the arithmetic average of those same returns is 0.93%. The standard deviation is 5.9 times the average.
b. Flexibility for the investor
Reaching a goal trend of return in a shorter amount of time gives the investor
the most flexibility in the use of the funds in the same or another investment.
c. Positive reinforcement frequency
While the entry and exit dates are the only points of true economic value to an
investor, the experience in between is as important or more so
psychologically. The more frequently mark-to-market valuations confirm a desired or goal trend, the more comfortable the investor is.
d. Leverage suitability
The more consistent a return series is, the more it can be
leveraged without frequent re-balancing. Any return greater than the cost of
financing can theoretically be multiplied with borrowed money. This would be
done infinitely if returns were perfectly consistent, but in the real world, the
combination of inconsistent returns and dynamic margin requirements makes
leverage difficult to apply consistently. As an example, consider the e-mini S&P
500 futures contract. While the current contract (the ESM2 or June 2012) allows
leveraging the S&P 500 more than 15 times (15.57 as of 04/23/12), a study of a
historical portfolio of the active contract (rebalancing leverage to valuation on a
quarterly basis) shows that an investor would have needed to keep leverage at or below
2.2 times to avoid margin calls (insolvency) for the period from 1997 through
2011 (active period of the e-mini S&P 500 contract)⁴. The inconsistency of the
stock market requires a very careful eye on leverage. As a related note,
inconsistency of returns (more specifically, inconsistency of alpha returns) is
guaranteed to always exist in some manner because its absence would represent
a riskless arbitrage; the “Big-Foot” of the investment industry.
3. Limitations to other methods in describing consistency
There are several metrics in use today that directly or indirectly describe the
consistency of a return series.
a. Visually with the cumulative return line/VAMI
In this article, the cumulative return line plot has been used to demonstrate
various attributes of consistency, relying on the eye to detect an overall
“diagonal-ness” to the line. This method is unquestionably helpful; however,
aside from being qualitative rather than quantitative, the cumulative return line
can be misleading in describing consistency:
i. The consistency of a return series viewed as a line plot can appear
either much closer to or much further from that of
another series based on a difference in the time horizon of the x-axis
(fig. 4).
⁴ A historical model was created to obtain these results. Its construction is beyond the scope of this article.
fig.4 Time scale and visual assessment for consistency: two similar looking charts
[Chart 1: S&P 500 Total Return, VAMI, or growth of $1,000, 1926 to 04/2012]
[Chart 2: Bridgewater Pure Alpha Fund I, VAMI, or growth of $1,000, 12/1991 to 04/2012]
At first glance, charts 1 and 2 look similarly consistent; however, chart 1 covers more than
85 years and chart 2 covers just 20.
[Chart 3: S&P 500 Total Return and Bridgewater Pure Alpha Fund I, VAMI, or growth of $1,000, 12/1991 to 04/2012]
When viewed with the same time scale, overlaying chart 1 onto chart 2, for the
shorter time period of chart 2, it shows the real difference in consistency.
Data Sources: Global Financial Data, Bloomberg, and Barclay Hedge
ii. Viewing the chart’s y-axis linearly will skew the appearance of
consistency. A linear axis makes earlier returns look less volatile and
more recent returns look more volatile than was actually the case.
Also, as an isolated coincidence, the Barclay CTA index has exhibited
two distinct phases of performance since inception, one growing with
an approximate trend of 23% annualized to 1990 and then growing at
a much smaller rate of about 6% annualized since then. Coincidentally,
viewing this chart with a linear y-axis makes the chart appear
consistent for the whole period. Viewing it on a more suitable
exponential scale shows the two phases clearly (fig. 5).
fig.5 Linear compared to exponential y-axes
[Chart, linear scale: Barclay Hedge CTA Index, VAMI, or growth of $1,000, 01/1980 to 03/2012. INCORRECT: Barclay CTA Index viewed on a linear y-axis, looks more consistent than it actually is.]
[Chart, exponential scale: Barclay Hedge CTA Index, VAMI, or growth of $1,000, 01/1980 to 03/2012, with a 23% trend and a 6% trend marked. CORRECT: Barclay CTA Index viewed on an exponential y-axis.]
Data Source: Barclay Hedge
b. Villaverde ratio
The novel metric proposed in Michael Villaverde’s article (which is called
the Villaverde ratio here) considers returns for all holding period lengths
and entry points. He suggests a ratio in which the numerator is the
standard deviation of all possible annualized returns for all holding period
lengths, and the denominator is the arithmetic average of this same set. The
problem with it is that differing time period lengths are not comparable.
The geometric averaging calculation for the annualized return is such that
longer periods become exponentially impervious to volatility. Take the
example of the S&P 500 back to 1925. The long-term annualized return is
9.8% (through 04/2012). If a hypothetical -80% sub-period return were
added, the long-term annualized return only drops to 7.7%. If instead, a
-80% sub-period were added to the same annualized return representing
only 10 years, it would affect the annualized return with much greater
magnitude, lowering it to 1.3%.
What happens is that longer series with the same perceived consistency
will rank better (lower numbers) than shorter periods only because of the
inclusion of more stable returns for the longer periods. An example of this
follows (fig. 6).
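The ratio as described in this section can be sketched in a few lines. This follows the description given here (standard deviation of all annualized returns over all holding period lengths and entry points, divided by their arithmetic average) and may differ in detail from the definition in Villaverde's paper; the function names, the use of a population standard deviation, and the sample series are assumptions.

```python
from statistics import fmean, pstdev


def all_annualized_returns(monthly):
    """Flat list of annualized returns for every holding length and entry date."""
    index = [1.0]
    for r in monthly:
        index.append(index[-1] * (1.0 + r))
    n = len(monthly)
    return [
        (index[start + months] / index[start]) ** (12.0 / months) - 1.0
        for months in range(1, n + 1)
        for start in range(n - months + 1)
    ]


def villaverde_ratio(monthly):
    """Std deviation of all annualized returns over their mean (lower is better)."""
    returns = all_annualized_returns(monthly)
    return pstdev(returns) / fmean(returns)


# Hypothetical series: compare it with the same series repeated back to back.
sample = [0.02, -0.01, 0.03, 0.00, -0.02, 0.04]
print(villaverde_ratio(sample))
print(villaverde_ratio(sample * 3))
```

Comparing the two printed values for a given series is one way to probe the length bias discussed above, since repeating a series adds longer and more stable holding periods without changing its month-to-month behavior.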
fig.6 Villaverde ratio improves with just repeated returns
This example shows a set of randomly generated returns for three years (36 months) and then repeated 5 times.
Measuring the Villaverde ratio after each repeat (for 3 years, for 6 years, for 9 years, etc.) shows a decrease in the ratio
(more consistent), yet the mere repeating of a series does not make it more consistent. This illustrates that this ratio will
be biased towards series of greater length even though they may be no more consistent.
[Chart: VAMI, or growth of $1,000, from 0 to 15 years, with the series repeated 1x to 5x]

Times repeated    Villaverde ratio (lower is better)
1x                1.17
2x                0.82
3x                0.68
4x                0.60
5x                0.54
Data Source: Kessler
c. Drawdown/Calmar ratio
The worst drawdown measure and associated Calmar ratio (the information ratio,
but with worst drawdown in the denominator rather than standard deviation)
do take ordered monthly returns into account. However, they can be misleading in two scenarios:
i. The measure only considers one drawdown. It could be logical for an
investor to favor a return series with a large drawdown near inception
and without anything nearly that large since, over a return series with
frequent smaller drawdowns. The Calmar ratio would favor the latter.
ii. Also, if the return series had an uncharacteristic surge of performance,
a reversion to the mean would create a drawdown larger than if the
surge had not occurred to begin with. The drawdown created from
the reversion to a growth trend would seem logically not to be the
same as if the drawdown started from a series that was at its growth
trend. The Calmar ratio would obscure this. This issue is what the
next metric attempts to address.
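For reference, the worst drawdown and a Calmar-style ratio can be sketched as follows. This is a minimal illustration under stated assumptions: drawdown is measured peak to trough on the cumulative index built from monthly returns, the ratio is the annualized return divided by the worst drawdown, and the function names and sample series are hypothetical.

```python
import math


def max_drawdown(monthly):
    """Largest peak-to-trough decline of the cumulative index, as a positive fraction."""
    index = peak = 1.0
    worst = 0.0
    for r in monthly:
        index *= 1.0 + r
        peak = max(peak, index)
        worst = max(worst, 1.0 - index / peak)
    return worst


def calmar_ratio(monthly):
    """Annualized return divided by the worst drawdown (assumed form)."""
    growth = math.prod(1.0 + r for r in monthly)
    annualized = growth ** (12.0 / len(monthly)) - 1.0
    drawdown = max_drawdown(monthly)
    return annualized / drawdown if drawdown > 0 else math.inf


# Hypothetical data: the same 12 returns in two different orders.
front_loaded = [0.02] * 6 + [-0.01] * 6
alternating = [0.02, -0.01] * 6

print(max_drawdown(front_loaded), calmar_ratio(front_loaded))
print(max_drawdown(alternating), calmar_ratio(alternating))
```

Unlike the annualized return and standard deviation, these two numbers change when the same returns are re-ordered. Only the single worst drawdown enters the ratio, which is the first limitation listed above.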
d. Consistency Ratio
In the search for methods to describe consistency, the author constructed a
metric that is appealing theoretically, but requires a single static growth trend
throughout its history to make sense. In practice, a return series that would be
undeniably considered consistent, may exhibit several growth trends
throughout its history. The historical series of Bernie Madoff’s counterfeit
returns illustrates this problem.
The metric is a ratio that takes the annualized return as its numerator like the
Sharpe or information ratios, but uses the standard deviation from the
exponential growth trend as ordered in time as its denominator. The ratio is
given by:
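$$\text{consistency ratio} = \frac{\text{annualized return}}{\exp\left( \sqrt{\frac{\sum_{i=1}^{n} \left( \log_e(\text{index}_i) - \log_e(\text{c.index}_i) \right)^2}{n}} \right) - 1}$$
The denominator is the exponential standard deviation from the average growth trend.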
Where indexn is the index level (VAMI) incremented at each sub-period by the
rate of return and where c.indexn is the index level starting at the same level as
the actual index (i.e.1000), but incremented at each sub-period by the average
monthized return of n monthly returns {rn} given by:
$$\text{monthized return} = \left( \prod_{i=1}^{n} (1 + r_i) \right)^{1/n} - 1$$
This can also be seen visually (figs. 7,8).
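The consistency ratio defined above can be sketched in code. The sketch assumes the denominator is the exponential of the root-mean-square log deviation of the index from its constant-growth trend, minus one, which is a reading of the formula and its "exponential standard deviation" label rather than a confirmed implementation. The function name and sample series are hypothetical.

```python
import math


def consistency_ratio(monthly, start_level=1000.0):
    """Annualized return over the exponential std deviation from the growth trend."""
    n = len(monthly)
    growth = math.prod(1.0 + r for r in monthly)
    annualized = growth ** (12.0 / n) - 1.0
    monthized = growth ** (1.0 / n) - 1.0

    index = trend = start_level
    squared_log_gaps = 0.0
    for r in monthly:
        index *= 1.0 + r           # actual index level (VAMI)
        trend *= 1.0 + monthized   # c.index: same start, constant monthized growth
        squared_log_gaps += (math.log(index) - math.log(trend)) ** 2

    dispersion = math.exp(math.sqrt(squared_log_gaps / n)) - 1.0
    return annualized / dispersion if dispersion > 0 else math.inf


# Hypothetical data: the same 12 returns in two different orders.
front_loaded = [0.02] * 6 + [-0.01] * 6
alternating = [0.02, -0.01] * 6

print(consistency_ratio(front_loaded))
print(consistency_ratio(alternating))
```

Because the trend line is a single constant growth rate for the whole history, a series that grows smoothly at two different rates still accumulates large log gaps. This is the weakness the Madoff example illustrates.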
fig.7 Consistency ratio with S&P 500 example
[Chart: S&P 500 Total Return Index, VAMI, or growth of $1,000, 1926 to 04/2012, showing the dispersion from the average growth trend (c.indexn, gray)]
Annualized Return / Std deviation from Growth Trend = 9.77% / 59.84% = 0.16 consistency ratio
Data Sources: Global Financial Data, Bloomberg, and Kessler
fig.8 Consistency ratio doesn’t pass the ‘Madoff test’
Any measure of consistency should pass the ‘Madoff test’ in that Madoff’s counterfeit returns should rank very high, if
not the highest to indicate that the returns were too consistent. With the consistency ratio, if the return series is
consistent, yet grows at different rates, it will spend its entire history away from its growth trend and thus not rank well
with this metric. Madoff’s counterfeit returns exhibit this problem and deem this ratio too theoretically perfect for use
in the real world.
[Chart: Bernie Madoff's counterfeit returns, VAMI, or growth of $1,000, 12/1990 to 10/2008, with the average growth trend (c.indexn)]
Annualized Return / Std deviation from Growth Trend = 10.56% / 14.95% = 0.71 consistency ratio
Data Source: New Private Bank Ltd.
[Chart: 25yr UST STRIP Index, VAMI, or growth of $1,000, 02/1989 to 04/2012]
Annualized Return / Std deviation from Growth Trend = 11.28% / 15.31% = 0.74 consistency ratio
The 25 year UST STRIP index scores a better consistency ratio than Bernie
Madoff’s returns, yet, visually, it is clear that this should be the other way
around.
Data Source: Ryan Labs, Inc.
4. A new metric
A proposed solution to these limitations lies in considering annualized returns for
periods long enough that their volatility is low enough to make a more meaningful
statement (see the discussion at the end of section 1). As noted in the Villaverde ratio
section, as the holding period over which returns are measured
increases (say 1 month, 2 months, …, n months), the volatility of those returns tends to decrease. Because of
this, longer holding periods tend to imply more certainty of making expected returns.
This feature can be harnessed to make more meaningful statements about investment
performance.
The methodology is to study the set of annualized returns for all possible holding period
lengths at all possible entry point dates (same as in Villaverde’s proposed metric), but
to consider each holding period length set separately, and apply properties of the
probability cumulative distribution function to them. This forms a relationship between
holding period length and annualized returns. The metric is designed to describe
necessary waiting periods to have some certainty of making more than a desired
performance threshold, having invested into the strategy/asset at a random point.
More specifically, the methodology is to find the shortest holding period length t whose
probability to exceed some annualized performance threshold x (say 0%) is greater than
or equal to some confidence level c (say 95%). This holding period length t can then be
translated into the sentence: given a random entry date, it takes a minimum of time t to
have a c probability of making greater than annualized return x in the studied
investment. Formulaically:
Find the minimum holding period t such that
$$P(r_t, x) \ge c$$
where P(rt , x) is the probability of an annualized return with holding period t
exceeding x given by:
$$P(r_t, x) = 1 - \mathrm{CDF}(x, \bar{r}_t, \sigma(r_t))$$
where {rt} is the set of annualized returns for a given holding period t, with mean
\bar{r}_t and standard deviation σ(rt), and CDF(d,e,f) is
the cumulative distribution function of the Gaussian probability distribution, evaluated at d with mean e and standard deviation f.
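The search for the minimum holding period can be sketched directly from this definition. This is an illustrative reading, not the author's implementation: the function names are hypothetical, the population standard deviation is an assumption, holding periods are counted in months, and each holding period length is required to have at least two rolling observations.

```python
from statistics import NormalDist, fmean, pstdev


def rolling_annualized_returns(monthly, months):
    """Annualized return for every entry date, holding for `months` months."""
    index = [1.0]
    for r in monthly:
        index.append(index[-1] * (1.0 + r))
    return [
        (index[start + months] / index[start]) ** (12.0 / months) - 1.0
        for start in range(len(monthly) - months + 1)
    ]


def probability_of_exceeding(returns, x):
    """P = 1 - CDF(x, mean, std) under a Gaussian fitted to the returns."""
    mean, sd = fmean(returns), pstdev(returns)
    if sd == 0.0:
        # Degenerate case: every rolling return is identical.
        return 1.0 if mean > x else 0.0
    return 1.0 - NormalDist(mean, sd).cdf(x)


def minimum_holding_period(monthly, x=0.0, c=0.95):
    """Shortest holding period (months) with P(annualized return > x) >= c, else None."""
    for months in range(1, len(monthly)):  # keeps at least two observations
        returns = rolling_annualized_returns(monthly, months)
        if probability_of_exceeding(returns, x) >= c:
            return months
    return None


# Hypothetical series: gains and losses alternate every month.
alternating = [0.02, -0.01] * 30
print(minimum_holding_period(alternating, x=0.0, c=0.95))
```

The result reads as the sentence given above: given a random entry date, it takes a minimum of t months to have a c probability of making more than x annualized. A return of None means no holding period within the available history reaches the confidence level.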
What follows is a visual walk-through of the creation of the metric using the total return
series of the S&P 500 (figs. 9-12).
fig.9 Rolling annualized returns for the S&P 500
[Figure: three panels, each showing a set of annualized returns on an axis from -6% to 22%,
with the minimum and maximum marked.
  Panel 1: The set of all 10-year-holding-period-length annualized returns for the S&P 500 total return index, 1926 to 04/2012
  Panel 2: The set of all 20-year-holding-period-length annualized returns for the S&P 500 total return index, 1926 to 04/2012
  Panel 3: The set of all 30-year-holding-period-length annualized returns for the S&P 500 total return index, 1926 to 04/2012
  Annotation: In general, dispersion decreases as holding period increases.]
Data Sources: Global Financial Data, Bloomberg

fig.10 All possible returns for all possible holding period lengths,
S&P 500 total return 1926 to 04/2012
This cone-shaped diagram rotates and expands figure 9 to all holding period lengths from 1 month out to 50 years
(monthly granularity). Green is the maximum, red is the minimum, and blue is the variation in between.
[Figure: Annualized Return plotted against Holding Period (0 to 50 years), with the 10yr, 20yr, and 30yr
annualized returns from fig. 9 marked.]
Data Source: Kessler

fig.11 Standard deviation of returns at different holding period lengths for S&P 500 total return 1926 to 04/2012
[Figure: Standard Deviation of Annualized Returns (0% to 50%) plotted against Holding Period (0 to 50 years).
  Annotation: From fig. 10, it follows that standard deviation decreases as holding period increases. Or put
  another way, the certainty of an average return increases as holding period increases.]
Data Source: Kessler

fig.12 The cumulative distribution function of the Gaussian distribution can be applied at each
holding period length, using x = 0%, for S&P 500 total return 1926 to 04/2012
[Figure: Probability of making greater than 0% (50% to 100%) plotted against Holding Period (0 to 50 years),
with the 8-year point marked.
  Annotation: Confidence level c chosen as 95%.
  Annotation: Final result: given a random entry date, it takes a minimum of 8 years to have a 95% probability
  of making greater than 0% in the S&P 500 total return index (1925 to 04/2012).]
Data Source: Kessler
Note: An Excel spreadsheet that calculates this metric for any set of monthly returns is
available by contacting the author.
5. Studying the dataset
This metric has been calculated for many familiar indices and funds from their inception
dates through April of 2012 (if still active). For comparison, the other metrics discussed
in this article have been calculated for each return series. For c, 95% is used (chosen
to represent relative certainty), and the metric is calculated for x values of 0%, 5%, and 7%
(fig. 13, gold box).
fig.13 Table of performance statistics
Column groups: Common Investment Performance Metrics (Annualized Return, Annualized Standard Deviation of
Monthly Returns, Sharpe Ratio) and Statistics describing Consistency (Worst Drawdown through the last column).
The last three columns (gold box) give the years to have a 95% probability to make >x% annualized (lower is better).

| Description | From | To | Time (yrs) | Annualized Return | Annualized Standard Deviation of Monthly Returns | Sharpe Ratio (higher is better) | Worst Drawdown | Calmar Ratio (higher is better) | Villaverde Ratio (lower is better) | Consistency Ratio (higher is better) | Years to >0% ann. | Years to >5% ann. (table sorted by this column) | Years to >7% ann. |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Example Return Series (not real) | | | | | | | | | | | | | |
| Bernard Madoff's Returns | 11/30/90 | 10/31/08 | 17.9 | 10.6% | 2.4% | 2.73 | -0.6% | 16.49 | 0.22 | 0.71 | 0.2 | 0.6 | 4.8 |
| S&P 500 (deliberate re-ordering from fig. 3) | 12/31/25 | 04/30/12 | 86.3 | 9.8% | 19.2% | 0.32 | -31.7% | 0.31 | 1.70 | 0.97 | 1.8 | 1.8 | 3.7 |
| Bluecrest Bluetrend Fund | 03/31/04 | 09/30/11 | 7.5 | 16.7% | 13.9% | 1.05 | -12.6% | 1.33 | 0.74 | 1.80 | 0.9 | 1.6 | 1.8 |
| Kessler Cornerstone Absolute Return Strategy | 05/31/07 | 04/30/12 | 4.9 | 8.7% | 7.7% | 1.00 | -8.2% | 1.06 | 1.11 | 1.47 | 0.9 | 2.6 | 4.5 |
| Nominal Return Series | | | | | | | | | | | | | |
| Winton Futures Fund | 09/30/97 | 04/30/12 | 14.6 | 15.8% | 17.8% | 0.74 | -25.6% | 0.62 | 0.92 | 0.63 | 1.6 | 2.8 | 3.6 |
| Bridgewater Pure Alpha I fund | 11/30/91 | 04/30/12 | 20.4 | 10.7% | 9.6% | 0.78 | -14.2% | 0.75 | 0.66 | 1.19 | 2.6 | 4.6 | 5.5 |
| Pimco Total Return Fund (nominal) | 05/31/87 | 04/30/12 | 24.9 | 8.4% | 4.3% | 1.06 | -5.6% | 1.49 | 0.37 | 0.77 | 0.9 | 5.1 | 16.8 |
| 25yr UST STRIP Index | 01/31/89 | 04/30/12 | 23.2 | 11.3% | 19.9% | 0.38 | -38.9% | 0.29 | 1.45 | 0.74 | 2.7 | 6.2 | 9.8 |
| Hedge Fund Research Composite Index | 12/31/89 | 04/30/12 | 22.3 | 11.3% | 7.0% | 1.11 | -21.4% | 0.53 | 0.59 | 0.25 | 2.6 | 10.0 | 13.7 |
| Recent Barclay Hedge CTA Index (1990 - 2011) | 12/31/89 | 04/30/12 | 22.3 | 5.9% | 8.2% | 0.30 | -10.1% | 0.59 | 0.78 | 0.54 | 1.8 | 10.6 | n/a |
| Barclays Aggregate Bond Index | 01/31/76 | 04/30/12 | 36.2 | 8.2% | 5.6% | 0.52 | -12.7% | 0.65 | 0.45 | 0.37 | 2.1 | 11.1 | 22.8 |
| S&P 500 Total Return (good period, 1940 - 2000) | 12/31/39 | 12/31/99 | 60.0 | 12.8% | 14.5% | 0.59 | -42.6% | 0.30 | 0.60 | 0.21 | 3.1 | 11.1 | 22.6 |
| Berkshire Hathaway (nominal) | 11/30/87 | 04/30/12 | 24.4 | 16.5% | 21.8% | 0.58 | -44.5% | 0.37 | 1.42 | 0.13 | 8.3 | 13.2 | 14.2 |
| Citi Treasury Index | 12/31/79 | 04/30/12 | 32.3 | 8.4% | 5.6% | 0.58 | -6.8% | 1.24 | 0.47 | 0.29 | 1.7 | 13.6 | 25.3 |
| S&P 500 Total Return | 12/31/25 | 04/30/12 | 86.3 | 9.8% | 19.2% | 0.32 | -83.7% | 0.12 | 1.75 | 0.16 | 8.0 | 18.3 | 24.0 |
| Barclay Hedge CTA Index | 12/31/79 | 03/31/12 | 32.2 | 11.0% | 15.0% | 0.40 | -15.7% | 0.71 | 1.46 | 0.11 | 5.3 | 19.9 | 24.0 |
| Commodities (SP/GS total return) | 01/31/70 | 04/30/12 | 42.2 | 9.7% | 20.0% | 0.22 | -67.6% | 0.14 | 1.40 | 0.09 | 7.7 | 21.5 | 38.2 |
| IBM stock total return (dividends re-invested) | 01/31/68 | 04/30/12 | 44.2 | 8.4% | 24.8% | 0.12 | -67.5% | 0.12 | 2.18 | 0.10 | 11.5 | 26.3 | 28.6 |
| Relative Return Series | | | | | | | | | | | | | |
| Pimco over Barclays Aggregate Bond Index | 05/31/87 | 04/30/12 | 24.9 | 1.0% | | | | | | | 3.9 | n/a | n/a |
| Berkshire Hathaway over S&P 500 | 11/30/87 | 04/30/12 | 24.4 | 6.4% | | | | | | | 9.2 | 18.9 | n/a |
| Hedge Fund Research Composite Index over S&P 500 | 12/31/89 | 04/30/12 | 22.3 | 2.6% | | | | | | | 11.1 | n/a | n/a |
Notes:
a. Results can and will change based on the time period studied.
b. Shorter return series should be judged with more caution than longer return series.
c. Results in the gold box that are within a year of the total length of the series use fewer than twelve data points in the
   calculation and are thus not as reliable as statistics derived from larger sets.
d. The analysis here is shown for nominal returns; however, it would be relevant to do this analysis on the alpha returns only,
   to determine how long it takes for a strategy to rise above levels over the risk-free rate.
e. The Sharpe ratio is calculated for each series using the actual risk-free return in effect at the time, using returns from a
   1-month T-bill index.
f. The Bluecrest Bluetrend fund has closed to new investors, and thus performance is only available through September of 2011.
g. ‘n/a’ indicates that the probability threshold was not reached for any holding period.
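Note e can be illustrated with a short Python sketch. The exact formula behind the table is
not stated here, so the version below is an assumption: the annualized (geometric) return of
the series minus that of the T-bill index over the same months, divided by the annualized
standard deviation of monthly returns. The function names are hypothetical, and returns are
decimal fractions.

```python
import math
from statistics import stdev


def annualized_return(monthly_returns):
    """Geometric annualized return of a series of monthly returns."""
    growth = math.prod(1.0 + r for r in monthly_returns)
    return growth ** (12.0 / len(monthly_returns)) - 1.0


def sharpe_ratio(monthly_returns, monthly_tbill_returns):
    """Assumed form: annualized excess return over annualized monthly volatility."""
    if len(monthly_returns) != len(monthly_tbill_returns):
        raise ValueError("both series must cover the same months")
    excess = annualized_return(monthly_returns) - annualized_return(monthly_tbill_returns)
    volatility = stdev(monthly_returns) * math.sqrt(12.0)  # sample std, annualized
    return excess / volatility
```

Using the risk-free returns actually realized over the same months, rather than one fixed
rate, keeps series with different start dates comparable.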
Observations:
1. The biggest ‘tell’ in Bernie Madoff’s returns was the extremely short time period required to make greater than 5%
   annualized: about half of a year, nearly three times as fast as the next best studied, the Bluecrest Bluetrend fund.
2. Even with the most consistent funds and strategies, an investor should be prepared for a negative first year. It is only at
   the 11-month mark that any of these strategies have a 95% chance of being above 0%.
3. The Winton Futures fund is very consistent, yet by traditional statistics (Calmar, Sharpe) it ranks lower than the author
   would suggest it should.
4. The Hedge Fund Research composite index has the highest Sharpe ratio of any series studied here, yet takes 10 years to have
   a good expectation of getting above 5% annualized, a long time. This is an example where the Sharpe ratio may be misleading.
Data Sources: New Private Bank Ltd., Barclay Hedge, Global Financial Data, Bloomberg, Hedge Fund Research, Barclays, Citigroup, Ryan Labs, S&P/Goldman Sachs
This metric and methodology work well for comparison and ranking purposes, but they can
also serve as a guide for investors who have already decided on, or invested in, a strategy.
An investor could be given a cone diagram like those in fig. 10 or 14 and compare their
actual performance for a certain holding period against the historical record. A data point
inside the cone would represent a normal accretion of progress and could help eliminate
the gap between expectations and reality (fig. 14).
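The comparison against the cone can be sketched in Python. This is an illustration only,
assuming monthly returns as decimal fractions and geometric annualization; `cone_bounds` and
`inside_cone` are hypothetical names, and the cone is taken to be the minimum and maximum
annualized return observed historically at the given holding period length.

```python
from math import prod


def cone_bounds(monthly_returns, months):
    """Minimum and maximum annualized return over all entry dates for one holding period."""
    if not 1 <= months <= len(monthly_returns):
        raise ValueError("holding period must fit inside the return history")
    annualized = []
    for start in range(len(monthly_returns) - months + 1):
        window = monthly_returns[start:start + months]
        growth = prod(1.0 + r for r in window)
        annualized.append(growth ** (12.0 / months) - 1.0)
    return min(annualized), max(annualized)


def inside_cone(monthly_returns, months, actual_annualized):
    """True if an investor's annualized return falls within the historical range."""
    low, high = cone_bounds(monthly_returns, months)
    return low <= actual_annualized <= high
```

For example, `inside_cone(history, 36, 0.04)` would ask whether a 4% annualized return after
three years lies within what the series has historically delivered over three-year holding periods.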
Conclusion
Investment professionals and investors make statements in various ways to the effect of
“now is the right time to buy investment x.” As innocent as this might seem, it implies
that someone knows for sure how an asset will behave in the short term. If someone could
do this reliably, it would be seen somewhere in the historical return record, with reliable
returns arriving over much shorter holding periods than those seen in fig. 13.
If the fastest that a strategy or trader can be reliably profitable is about one year (as
shown in fig. 13), then the articulate prognostications made day to day in the financial
media are mostly wrong. Even when someone has a great historical annualized return
figure, there is often no monthly performance record to show how consistent, correlated,
or volatile it was.
Understandably, lengthy calculations (as proposed in this article) might have been too
cumbersome to complete fifteen or twenty years ago, but ordinary computers now
exceed the capacity necessary to calculate investment performance metrics. In fact, an
ordinary computer can complete the calculations necessary for the metric proposed in
this article in about 5-10 seconds.
To the extent that monthly return records are available, further analysis should be done
to give investors a more complete picture of how long one needs to wait to see results,
given that, in all likelihood, they will not enter an investment at an optimal point.
It is also important to point out that this article suggests that the ‘market’ can indeed be
reliably beaten (relative return series section in fig. 13); it just happens at a much slower
pace than most investors and investment professionals hope for.
fig.14 A sampling of holding period vs. annualized returns for several return series
The diagrams below show what span of returns should be expected in a strategy at a given holding period length,
with a random entry date. For comparison purposes, the axes of all charts are the same. Returns over 30% or under
-20% have been cut off to keep the convergent part of the chart (where expectations get high) more visible.
[Figure: eight cone diagrams, each plotting Annualized Return (-20% to 30%) against Holding Period (0 to 20 years).
Legend: maximum return measured, variation in between, minimum return measured.
Panels:
  S&P 500 Total Return, 1926 to 04/2012
  Bridgewater Pure Alpha Fund I, 12/1991 to 04/2012
  Bernard Madoff counterfeit returns, 12/1990 to 10/2008
  Kessler Cornerstone Absolute Return Strategy, 06/2007 to 04/2012
  Bluecrest Bluetrend Fund, 05/2004 to 09/2011
  Pimco Total Return Fund, 06/1987 to 04/2012
  Hedge Fund Research Composite Index, 01/1990 to 04/2012
  Winton Futures Fund, 10/1997 to 04/2012
Annotations: "While subtle, the most unusual feature of this diagram is the asymmetry between maximums and
minimums." and "Negative returns are almost non-existent."]
Data Sources: New Private Bank Ltd., Global Financial Data, Bloomberg, Hedge Fund Research