|
|
|
3.3.4 Studies and Observations that Show the Daunting Odds of Stock Picking
The basic problem
with stock picking is revealed when we examine how stock pickers are
unable to beat a market over the long run. In a random and efficient
stock market, active investors are just gambling or playing a game of chance. The
money managers who run actively managed mutual funds are essentially gamblers,
paid by unsuspecting shareholders through a high average annual
fee of about 1.5%.
Figure 3-2 The odds of throwing a two (snake eyes) at the craps table are the same as the results of this study, one in 36. The least likely rolls of a pair of dice are two and 12. The odds in roulette are one in 38 for picking a one-number winner. Gambling in Las Vegas may lead to more success than trying to find a manager who beats a chosen index at the beginning of the period. Says John Bogle, founder of Vanguard: “Investors earn a net return, after all of the costs of our system of financial intermediation. Just as gambling in a casino is a zero-sum game before the croupiers rake in their share and a loser’s game thereafter, so beating the stock and bond markets is a zero-sum game before the intermediation costs, and a loser’s game thereafter.”
To illustrate the daunting odds of success for stock pickers, take a look at these studies. Alfred
Cowles conducted one of the first recorded studies of stock pickers’
performance in a July 1933 article titled, “Can Stock Market Forecasters
Forecast?” He concluded that it was “doubtful.”
Figure 3-3
In a similar analysis by a different firm, using a different database and a slightly different time period, very similar results were found.
The findings of another study by Sharpe titled, “Asset Allocation:
Management Style and Performance Measurement, an Asset Class Factor Model
can Help Make Order out of Chaos” supported the hypothesis that
the average mutual fund cannot beat the market before costs. That’s
because such funds constitute a large and presumably representative part
of the market. Annualized, the mean underperformance is approximately
0.89% per year—an amount that is approximately equal to the costs
incurred by a typical mutual fund. In another study by
Odean and Barber titled, “Too Many Cooks Spoil the Profits: The Performance
of Investment Clubs,” 166 investment clubs were followed from February
1991 through December 1996. Many people belong to investment clubs, which
are touted as a valuable way for investors to learn about the markets.
Of the total investment clubs, 57% underperformed the market.
Henry Blodget took a hard look at active management, and he came to this conclusion: "Academics have essentially proved that active fund management, for the fund customer, is a loser's game. The vast majority of active funds underperform passive benchmarks. So the vast majority of customers of active funds pay billions of dollars in exchange for, at best, nothing."
DFA looked at 31 institutional pension plans with $70 billion in total assets. The firm found that when the returns were properly risk adjusted using the Fama/French Three-Factor Model, at least 95% of the returns were explained by the three risk factors, and the value added by active management was statistically insignificant, even before fees.
Jeff's sobering conclusion was that, "If such a small percentage beat the index, many of them do it with luck and there's no way to identify those that really are brilliantly managed... . Well, that's why index-fund investing is so attractive."
Figure 3-6
Figure 3-6A-1
Figure 3-6A-2
False Discoveries of the Elusive Alpha
The term “alpha” represents the difference between the return on an investment and the return that could have been achieved in an index with identical risk exposure; it is used to quantify a fund manager’s skill. A recent study by Laurent Barras, Olivier Scaillet, and Russ Wermers investigates the presence of true alpha in the results of 2,076 open-end domestic equity mutual funds for the thirty-two years from January 1975 to December 2006. The study, “False Discoveries in Mutual Fund Performance: Measuring Luck in Estimated Alphas,” uses t-statistic hypothesis testing to compare funds’ relative performance, employing a “False Discovery Test” to mitigate the effects of false positive and false negative results, errors that commonly plague statistical analysis. Unlike many previous studies of mutual fund performance, this method allows distinctions to be made between fund results based on luck and those based on skill.
In a July 2008 New York Times article titled “The Prescient Are Few,” journalist Mark Hulbert digs into the results of the landmark study and its implications as described by Prof. Russ Wermers, who headed up the study. “The number of funds that have beaten the market over their entire histories is so small that the False Discovery Rate test can’t eliminate the possibility that the few that did were merely false positives,” says Prof. Wermers, or as Hulbert puts it, “just lucky.”
Figure 3-6B
In a study of the Morningstar Direct database, the same conclusions were reached. Virtually no evidence of stock picking skill was found. A multivariable regression analysis of historical returns was conducted to determine whether or not a fund manager has skill, or to put it in academic speak, reliably delivered alpha. The three variables used were the Fama-French three risk factors of market, size and value. This analysis reveals the extent to which the returns can be replicated with a combination of index funds, as well as the value added or subtracted by the manager (i.e., alpha).
One way to test the claim that a manager can beat a market is to see if we have enough years of performance data to be statistically significant. The statistical test called the Student’s t-test was introduced in 1908 by William Sealy Gosset, who published under the pseudonym “Student,” while working for the Guinness brewery in Dublin, Ireland, to evaluate the quality of the brewery’s ingredients. The t-test can be used to determine if a series of historical returns is reliably superior to a risk-equivalent benchmark. This can determine whether alpha (any return over the benchmark return) is due to luck or skill. A t-stat of 2 or higher indicates that we are at least 95% confident that the manager actually earned a return higher than his benchmark due to skill, with up to a 5% chance that it was due to luck.
In Figure 3-6B-i, the t-test is applied to U.S. equity funds in six different style classifications over a ten-year period. Out of 614 mutual funds that were compared to their risk-appropriate benchmarks, only 80 had positive excess returns. Of those 80, only one (0.16% of the 614) had a t-stat greater than or equal to 2 (signifying skill). But when the time period of that one was extended back to the fund’s November 1991 inception, the t-stat dropped below 2, indicating that skill evaporated.
Figure 3-6B-i
Only one fund (NFJ Allianz Small Cap Value) had a statistically significant positive alpha (t-statistic greater than 2), and when this fund was analyzed over its entire period since inception, the alpha was no longer statistically significant. The chart below shows the excess return of NFJ Allianz Small Cap Value relative to the Russell 2000 Value Index (Morningstar’s designated benchmark). From the average alpha and variability of the alpha, we see that we need 170 years of similar returns to conclude the presence of skill.
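The test described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in the study: the annual returns are invented, the function name `alpha_t_stat` is hypothetical, and the population form of the standard deviation is an assumption (a textbook Student's t-test uses the sample form, which divides by N - 1).

```python
from math import sqrt
from statistics import mean, pstdev


def alpha_t_stat(fund_returns, benchmark_returns):
    """Return (average alpha, t-stat) for paired annual returns in percent."""
    # Alpha is the fund's return minus the benchmark's return, year by year.
    alphas = [f - b for f, b in zip(fund_returns, benchmark_returns)]
    avg = mean(alphas)
    sd = pstdev(alphas)  # population standard deviation (an assumption)
    return avg, avg * sqrt(len(alphas)) / sd


# Hypothetical annual returns in percent, invented purely for illustration.
fund = [12.0, -3.0, 8.0, 15.0, -6.0, 10.0, 4.0, 9.0, -1.0, 7.0]
benchmark = [10.0, -2.0, 9.0, 11.0, -8.0, 9.0, 6.0, 7.0, -2.0, 8.0]

avg_alpha, t = alpha_t_stat(fund, benchmark)
verdict = "consistent with skill" if t >= 2 else "cannot rule out luck"
print(f"average alpha = {avg_alpha:.2f}%, t-stat = {t:.2f}: {verdict}")
```

With these made-up numbers the fund beats its benchmark by 0.7% a year on average, yet the t-stat stays well below 2, so luck cannot be ruled out.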
Figure 3-6B-ii
Another way to view this data is to draw a line that separates statistical significance on an Alpha versus Standard Deviation of Alpha scatter plot. Funds that fall above the line show a 95% chance that they may be skillful. As seen above, after extending the period for the only possible skillful manager, the probability of skill went down the drain.
Figure 3-6C-i
Bill Miller of Legg Mason Capital Management holds the distinction of being the only manager to have ever beaten the S&P 500 index for fifteen consecutive years (1991 to 2005). Unfortunately, his returns after 2005 fell short of the S&P 500, so those of his investors who put their money in after he became well-known discovered the meaning of disappointment. The chart below shows how the Legg Mason Capital Management Value Trust fared against the Russell 1000 Index (Morningstar’s designated benchmark) on a calendar year basis from inception through 2010. From the average alpha and variability of the alpha, we see that we need 269 years of similar returns to anoint Mr. Miller as having stock picking skill.
Figure 3-6C-ii
Two funds that have recently received attention from the financial media are the Yacktman Fund and the Yacktman Focused Fund, both managed by Donald and Stephen Yacktman. The chart below shows the excess return of Yacktman Focused relative to the Russell 1000 Value Index (Morningstar’s designated benchmark). From the average alpha and variability of the alpha, we see that we need 105 years of similar returns to conclude the presence of skill. Well, 105 is certainly better than 269.
Figure 3-6C-iii
For the Yacktman Fund vs. the Russell 1000 Value Index, the average alpha was -1.10%, so no number of years would suffice to conclude the existence of skill.
Unfortunately for them, investors are constantly bombarded with advertisements, market commentaries, and screaming magazine covers telling them what they should do with their money.
Contributing to all the clamor and din is Morningstar’s annual announcement of their awards for “Fund Manager of the Year.” As usual, investors are best served by not paying it any attention. In order to determine whether being named “Fund Manager of the Year” engenders a valid expectation of higher returns for the fund’s investors, Index Fund Advisors ran a statistical test (the t-test) of sixteen domestic equity mutual funds which received this Morningstar recognition to determine if the fund’s outperformance was truly attributable to skill (95% or higher probability) or if it could be explained as luck. For each fund, the performance from the manager’s inception date (or the inception date of Morningstar’s benchmark in two cases) through year-end 2011 was evaluated against the benchmark designated for the fund by Morningstar. The charts below show each fund’s alpha (the difference in returns between the fund and the benchmark) on a year-by-year basis.
Only one of the sixteen funds (about 6%) met the requirement of the statistical test that would suggest ruling out luck as the explanation for the outperformance based on a 95% confidence level. Before you get too excited, however, please note that this fund belongs to the small growth category, which has the lowest expected return per unit of risk of all the different equity style boxes. Among the sixteen funds, the median number of years needed to conclude the presence of skill over luck was 72 years. Five of the funds showed a high enough degree of volatility in their returns (relative to their benchmarks) as to require a minimum of 100 years.
Even when there is a statistical indication of skill in a manager’s performance, it is often confined to a single time period and does not persist beyond it. A perfect example of this is Bill Miller of the Legg Mason Value Trust, who carries the distinction of being the only mutual fund manager to have beaten the S&P 500 for fifteen consecutive years.
Viewing the fifteen-year winning period alone indicates over a 99% probability of true skill, but if we broaden the scope of analysis to his entire tenure, we can no longer statistically conclude the presence of skill over luck.
Figure 3-6D1
Figure 3-6D2
Calculating for t-stat
In calculating the t-stat, the first step is to determine the excess returns the manager earned above an appropriate benchmark. Then we determine the regularity of the excess returns by calculating the standard deviation of those returns. Based on these two numbers, we can then calculate how many years we need to support the manager’s claims. Of the 80 fund managers who had positive excess returns, the average excess return was 0.84% and the standard deviation was 5.64%. To estimate the years needed for statistical significance, you can find the intersection of the average excess return (about 0.8%) and standard deviation (about 5.6%) in the chart below (see data box for point estimates). Then follow the line out, and you can see that 180 years of returns data are needed to establish skill as the reason for the higher returns. The calculator below the chart provides the exact number of years needed. Obviously, no manager has ever managed a fund for 180 years; therefore, we are unable to accept any of these managers’ claims. Alas, managers are mere mortals.
Three Aspects of Performance Chart
The figure below shows the formula to calculate the number of years needed for a t-stat of 2. We first determine the excess return over a benchmark (the alpha), then determine the regularity of the excess returns by calculating the standard deviation of those returns. Based on these two numbers, we can then calculate how many years we need (sample size) to support the manager’s claim of skill.
Sample Size Calculator for Active Manager Alphas
As you see in the calculator above, the t-stat is held at 2. Understanding why a t-stat of 2 or more is considered statistically significant is important.
However, it is even more vital to grasp why bigger t-stats mean the value is more “reliably” different from zero. To begin with, refer to the following equation defining a t-stat:
t-stat = (average × √N) / standard deviation
where N is the number of observations. Decomposing the elements of this equation can demonstrate what leads to bigger t-stats and help instill the intuition behind why a bigger t-stat implies that the observed value is less likely to have a true value of zero.
“Average” is the average of all observations in the sample. This parameter is in the numerator, so as the average increases, so does the t-stat. To illustrate, consider the two data series below:
Series A: 1, 2, 1, 2, 1, 2, 1, 2, 1, 2
Series B: 9, 10, 9, 10, 9, 10, 9, 10, 9, 10
Both have the same number of observations and the same standard deviation. But series A has an average of 1.5 and series B has an average of 9.5. As the average increases, so does the t-stat, meaning it is less likely the true average from series B is actually zero. The intuition here is that a mean further from zero makes it less likely that the true value is in fact zero.
“√N” is the square root of the number of observations. This parameter is also in the numerator, so as the number of observations increases, the t-stat does as well. Consider the two data series below:
Series A: 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2
Series B: 1, 2, 1, 2
Both have the same average of 1.5 and the same standard deviation of 0.5, but series A has 20 observations and series B only has 4. As the number of observations increases, so does the t-stat, and the observed average becomes more reliable. In this example, series A has a t-stat of 13.4 and series B has a t-stat of 6 due to the difference in the number of observations. This means series A is more reliably different from zero than series B. The intuition here is that a larger number of observations results in more reliability.
“Standard deviation” is a measure of how much the individual observations in the sample vary from the average. This parameter is in the denominator, so as the standard deviation decreases, the t-stat increases. Consider the two data series below:
Series A: 9, 10, 9, 10, 9, 10, 9, 10, 9, 10
Series B: 18, 0, -18, 32, 10, -20, 40, 15, 8, 10
Both have the same 9.5 average and the same number of observations, but series A has much less volatility and a lower standard deviation than series B. As the standard deviation increases, the t-stat decreases, so the average from series B is less reliably different from zero than the same average from series A. Said differently, there is a greater likelihood the 9.5 average from series B happened by chance due to the volatility of the data series. The intuition here is that a more volatile data series results in a mean that is less reliably different from zero.
Don't trust an alpha or average return without a t-stat.
The Fama and French Risk Premiums are good examples of the use of the t-stat. Based on the long term data, there has been an excess return for exposure to these risk factors, referred to as the US Equity Premium (Total Market minus the risk-free 30-day T-Bill), the US Value Premium (High Book to Market minus Low Book to Market), and the US Size Premium (Small Companies minus Big Companies). An important consideration for investors is the likelihood that these risk “premiums” are actually zero (i.e., there is no premium) despite a historical mean that is positive. As discussed, the starting point is calculating a t-stat for each return series as outlined in Table 1 below. The t-stats in Table 1 are all considered statistically significant (i.e., greater than 2), and we can be almost 99% sure that all three risk premiums are positive, with only the SMB (size premium) t-stat being marginally lower than the 2.6 required for that level of significance.
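The three comparisons above (average, number of observations, standard deviation) can be reproduced with a short Python sketch. It assumes the population form of the standard deviation, since that is what yields the 0.5 quoted for the 1, 2 series; the function names are hypothetical. Solving the same equation for N gives the number of years needed for a chosen t-stat, shown here with the 0.84% average excess return and 5.64% standard deviation cited earlier.

```python
from math import sqrt
from statistics import mean, pstdev


def t_stat(series):
    """t-stat = average * sqrt(N) / standard deviation, as defined in the text."""
    # pstdev is the population form; it matches the 0.5 quoted for the 1, 2 series.
    return mean(series) * sqrt(len(series)) / pstdev(series)


def years_needed(avg_alpha, sd_alpha, target_t=2.0):
    """Solve the same equation for N: N = (target_t * sd / average) ** 2."""
    if avg_alpha <= 0:
        return None  # a non-positive average alpha never reaches the target
    return (target_t * sd_alpha / avg_alpha) ** 2


# Data series taken from the examples in the text.
examples = {
    "average 1.5, 10 obs": [1, 2] * 5,
    "average 9.5, 10 obs": [9, 10] * 5,
    "average 1.5, 20 obs": [1, 2] * 10,
    "average 1.5, 4 obs": [1, 2] * 2,
    "average 9.5, volatile": [18, 0, -18, 32, 10, -20, 40, 15, 8, 10],
}

for label, series in examples.items():
    print(f"{label}: t-stat = {t_stat(series):.1f}")

print(f"years needed for a t-stat of 2: {years_needed(0.84, 5.64):.0f}")
```

Run as written, the 20-observation and 4-observation series should give 13.4 and 6.0, matching the figures in the text. The volatile series falls below 2 despite its 9.5 average, and the years-needed result comes out at about 180.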
All three data series have the same number of observations, so differences in their t-stats will be a function of different means and standard deviations, as illustrated in Table 2 below. As you can see, the equity premium is the most reliable (i.e., most reliably different from zero) despite having the highest volatility because it has a significantly higher mean to go with it. Conversely, the size premium is less reliable than the value premium despite having nearly the same volatility because it has a lower historical mean.
In “Challenge to Judgment,” Paul Samuelson dismisses investors who claim they can find benchmark-beating managers by saying, “They always claim that they know a man, a bank, or a fund that does do better. Alas, anecdotes are not science. And once Wharton School dissertations seek to quantify the performers, these have a tendency to evaporate into thin air—or, at least, into statistically insignificant t-statistics.”
Although a few managers will occasionally appear to have reliably delivered alpha, IFA cautions investors that the fact that there are so many managers virtually guarantees that there will be some who appear to have demonstrated true skill. Unfortunately, the number of such managers is no higher than what we would have if all of them were monkeys throwing darts at the Wall Street Journal. Two studies that elegantly address this point are:
Rob Silverblatt of U.S. News and World Report spoke with Eugene Fama about the implications of the “Luck versus Skill in the Cross Section of Mutual Fund Alpha Estimates” study conducted by Fama of the University of Chicago and Kenneth French from Dartmouth, which casts serious doubt on managers’ ability to generate alpha. Here is his interview: Figure 3-6C
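The point that a large population of managers virtually guarantees some apparent winners can be illustrated with a small simulation. This is a rough sketch only and not the bootstrap method of the Fama and French study: every simulated manager has a true alpha of zero, annual alphas are assumed to be normally distributed, and the 614 managers, ten years, and 5.64% standard deviation are borrowed from the figures cited earlier purely as illustrative inputs. The function name `count_lucky` is hypothetical.

```python
import random
from math import sqrt
from statistics import mean, pstdev


def count_lucky(num_managers=614, years=10, sd_alpha=5.64, seed=0):
    """Count zero-skill managers whose t-stat reaches 2 by chance alone."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    lucky = 0
    for _ in range(num_managers):
        # True alpha is zero: any apparent outperformance is pure noise.
        alphas = [rng.gauss(0.0, sd_alpha) for _ in range(years)]
        t = mean(alphas) * sqrt(years) / pstdev(alphas)
        if t >= 2:
            lucky += 1
    return lucky


lucky = count_lucky()
print(f"{lucky} of 614 zero-skill managers reached a t-stat of 2 by chance")
```

Under these assumptions a few percent of managers should clear the threshold even though none of them has any skill, which is why a handful of apparent winners in a large field is not evidence of skill.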
A Stock Picker's Defeat
Even professional stock pickers can fall hard. Bill Miller, chief investment officer of Legg Mason Capital Management and portfolio manager of the Legg Mason Capital Management Value Trust and Value Equity Strategy, lost his Midas touch after a long stretch of beating the S&P. On November 17, 2011, the company announced that Miller would be stepping down effective April 30, 2012. A former Morningstar “Fund Manager of the Decade,” Miller seemed to glitter throughout the 90’s only to have his sparkle go dim towards the end of the following decade. His fund grew from $750 million in 1990 to more than $20 billion in 2006. As of November 16, 2011, total assets were down to $2.8 billion. His Legg Mason Value Trust Fund (LMVTX) is portrayed in Figures 3-A, 3-B and 3-C, showing the risk and return results of his fund for three different time periods, compared to various indexes and index portfolios: Figure 3-A for the decade of the 90s through 2000; Figure 3-B for the ten years from 2001 to 2010; and Figure 3-C for the 28 years and 8 months since the inception of the LMVTX fund.
Figure 3A
Figure 3B
Figure 3C
Figure 3-B shows just how hard the mathematics hit Miller. Despite the fact that his so-called “streak” showed him to outperform the S&P 500 for a 10-year period, Miller’s subsequent 10-year returns from 2001 to 2010 pale in comparison to the indexes and index portfolios shown. Miller’s outperformance and subsequent underperformance were the result of his excessively risky bets on concentrated investments among highly correlated stocks. While equity index portfolios invest across many asset classes and invest in as many as 12,000 companies in 40 different countries, Miller’s strategy was to “place big bets on stocks other investors feared,” as a Wall Street Journal article, “The Stock Picker’s Defeat,” put it. According to the December 2008 article, “Mr.
Miller was in his element [a year ago] when troubles in the housing market began infecting financial markets. Working from his well-worn playbook, he snapped up American International Group Inc., Wachovia Corp., Bear Stearns Cos. and Freddie Mac. As the shares continued to fall, he argued that investors were overreacting. He kept buying.” The article continued, “What he saw as an opportunity turned into the biggest market crash since the Great Depression. Many Value Trust holdings were more or less wiped out. After 15 years of placing savvy bets against the herd, Mr. Miller had been trampled by it.” Miller stated, “The thing I didn’t do, from Day One, was properly assess the severity of this liquidity crisis... I was naïve… Every decision to buy anything has been wrong…It’s been awful.” Not only did the assets themselves plummet, but investors bailed on the fund pushing its assets down from its apex of $21 billion to around $4.2 billion.
See this article for more Lessons from Bill Miller: Don't concentrate, don't style drift, and nobody can beat a risk adjusted market over long periods. Invest right, sit tight. Also see the Quote of the Week #45.
3.3.5 Stock Pickers and Coin Flippers
The attempt
to predict the outcome of a coin toss is a futile endeavor. Unless
the coin is rigged, the only way to make a correct prediction is to
guess blindly. Unfortunately, it is with the same disregard for investors’
financial health that the financial institutions and media perpetuate
the false idea that some people have a gift or method for predicting
future stock price gyrations.
|
|