Benchmark buyer beware: How well do you know your index?

While benchmarks are important tools for portfolio managers and investors alike, we challenge the conventional wisdom that they are paradigms of investing excellence. Despite the effective marketing campaigns that have brought benchmarks into the public consciousness and attracted significant capital to passive investment strategies, few investors fully understand how these benchmarks are calculated or what they represent. We dissect the benchmark construction process and reveal how decisions made by index providers can lead to unintended factor exposures in various benchmarks. Using the Russell Midcap® Value Index as our primary example, we find evidence of size, momentum and sector tilts, as well as outsized exposure to interest rates, clientele effects and low-quality businesses. We demonstrate that without full understanding of benchmark construction, evaluations of active-manager performance are unreliable. Moreover, factors that have led to outperformance by index funds in recent years could easily reverse.


INTRODUCTION
'This is the darkest of days', Morningstar's Vice President of research, John Rekenthaler, recently warned The Wall Street Journal. 'Active management has never been in worse repute'. According to Rekenthaler's firm, 78 per cent of active domestic equity managers trailed their relevant benchmarks in 2014. 1 Despite strong relative returns from active strategies in 2013, critics cite the 2014 data as evidence that active managers lack sufficient investment skill to justify their fees. 2 Fund investors appear to agree. In 2014, actively managed domestic equity funds saw US$13 billion in net withdrawals, while their passively managed counterparts received US$244 billion in net subscriptions. 3
The superiority of passive investing seems to have become conventional wisdom, but we challenge the view that the benchmark is the archetype of investing excellence. Despite the effective marketing campaigns that have brought benchmarks into the public consciousness, few investors fully understand how these benchmarks are calculated or what they represent.
Even some of the oldest market benchmarks are poorly understood. Take the iconic Dow Jones Industrial Average® (DJIA). Journalists and politicians frequently refer to the DJIA as a gauge of the overall US stock market and economy, even though it is a price-weighted index. 4 The change in the DJIA from one period to the next only provides an estimate of the US stock market return if each of its constituent companies has an equal number of shares outstanding, which is obviously not the case. 5 Clearly, failure to understand how a benchmark is constructed can lead to improper inferences about its meaning, and the DJIA is but one example. With the plethora of indexes available in the marketplace, and the proliferation of funds that mimic them, the potential for misunderstanding is significant.
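The gap between a price-weighted and a capitalization-weighted measure can be illustrated with a small sketch. The three stocks, their prices and their share counts below are entirely hypothetical, and the divisor that the DJIA applies is ignored because it cancels out of a period return.

```python
# Hypothetical three-stock market; every number here is made up for illustration.
stocks = {
    # name: (start price, end price, shares outstanding)
    "A": (100.0, 110.0, 10_000_000),
    "B": (20.0, 19.0, 500_000_000),
    "C": (50.0, 50.0, 100_000_000),
}


def price_weighted_return(data):
    """Period return of an index built from the simple sum of share prices."""
    start = sum(p0 for p0, _, _ in data.values())
    end = sum(p1 for _, p1, _ in data.values())
    return end / start - 1.0


def cap_weighted_return(data):
    """Period return of an index weighted by market capitalization."""
    start = sum(p0 * shares for p0, _, shares in data.values())
    end = sum(p1 * shares for _, p1, shares in data.values())
    return end / start - 1.0


pw = price_weighted_return(stocks)
cw = cap_weighted_return(stocks)
print(f"price-weighted: {pw:+.2%}  cap-weighted: {cw:+.2%}")
```

In this made-up market the high-priced but small company A drives the price-weighted figure up, while the large company B pulls the capitalization-weighted figure down. The two measures coincide only when every company has the same number of shares outstanding.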
In this article, we examine the purpose of benchmarks, discuss how benchmarks are selected in practice, and demonstrate why benchmark construction matters to both active and passive investors. We show how to identify tilts in benchmarks and explain why no benchmark is inherently good (or bad). Using the case of a mid-cap value manager, we highlight some shortcomings of blindly relying on the natural choice of a value benchmark. We conclude that benchmarks can be useful tools for estimating opportunity cost and evaluating manager skill, but they are only beneficial if they are well-understood.

PURPOSE OF BENCHMARKS
A well-designed benchmark provides information that is useful to investors and fund managers. Generally speaking, a benchmark allows for the calculation of an average return on a basket of assets. For example, the S&P 500® Index measures the average return on roughly 500 of the largest public companies in the United States, weighted by market capitalization. The return on a benchmark provides the basis for measuring opportunity cost (the return one would have earned had he invested in that pool of assets at their benchmark weights), which is helpful for evaluating asset allocation decisions. Moreover, the benchmark provides the basis for evaluating the performance of an active manager. The relative risk-adjusted return, or alpha, of a portfolio versus an appropriate benchmark is a helpful quantitative measure of its manager's skill.
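The article does not specify how alpha is estimated. One common approach, assumed here purely for illustration, is a single-factor least-squares fit of the portfolio's excess returns on the benchmark's excess returns. The function name `alpha_beta` and the return series are hypothetical.

```python
from statistics import mean


def alpha_beta(portfolio_excess, benchmark_excess):
    """Return (alpha, beta) from a one-factor least-squares fit.

    Both inputs are per-period returns in excess of the risk-free rate.
    Alpha is expressed per period, in the same units as the inputs.
    """
    if len(portfolio_excess) != len(benchmark_excess):
        raise ValueError("return series must have the same length")
    mean_p = mean(portfolio_excess)
    mean_b = mean(benchmark_excess)
    cov = sum((p - mean_p) * (b - mean_b)
              for p, b in zip(portfolio_excess, benchmark_excess))
    var = sum((b - mean_b) ** 2 for b in benchmark_excess)
    if var == 0:
        raise ValueError("benchmark returns have no variation")
    beta = cov / var
    alpha = mean_p - beta * mean_b
    return alpha, beta


# Hypothetical monthly excess returns, for illustration only.
benchmark = [0.020, -0.010, 0.030, 0.000, 0.015, -0.020]
portfolio = [0.018, -0.004, 0.027, 0.003, 0.012, -0.011]
alpha, beta = alpha_beta(portfolio, benchmark)
print(f"alpha per period: {alpha:.4f}, beta: {beta:.2f}")
```

The estimate is only as meaningful as the benchmark it is measured against: swapping in a different benchmark series changes both numbers, which is why benchmark choice matters for any inference about skill.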
Outside of the special case of the 'market portfolio' of the capital asset pricing model introduced by Sharpe (1964), a benchmark does not necessarily represent the optimal portfolio of assets within a given opportunity set. Stambaugh (1982), Fama and French (1998) and others show that even broad market indexes fail to exhibit mean-variance efficiency. Narrowly defined benchmarks, such as sector- and size-constrained indexes, are even less likely to be optimal in the mean-variance sense.

BENCHMARK SELECTION
A manager selects a benchmark based on his investment strategy and opportunity set. For example, a US equity fund manager might select a benchmark based on the strategy's: (1) objective (for example, income versus capital appreciation), (2) target size (for example, large-cap versus small-cap), (3) style (for example, growth versus value) and (4) breadth (for example, diversified versus sector-focused).
There are myriad benchmarks from which to choose, although a handful of index providers, such as S&P Dow Jones, Russell Investments and MSCI, produce the most widely referenced benchmarks for US equity funds. Selecting a lesser-known benchmark can affect the marketability of a fund, so most managers opt for a brand name, as shown in Table 1.
Benchmark selection is a serious matter. Once a fund selects a benchmark, it is difficult to switch. A domestic mutual fund must obtain board approval and file with the SEC to change its benchmark. Plus, the fund must report its performance versus both its new and former benchmarks for a period of time. Even if a manager takes the appropriate legal steps to effectuate a benchmark change, many investors will look askance at a benchmark change and assume that it is simply a cosmetic treatment for poor performance.
An investor typically establishes his investment objectives and selects a manager accordingly. For example, a pension manager might allocate a portion of his assets to a specific class, such as mid-cap domestic equities. If he intends to measure his performance against the Russell Midcap® Index, he would ideally select a fund that uses the same benchmark.
As performance relative to the benchmark is viewed as one of the primary barometers of manager skill, selecting an appropriate benchmark is critical for both the manager and investor. In selecting a benchmark, a manager balances the need for accuracy of measurement with a desire to outperform. Choosing a benchmark that closely matches his strategy and opportunity set helps him avoid exposure to unwanted risk factors that might lead to underperformance. However, Kroah (2011) suggests that many managers draw significantly from securities outside of their chosen benchmarks. The implication is that these managers deliberately select misaligned benchmarks with the goal of generating outperformance that is unrelated to their skill. A common example of this is a manager who uses a large capitalization benchmark while investing heavily in small capitalization stocks, seeking to take advantage of the small-stock effect documented by Fama and French (1993). A more extreme example is a manager who knowingly selects an inappropriate benchmark to increase the chances of differentiating his performance with outlier returns. This manager hopes that investors will mistake such outperformance for skill.
Recognizing the competing interests of managers, prudent investors take several steps to confirm that benchmark choices are appropriate. First, they evaluate how a benchmark aligns with a fund's stated investment objectives. For instance, a fund designed to invest in high-yield corporate bonds that uses a government bond index rightly arouses suspicion. Second, investors compare the fund's holdings with its benchmark constituents to determine if the manager's security selections are drawn from his intended universe. Holdings of foreign companies by a domestic equity manager, for example, could be an indication of style drift that warrants further examination. Third, shrewd investors devote time to understanding how a manager's process and philosophy might produce returns that deviate from a given benchmark. Consider the manager who states, ex ante, that he intends to avoid investments in highly regulated businesses, such as banks and utilities, because of his belief that such businesses produce inferior risk-adjusted returns through the full economic cycle. This manager could expect his performance to differ significantly from a benchmark that contains large weightings in these businesses. Such deviations would reflect the manager's self-imposed constraints but not his skill.
In the absence of a more appropriate benchmark, both parties would need to understand the benchmark's construction to properly evaluate the manager's performance.
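The second check described above, comparing a fund's holdings with its benchmark constituents, can be sketched as a simple calculation of how much portfolio weight sits outside the benchmark. The tickers, weights and the function name `off_benchmark_weight` are hypothetical.

```python
def off_benchmark_weight(holdings, benchmark_constituents):
    """Total portfolio weight held in securities that are not in the benchmark.

    holdings: mapping of ticker -> portfolio weight (weights sum to 1.0)
    benchmark_constituents: set of tickers in the benchmark
    """
    return sum(weight for ticker, weight in holdings.items()
               if ticker not in benchmark_constituents)


# Hypothetical fund and benchmark, for illustration only.
fund = {"AAA": 0.40, "BBB": 0.35, "XXX": 0.15, "YYY": 0.10}
benchmark = {"AAA", "BBB", "CCC", "DDD"}

outside = off_benchmark_weight(fund, benchmark)
print(f"weight outside the benchmark: {outside:.0%}")
```

A large figure does not prove that the benchmark is inappropriate, but it signals that part of any relative return comes from securities the benchmark cannot hold.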

BENCHMARK CONSTRUCTION
In order to properly use a benchmark, one must understand how it is constructed. 6 For example, consider the Russell US Equity Indexes, a family of market cap-weighted indexes widely used by managers. Each June, Russell ranks, in descending order of market capitalization, all of the common stocks of US companies. The top 3000 stocks in this list become the constituents of the Russell 3000® Index. The largest 1000 stocks become the Russell 1000® Index (a prominent benchmark for large-cap equity funds), and the next 2000 stocks in this list become the constituents of the Russell 2000® Index (a popular small-cap benchmark). Companies that rank 201st to 1000th in size, which are a subset of the Russell 1000, become the Russell Midcap Index. These are examples of primary indexes.
Russell further refines its primary indexes by style. One such style distinction is growth versus value. This is where the construction methodology gets highly technical. Russell takes all of the constituents of a primary index, such as the Russell Midcap Index, and sorts them based on two fundamental attributes, historical sales growth and forecasted EPS growth, and one valuation attribute, the book-to-price (B/P) ratio. Russell uses the fundamental attributes to determine whether a company possesses growth characteristics, and it uses the B/P ratio to assess value. 7 It then applies a 'nonlinear probability method' to assign companies to the growth and value benchmarks (Russell Investments, 2014). 8
Russell's process has some important nuances. For one, a company can be represented in both the Russell Midcap® Growth Index and the Russell Midcap Value Index. 9 In fact, the average of the company's weights in the value and growth indexes is its weight in the primary index. As all companies are fully represented by the combination of their growth and value weights, if a company has 80 per cent of its weight in the value index, it has 20 per cent of its weight in the growth index. This concept is illustrated in Table 2. Likewise, a company with equal growth and value prospects has the same weight in each of the stylized indexes as in the primary index.
Those companies with strong (weak) growth attributes and weak (strong) value attributes are entirely assigned to the growth (value) index. A company appearing in only one style index has twice the weight in that index as compared with its weight in the primary index. Companies with strong growth and strong value attributes are assigned to both indexes. Interestingly, companies with neither strong growth nor strong value attributes are also assigned to both indexes. A company that is assigned to both styles has less weight in each stylized index relative to its weight in the primary index as compared with a similar company assigned to only one index. Russell's treatment of overlapping companies is common among providers of stylized indexes. However, in an effort to create factor portfolios that are not diluted by overlapping stocks, S&P Dow Jones launched a Pure Style Index Series that contains only companies that possess strong tendencies toward growth or value.
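The weighting arithmetic described in this section can be sketched as follows. This is not Russell's nonlinear probability method: the value fractions are simply taken as given, the four companies are hypothetical, and the example is built so that the value and growth indexes each capture half of the primary index's capitalization, which is the condition under which the doubling and averaging relationships hold exactly.

```python
def style_weights(primary_weights, value_fraction):
    """Split primary-index weights into value and growth index weights.

    primary_weights: company -> weight in the primary index (sums to 1.0)
    value_fraction:  company -> share of its capitalization assigned to value
                     (the remainder is assigned to growth)
    Returns (value_index_weights, growth_index_weights), each summing to 1.0.
    """
    value_raw = {c: w * value_fraction[c] for c, w in primary_weights.items()}
    growth_raw = {c: w * (1.0 - value_fraction[c])
                  for c, w in primary_weights.items()}
    value_total = sum(value_raw.values())
    growth_total = sum(growth_raw.values())
    value = {c: x / value_total for c, x in value_raw.items() if x > 0}
    growth = {c: x / growth_total for c, x in growth_raw.items() if x > 0}
    return value, growth


# Hypothetical primary index of four equally weighted companies.
primary = {"A": 0.25, "B": 0.25, "C": 0.25, "D": 0.25}
fractions = {"A": 1.0, "B": 0.0, "C": 0.8, "D": 0.2}  # assumed, not Russell data

value, growth = style_weights(primary, fractions)
print("value index: ", {c: round(w, 2) for c, w in value.items()})
print("growth index:", {c: round(w, 2) for c, w in growth.items()})
```

Company A, assigned entirely to value, carries twice its primary weight in the value index, while company C, split 80/20, has value and growth weights whose average equals its primary weight.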

BENCHMARK DESIGN IMPLICATIONS
The case of the Russell Midcap Value Index
After gaining familiarity with the construction process, one can begin to consider its implications for the benchmark. The Russell Midcap Value Index proved to be a challenging benchmark to beat in 2014, 10 so it provides an interesting case study. To analyze the Russell Midcap Value Index, we start with the benchmark from which it is derived, the Russell Midcap Index. The simplest observation about the Russell Midcap Index is that a large portion of the benchmark is concentrated in a small number of companies. Roughly half of the index weight consists of about a quarter of the constituents. This is a common feature of market cap-weighted indexes. 11 Moreover, because of its definition as the smallest 800 names in the Russell 1000 Index, the Russell Midcap Index includes many companies that would traditionally be considered large-cap. At the end of 2014, about one sixth of the constituents of the Russell Midcap Index had market capitalizations greater than $15 billion, and approximately a third of the weight in the benchmark was assigned to these companies. So, the investor who deliberately constrains his opportunity set to include mid-cap equities should be aware that the Russell Midcap Index has a size tilt relative to an equal-weight portfolio. 12 As a result, the performance of large-cap companies has a pronounced impact on the returns of this index.
Like all other market cap-weighted benchmarks, the Russell Midcap Index overweights high-momentum securities. Because the weight of an individual company depends on its market capitalization, those stocks with recent outperformance have grown in weight relative to recent underperformers. Moreover, Hsu (2006) and Treynor (2005) show that overvalued stocks have higher weights in a cap-weighted index than would be warranted by their (unobservable) fair values. Arnott et al (2005) also notes that, relative to a portfolio of companies weighted by fundamental measures, such as sales, book value or cash flow, cap-weighted indexes have a tilt toward high-multiple stocks with strong perceived growth opportunities. These characteristics coerce cap-weighted indexes into assuming growth and momentum characteristics, which can lead to underperformance of contrarian or value-based strategies in certain market environments.
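The mechanical link between recent performance and index weight can be shown with a two-stock sketch. The starting weights and returns are hypothetical, and the function name `drifted_weights` is ours.

```python
def drifted_weights(weights, returns):
    """Cap-weighted index weights after one period of price returns.

    With no rebalancing and no change in share counts, each weight grows
    with its stock's return and is then rescaled so the weights sum to 1.0.
    """
    grown = {name: w * (1.0 + returns[name]) for name, w in weights.items()}
    total = sum(grown.values())
    return {name: value / total for name, value in grown.items()}


# Hypothetical starting weights and one-period returns.
start = {"winner": 0.50, "loser": 0.50}
period_returns = {"winner": 0.30, "loser": -0.10}

end = drifted_weights(start, period_returns)
print({name: round(w, 3) for name, w in end.items()})
```

The recent outperformer ends the period with the larger weight without any decision by the index provider, which is the sense in which a cap-weighted benchmark leans toward momentum.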
Russell's definitions of growth and value have implications for portfolio management. Finance theory suggests that there is a reason why one company has a higher-B/P ratio than another, and it relates to the relative return on equity (ROE) of each company. Ranking companies based on B/P alone tends to favor businesses with low ROEs. This is because the B/P metric is inversely related to ROE, as shown in the following derivation (Stowe et al, 2002).
Assume a stock is fairly valued by the single-stage Gordon Growth Model, 14 written in terms of next year's earnings:

$$P_0 = \frac{D_1}{r_e - g} = \frac{E_1 \times PR}{r_e - g}$$

Then, divide by current book value and simplify to achieve

$$\frac{P_0}{B_0} = \frac{ROE \times PR}{r_e - g}$$

In the above,
$P_0$ = Current value of equity
$D_1$ = Dividend at t = 1
$E_1$ = Earnings at t = 1
$PR$ = Dividend payout ratio
$B_0$ = Book value of equity at t = 0
$ROE$ = Return on equity at t = 1, or $E_1/B_0$
$r_e$ = Cost of equity
$g$ = Organic growth rate in earnings and dividends.
Since B/P and ROE are inversely related, high B/P companies tend to be low-ROE businesses. 15 Some of these businesses have ROEs that are depressed because of historical acquisitions or cyclical factors, but many are structurally constrained because of high capital intensity (for example, banks and REITs) or regulation (for example, utilities).
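The inverse relationship can be seen numerically in a short sketch. It assumes the price-to-book relation implied by the Gordon Growth Model with the definitions above, P0/B0 = ROE × PR / (r_e − g), and holds the payout ratio, cost of equity and growth rate fixed at hypothetical values while ROE varies. In practice these inputs move together, so this is an illustration of direction rather than a valuation model.

```python
def book_to_price(roe, payout_ratio, cost_of_equity, growth):
    """Book-to-price ratio implied by P0/B0 = ROE * PR / (r_e - g)."""
    if cost_of_equity <= growth:
        raise ValueError("cost of equity must exceed the growth rate")
    if roe <= 0 or payout_ratio <= 0:
        raise ValueError("ROE and payout ratio must be positive")
    return (cost_of_equity - growth) / (roe * payout_ratio)


# Hypothetical inputs: 50% payout, 9% cost of equity, 3% growth.
for roe in (0.06, 0.12, 0.24):
    bp = book_to_price(roe, payout_ratio=0.5, cost_of_equity=0.09, growth=0.03)
    print(f"ROE {roe:.0%}: B/P = {bp:.2f}")
```

Under these assumptions each doubling of ROE halves the B/P ratio, so a ranking on B/P alone places the low-ROE business at the top of the value scale.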
Hence, value as defined by Russell may not reflect cheapness. It may be a function of high capital intensity, regulation and/or fierce competition, which are often indications of low business quality. Perhaps some value-oriented managers specifically seek out these low-return businesses. If so, the Russell value benchmarks are appropriate for them. However, the value investing narrative that we hear more frequently involves (i) identifying quality businesses with high returns on capital and (ii) paying a reasonable price relative to the expected earnings or cash flows such businesses will produce over time.
'The number one idea is to view a stock as an ownership of the business and to judge the staying quality of the business in terms of its competitive advantage. Look for more value in terms of discounted future cash-flow than you are paying for.'
(Charlie Munger, iconic value investor and vice-chairman of Berkshire Hathaway Corporation)
The Russell determination of value does not directly consider price relative to cash flow or earnings. While earnings are an input into the calculation (see equation above), consider the case of a capital-light business that produces a 40 per cent ROE and trades in the market for 10 times earnings. Is this not a value? According to Russell's distinction between value and growth, it likely is not. 16 Clearly, the manager who subscribes to the Buffett-Munger definition of value could have a vastly different portfolio construction than what is captured in the benchmark.
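The arithmetic behind this example is worth making explicit. Assuming that the 10 times multiple applies to the same earnings used in the ROE (that is, $E_1$), the price-to-book ratio is the product of the earnings multiple and ROE:

$$\frac{P_0}{B_0} = \frac{P_0}{E_1} \times \frac{E_1}{B_0} = 10 \times 0.40 = 4.0, \qquad \frac{B_0}{P_0} = \frac{1}{4.0} = 0.25$$

A B/P ratio of 0.25 means that the company's book value is only a quarter of its market value. Where that places the company on a value scale depends on the B/P ratios of the other constituents, but a stock priced at four times book is unlikely to rank as value on a B/P sort, despite its modest earnings multiple.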
If one were to take a cross-section of companies in the same industry, such as commercial banking, and compare them on their B/P ratios, one might reasonably draw a conclusion about their relative values. After all, they face the same regulations and have similar capital structures. So, ranking companies based on B/P ratios might be fair for companies within an industry, but the Russell methodology ranks companies across industries. 17 As a result, all companies with high B/P ratios, regardless of industry, rank highly on Russell's value scale. This approach tends to apply the value tag to entire industries with high B/P ratios, such as REITs, utilities and banks. An emphasis on these specific industries is indeed evident in the Russell value indexes, as shown in Figure 1. This phenomenon is persistent over time, as well. For instance, the Russell Midcap Value Index consistently contains a large weighting in REITs, utilities and banks, as shown in Figure 2.
Industry concentrations can make a benchmark vulnerable to macroeconomic factors and clientele effects. 18 The large weightings of REITs and utilities in the Russell Midcap Value Index provide a pertinent example. The above-average dividend yields of REITs and utilities have historically made them desirable investment vehicles for yield-oriented investors. As shown in Figure 3, this clientele effect has recently intensified. With interest rates hovering near all-time lows, the correlations of REIT and utility stock returns with those of US Treasury bonds have increased to extreme levels. Hence, it comes as no surprise that, against a backdrop of quantitative easing and generally low interest rates, REITs and utilities have performed well. This has boosted the Russell Midcap Value Index materially and led to significant underperformance among mid-cap value managers. According to DeSanctis and Wang (2015), mid-cap value managers were 840 basis points underweight REITs and 580 basis points underweight utilities in 2014. The strong performance of these two sectors in 2014 caused the returns of actively managed funds to trail their benchmarks by a wide margin. However, it is possible to imagine an environment in which the attributes of REITs and utilities are less desirable. With more than a quarter of the Russell Midcap Value Index exposed to these two areas, any increase in interest rates or dispersal of the yield-hungry investor clientele could have a meaningful, negative impact on its performance.
A further implication of the Russell approach to constructing its stylized benchmarks is that, ceteris paribus, the growth indexes overweight expensive stocks, and the value indexes overweight low-growth businesses. This is best illustrated with an example. Consider two businesses, A and B.
Both have weightings of 1 per cent in the primary index, and both rank highly on the growth scale. Company B ranks highly on the value scale, while A ranks near the bottom. As shown in Figure 4, despite having similar growth attributes, B receives a lower weighting in the growth index than A. This feature of the Russell methodology serves to amplify the signal associated with each stylized benchmark. However, it is in stark contrast to the likely approach of many portfolio managers. While a manager might prefer the less-expensive stocks from among a collection of fast-growing companies, the Russell growth methodology exhibits a preference for the expensive companies. By the same token, the manager might prefer cheap stocks that also have favorable growth prospects, but the Russell value methodology favors those that lack growth. Intuition suggests that, in the long run, this manager could outperform his stylized benchmark. However, in the short run, he could suffer bouts of underperformance if the most expensive growth companies or slowest growing value companies produce superior returns.
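A rough worked version of the A and B example follows. It rests on two assumptions that are ours rather than Russell's: the growth index holds half of the primary index's capitalization, and company B's capitalization is split equally between the two styles. Let $w$ be a company's primary-index weight and $p_V$ the fraction of its capitalization assigned to value. Its growth-index weight is then

$$w_G = \frac{w\,(1 - p_V)}{0.5} = 2\,w\,(1 - p_V)$$

For company A, with $w = 1\%$ and $p_V = 0$, this gives $w_G = 2\%$. For company B, with $w = 1\%$ and an assumed $p_V = 0.5$, it gives $w_G = 1\%$. Under these assumptions the cheaper of two equally fast-growing companies receives half the growth-index weight of the more expensive one.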

CONCLUSIONS
In the case of a mid-cap value manager who is benchmarked against the Russell Midcap Value Index, there are numerous construction issues that a client needs to consider when evaluating performance relative to the benchmark. These include tilts in favor of size, momentum, and low-ROE businesses, vulnerability to industry concentration and clientele effects, and a preference to avoid growth, even when it is cheaply priced. Any of these issues could account for deviations in manager performance that are not reflective of skill. Only a client who truly understands these benchmark construction issues is able to make a reasonable inference about that manager's performance over a given time period.
Understanding benchmark construction might be even more important for passive investors. Again, consider the Russell Midcap Value Index. According to Morningstar, there is more than $40 billion in client money invested in passive investment vehicles (exchange-traded funds or mutual funds) that are attempting to replicate the performance of the Russell Midcap Value Index. These investors have fared remarkably well for the last 3 years, outperforming more than 80 per cent of actively managed funds in the mid-cap value category, as shown in Figure 5. However, if these investors have not studied their benchmark carefully, they might be surprised to learn that they are 27 per cent invested in bond proxies, 19 heavily skewed toward low-ROE businesses and systematically underweight in companies that show signs of growth. A rational decision for these investors might be to take profits and select a manager who deviates from the benchmark (a.k.a., an active manager). Benchmarks are pervasive in today's investing environment, but it is important to remember that there is nothing inherently bad or good about a benchmark. While benchmarks can be helpful tools for estimating opportunity cost and evaluating manager skill, they are only beneficial if they are well-understood. Inappropriate application of benchmarks can lead to incorrect evaluations and improper capital allocation decisions.
The Russell indexes are not flawed just because they exhibit the tilts mentioned in this article. All benchmarks are predisposed to certain factors. That means even passive investment strategies involve active decisions. Those decisions are simply made by benchmark providers rather than by fund managers. Whether investors choose active or passive investment strategies, they must recognize what tilts are present and account for these when using the benchmark to allocate capital or evaluate manager performance.
Most active managers underperformed their benchmarks in 2014. Judging whether they performed poorly or not requires an understanding of how a benchmark is constructed and how its implicit exposures relate to the managers' opportunity sets and constraints. In the case of active mid-cap value managers, a small number of factors had an outsized influence on the benchmark's performance. Should those factors reverse in the future, actively managed funds could outperform. Benchmark returns have had a good run, but fund investors would be wise to remember that it is always darkest just before the dawn.
1. Statistics such as this are not overly surprising, as Sharpe (1991) demonstrates that active strategies, in aggregate, must underperform the benchmark by their level of fees. However, because summary statistics based on mutual funds do not fully represent the performance of all active strategies and are not dollar weighted, they are not definitive evidence of underperformance by all active managers.