How to Benchmark Commodity Performance (And Why It's Tricky)
Book: Commodities: Markets, Performance, and Strategies
Editors: H. Kent Baker, Greg Filbeck, Jeffrey H. Harris
Publisher: Oxford University Press, 2018
ISBN: 9780190656010
The Problem Nobody Talks About
Chapter 17, written by Aaron Filbeck, addresses something that most commodity investors skip right over: how do you actually measure commodity performance? With stocks, you pick the S&P 500 or the Russell 2000, compare your returns against it, and move on. With commodities, it is nowhere near that simple.
The core issue is that commodity indexes are not true benchmarks in the way equity indexes are. They are strategies dressed up as benchmarks. And the choice of which index you compare against can completely change whether a manager looks good or terrible.
Three Generations of Commodity Indexes
Filbeck organizes commodity indexes into three generations, each progressively more active in how it handles the futures curve.
First Generation: Simple and Exposed
First-generation indexes are long-only and hold front-month futures contracts. They just buy the nearest expiring contract and roll to the next one as expiration approaches. No fancy optimization. No attempt to manage the effects of contango or backwardation.
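To make the roll concrete, here is a minimal Python sketch of what happens when a front-month position is sold and the next contract is bought. The function name `roll_yield` and the prices are illustrative assumptions, not the methodology of any actual index, and the calculation deliberately ignores any change in the spot price.

```python
def roll_yield(near_price: float, far_price: float) -> float:
    """Approximate return impact of selling the expiring contract and
    buying the next one, holding the spot price constant.

    Negative in contango (far > near), positive in backwardation (far < near).
    """
    return (near_price - far_price) / near_price


# Hypothetical prices, chosen only for illustration.
contango = roll_yield(near_price=50.0, far_price=51.0)       # next contract costs more
backwardation = roll_yield(near_price=50.0, far_price=49.0)  # next contract costs less

print(f"contango roll yield:      {contango:+.2%}")
print(f"backwardation roll yield: {backwardation:+.2%}")
```

A first-generation index takes whatever this number turns out to be each month. The later generations are, in different ways, attempts to choose it.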
The two best-known first-generation indexes are the S&P GSCI and the Bloomberg Commodity Index (BCOM, formerly the Dow Jones-UBS Commodity Index).
The S&P GSCI originated in 1991 and weights its 24 commodity futures contracts by global production. This sounds neutral, but it means energy commodities make up about 56 percent of the index, with WTI crude oil holding the largest weight. Seven of the 24 commodities have weights below 1 percent each. So when people say “the GSCI was up 10 percent,” what they mostly mean is that energy, led by oil, was up.
The BCOM takes a different approach. It limits individual commodities to 15 percent and sectors to 33 percent. It uses liquidity and production weights, which gives it a more balanced feel. The BCOM has historically performed best when agriculture and metals outperform energy.
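A cap like the 15 percent limit can be sketched in a few lines of Python. This is a simplified illustration, assuming strictly positive raw weights and a single per-commodity cap. The function `cap_weights`, the commodity names, and the raw weights are all hypothetical, and the real BCOM procedure (which also applies the 33 percent sector limit) is more involved.

```python
def cap_weights(raw: dict[str, float], cap: float) -> dict[str, float]:
    """Normalize positive raw weights to sum to 1, then cap each one at `cap`,
    handing the excess to the uncapped names in proportion to their weights."""
    if cap * len(raw) < 1.0:
        raise ValueError("cap is too tight for this many constituents")
    total = sum(raw.values())
    weights = {name: value / total for name, value in raw.items()}
    capped: set[str] = set()
    while True:
        # Names that exceed the cap after the latest redistribution.
        over = {n for n, w in weights.items() if n not in capped and w > cap + 1e-12}
        if not over:
            return weights
        capped |= over
        free_total = sum(w for n, w in weights.items() if n not in capped)
        remaining = 1.0 - cap * len(capped)
        for name in weights:
            if name in capped:
                weights[name] = cap
            elif free_total > 0:
                weights[name] *= remaining / free_total


# Hypothetical production-style weights, not actual index data.
raw_weights = {
    "crude_oil": 56, "natural_gas": 8, "gold": 8, "copper": 8,
    "corn": 6, "wheat": 5, "soybeans": 5, "cattle": 4,
}
capped_weights = cap_weights(raw_weights, cap=0.15)
for name, weight in capped_weights.items():
    print(f"{name:>12}: {weight:.1%}")
```

Note that redistributing the excess can push other names over the cap, which is why the loop repeats until nothing exceeds it.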
Same asset class. Same time period. Different answers about performance. That is the fundamental problem.
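A toy calculation shows how much the weighting scheme alone can move the answer. Everything here is hypothetical: the sector returns, the two weight sets (one energy-heavy, one with sectors held near a third each), and the manager's 6 percent return are made-up numbers, not data from either index.

```python
def index_return(weights: dict[str, float], sector_returns: dict[str, float]) -> float:
    """Weighted average of sector returns; weights are expected to sum to 1."""
    return sum(weights[sector] * sector_returns[sector] for sector in weights)


# All numbers below are hypothetical.
sector_returns = {"energy": 0.20, "agriculture": -0.05, "metals": -0.05}
energy_heavy = {"energy": 0.56, "agriculture": 0.22, "metals": 0.22}
sector_capped = {"energy": 0.33, "agriculture": 0.34, "metals": 0.33}
manager_return = 0.06

for label, weights in (("energy-heavy", energy_heavy), ("sector-capped", sector_capped)):
    bench = index_return(weights, sector_returns)
    excess = manager_return - bench
    print(f"{label:>13}: benchmark {bench:+.2%}, manager excess {excess:+.2%}")
```

With these inputs the same manager trails one benchmark by 3 percentage points and beats the other by 2.75, without changing a single trade.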
Second Generation: Trying to Be Smarter
Second-generation indexes are still long-only but take a more active approach to rolling contracts. The idea is to reduce the damage from rolling in contangoed markets.
The UBS Bloomberg CMCI, launched in 2007, uses a daily rolling mechanism and spreads positions across multiple maturities on the futures curve. Instead of rolling the entire position in a few days each month (which first-generation indexes do), CMCI rolls small portions daily. This “tenor diversification” reduces the concentrated negative impact of rolling in steep contango.
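The difference between a concentrated roll and a daily roll is easy to see as a schedule. The sketch below assumes a 21-business-day month and a five-day roll window starting on the fifth business day. Both are illustrative assumptions, and `roll_fractions` is a hypothetical helper, not part of any index methodology.

```python
def roll_fractions(days_in_month: int, window_start: int, window_length: int) -> list[float]:
    """Fraction of the position rolled on each business day (0-indexed),
    spreading the roll evenly across the window."""
    if window_start < 0 or window_start + window_length > days_in_month:
        raise ValueError("roll window does not fit inside the month")
    fractions = [0.0] * days_in_month
    for day in range(window_start, window_start + window_length):
        fractions[day] = 1.0 / window_length
    return fractions


# Assumed schedules: a five-day window versus rolling a little every day.
concentrated = roll_fractions(21, window_start=4, window_length=5)
daily = roll_fractions(21, window_start=0, window_length=21)

print(f"largest single-day roll, concentrated: {max(concentrated):.1%}")
print(f"largest single-day roll, daily:        {max(daily):.1%}")
```

The smaller the slice rolled on any one day, the less the index depends on the shape of the curve during one particular week.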
The Merrill Lynch Commodity Index (MLCX) holds contracts that expire two months out and rolls to the third month, giving it about one to six weeks of extra maturity compared to first-generation indexes.
The Deutsche Bank Liquid Commodity Index uses a quantitative methodology to pick contracts with the smallest negative returns in contango or the biggest positive returns in backwardation.
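One way to picture that kind of selection rule is to compute an annualized implied roll yield for each contract on the curve and pick the highest. The sketch below is an assumption about how such a rule could look, not the index provider's actual formula. The function `best_contract` and both price curves are hypothetical.

```python
def best_contract(curve: list[tuple[int, float]]) -> tuple[int, float]:
    """Pick the deferred contract with the highest annualized implied roll yield.

    `curve` holds (months_to_expiry, price) pairs sorted by expiry, with the
    front contract first. Returns (months_to_expiry, implied_yield).
    """
    if len(curve) < 2:
        raise ValueError("need the front contract plus at least one candidate")
    front_months, front_price = curve[0]
    best: tuple[int, float] | None = None
    for months, price in curve[1:]:
        years = (months - front_months) / 12.0
        # Yield earned if the deferred price converged to today's front price.
        implied = (front_price / price) ** (1.0 / years) - 1.0
        if best is None or implied > best[1]:
            best = (months, implied)
    return best


# Hypothetical curves: prices rise with maturity in contango, fall in backwardation.
contango_curve = [(1, 50.0), (2, 51.0), (6, 52.0), (12, 53.0)]
backwardated_curve = [(1, 50.0), (2, 49.0), (6, 48.5), (12, 48.0)]

print(best_contract(contango_curve))
print(best_contract(backwardated_curve))
```

On these made-up curves the rule reaches far out the curve in contango, where the annualized drag is smallest, and stays close to the front in backwardation, where the annualized gain is largest.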
Research by Skapa (2013) found that second-generation indexes have historically produced higher returns than first-generation ones while maintaining similar risk levels.
Third Generation: Long and Short
Third-generation indexes go further by taking both long and short positions. They go long in backwardated markets and short in contangoed ones. This lets them profit from falling prices in some commodities while still capturing upside in others.
The SummerHaven Dynamic Commodity Index (SDCI) selects 14 commodities from a universe of 27 each month. Seven are chosen for having the most backwardation. The other seven are selected based on strongest price momentum over the prior 12 months. For each commodity, the index picks the individual contract with the largest backwardation.
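The two-stage selection described above can be sketched as follows. This is a reading of the summary, not the published index rules: it assumes the momentum picks come from the commodities left over after the backwardation picks, and the function `select_constituents`, the commodity labels, and the signal values are all hypothetical.

```python
def select_constituents(
    signals: dict[str, tuple[float, float]],
    n_backwardation: int = 7,
    n_momentum: int = 7,
) -> tuple[list[str], list[str]]:
    """Two-stage pick from {name: (backwardation, twelve_month_momentum)}.

    Stage 1 takes the most backwardated names. Stage 2 takes the strongest
    momentum names among those not already chosen (an assumption).
    """
    by_backwardation = sorted(signals, key=lambda name: signals[name][0], reverse=True)
    first = by_backwardation[:n_backwardation]
    remaining = [name for name in signals if name not in first]
    by_momentum = sorted(remaining, key=lambda name: signals[name][1], reverse=True)
    second = by_momentum[:n_momentum]
    return first, second


# Synthetic universe of 27 placeholder commodities with made-up signals.
universe = {
    f"C{i:02d}": (((i * 7) % 27 - 13) / 100, ((i * 11) % 27 - 13) / 100)
    for i in range(1, 28)
}
backwardation_picks, momentum_picks = select_constituents(universe)
print("backwardation:", backwardation_picks)
print("momentum:     ", momentum_picks)
```

Whatever the exact rules, the point stands: a procedure like this is a trading strategy that gets re-run every month, not a passive slice of a market.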
The UBS Bloomberg CMCI Active Index layers an active management overlay on top of the second-generation CMCI methodology. A research team of analysts makes strategic and tactical decisions about allocation, components, tenors, and sectors.
Why the CAPM Does Not Apply
Here is something important that Filbeck points out. The capital asset pricing model (CAPM), the standard textbook model for a stock’s expected return, does not work for commodities.
The CAPM says a stock’s expected return comes from the risk-free rate plus a risk premium based on the stock’s beta. But commodities do not raise capital. They do not distribute cash flows. They get consumed or transformed. Their value is entirely driven by supply and demand, not by discounted future cash flows.
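For reference, the CAPM relation described here fits in a one-line function. The inputs in the example are made-up numbers, and the function name is just a label for this sketch.

```python
def capm_expected_return(risk_free: float, beta: float, market_return: float) -> float:
    """Expected return = risk-free rate + beta * (market return - risk-free rate)."""
    return risk_free + beta * (market_return - risk_free)


# Hypothetical inputs: 2% risk-free rate, beta of 1.2, 8% expected market return.
example = capm_expected_return(risk_free=0.02, beta=1.2, market_return=0.08)
print(f"{example:.1%}")  # 9.2%
```

Every input assumes a claim on future cash flows that the market prices relative to other such claims, which is the piece a barrel of oil or a bushel of wheat does not have.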
Erb and Harvey (2006) found that an equally weighted, non-rebalanced basket of commodities had an excess return over the risk-free rate of basically zero. Traditional valuation does not explain commodity returns. You need a different framework.
The Financialization Debate
Between 2003 and 2008, passive commodity index fund assets grew from about $13 billion to $200 billion. That is a massive amount of money flowing into a relatively small market.
Michael Masters testified before the U.S. Senate in 2008 that institutional investor speculation was driving commodity price inflation. He called institutional investors “index speculators” and argued that their buying of entire commodity indexes was creating a demand shock that drove up prices regardless of actual supply and demand fundamentals.
Irwin and Sanders (2012) tested this “Masters hypothesis” and found no evidence linking commodity index performance to bubble-like characteristics. They argued that Masters’ methodology had too many errors and omissions to be reliable. Instead, the movements from peak to trough looked more like a normal commodity cycle.
Hamilton (2008) reached a similar conclusion by a different route. He argued that price elasticity creates a natural limit on how much commodity prices can rise in the long term. If crude oil prices go up too much, production increases and consumption decreases, which pushes prices back down.
But Zaremba (2016) found that highly financialized markets produced lower excess returns, lower standard deviations, and more negative skewness. The growing presence of noncommercial traders in commodity markets may have changed the return profile of these markets.
Correlations That Break When You Need Them
Filbeck highlights a finding from Buyuksahin, Haigh, and Robe (2010) that should concern anyone counting on commodities for diversification. They studied correlations between commodity and equity indexes from 1991 to 2008. For most of that period, correlations were low to negative, just as advertised.
But when they extended their sample to include the fall of 2008, correlations between commodity and equity indexes rose dramatically. In the worst of the financial crisis, commodities fell alongside stocks. The diversification benefit disappeared exactly when investors needed it most.
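The way to check this kind of claim is to compute the correlation separately for each sub-period rather than once for the whole sample. The sketch below uses a plain Pearson correlation on two short synthetic return series. The numbers are invented to illustrate the mechanics and are not the data from the study.

```python
from math import sqrt


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation of two equal-length return series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length of at least 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Synthetic returns, invented for illustration only.
equity_calm = [0.01, -0.01, 0.02, -0.02, 0.01, -0.01]
commodity_calm = [-0.01, 0.01, 0.01, -0.01, -0.02, 0.02]
equity_crisis = [-0.05, -0.08, 0.02, -0.10, -0.03, 0.04]
commodity_crisis = [-0.04, -0.09, 0.01, -0.12, -0.02, 0.03]

calm_corr = pearson(equity_calm, commodity_calm)
crisis_corr = pearson(equity_crisis, commodity_crisis)
print(f"calm period correlation:   {calm_corr:+.2f}")
print(f"crisis period correlation: {crisis_corr:+.2f}")
```

A single full-sample number would blend the two regimes together and hide exactly the behavior an investor cares about.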
Kawamoto, Morishita, and Higashi (2011) found that different factors drove commodity prices at different times. Between 2007 and 2008, investor flows from other assets into commodities (a “flight to simplicity”) drove the commodity boom. Between 2009 and 2011, accommodative monetary policy and emerging market demand drove prices.
Are Commodity Indexes Even Passive?
Filbeck saves the most provocative argument for last. Erb and Harvey (2006) assert that no true passive index exists in commodity markets. They describe current benchmarks as strategies rather than true passive benchmarks.
The reasoning: in futures markets, every long position has a corresponding short position, so the net market capitalization is effectively zero. Equity indexes can weight by market cap. Commodity indexes cannot, so they fall back on alternative weighting methods such as worldwide production or liquidity. These are arbitrary choices that embed active bets.
Atwill and Liebel (2016) expand on this, stating that commodity index providers are “not attempting to provide a ‘slice of the market,’ as most mainstream equity indexes do, but instead are simply very explicit commodity trading strategies based on different arbitrary metrics.”
This matters because if your “benchmark” is actually a strategy, then benchmarking against it does not tell you what you think it tells you.
My Takeaway
This chapter is a reality check. If someone shows you commodity index performance and says “this is what commodities did,” your first question should be “which index?” and your second should be “what is in it?”
The GSCI, BCOM, and CRB (another long-running commodity index) all produced different returns and volatilities over the same time period. The choice of generation (first, second, or third) matters enormously. And the philosophical question of whether any commodity index is truly passive is one that most investors never even consider.
For anyone evaluating commodity managers or deciding on a commodity allocation, this chapter’s message is clear: understand the benchmark before you use it. Otherwise, you are measuring against a yardstick that changes length depending on who built it.