Error-Adjusted Momentum

In the last post I introduced a method to normalize returns using the VIX to improve upon a standard momentum or trend-following strategy. There are many possible extensions of this idea, and I would encourage readers to look at one of the comments in the previous post, which may inspire some new ideas. The motivation for the method in this post is to provide an alternative that is more broadly applicable to other assets than a VIX-based strategy (which is more appropriate for equities). Instead of the VIX, this method adjusts returns using the standard error of the mean as a proxy for market noise. The logic is that returns should be weighted more when predictability is high, and weighted less when predictability is low. In this case, the error-adjusted moving average will hopefully be more robust to market noise than a standard moving average. To calculate the standard error, I used the 10-day average return to generate a forecast, and then took the 10-day mean absolute error of the forecast. To normalize returns, I divide each return by this standard error estimate prior to taking a 200-day average of the re-scaled returns. The rules for the ER-MOM strategy are the same as in the last post (where they were poorly articulated):

Go LONG when the Error-Adjusted Momentum is > 0, Go to CASH if the Error-Adjusted Momentum is < 0
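The calculation described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the exact code behind the results: the function name and parameters are hypothetical, the forecast for each day is assumed to be the mean of the previous 10 returns, and the error estimate is assumed to be the mean absolute forecast error over the 10 days ending on that day.

```python
from statistics import mean


def error_adjusted_momentum(returns, err_window=10, mom_window=200):
    """Latest error-adjusted momentum value, or None if there is too little data.

    Assumptions (not confirmed by the post): the forecast for day t is the
    mean of the previous `err_window` returns, and the error estimate for
    day t is the mean absolute forecast error over the `err_window` days
    ending at t.
    """
    n = len(returns)

    # Absolute forecast errors, defined from index err_window onward.
    abs_err = [None] * n
    for t in range(err_window, n):
        forecast = mean(returns[t - err_window:t])
        abs_err[t] = abs(returns[t] - forecast)

    # Re-scaled returns, defined once a full window of errors exists.
    scaled = []
    for t in range(2 * err_window - 1, n):
        mae = mean(abs_err[t - err_window + 1:t + 1])
        if mae > 0:  # skip days where the error estimate is degenerate
            scaled.append(returns[t] / mae)

    if len(scaled) < mom_window:
        return None
    return mean(scaled[-mom_window:])


def position(momentum):
    """LONG when the momentum is above zero, otherwise CASH."""
    return "LONG" if momentum is not None and momentum > 0 else "CASH"
```

Treating a momentum of exactly zero (or missing data) as CASH is a choice made for this sketch; the rule above only covers the > 0 and < 0 cases.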

Here is how this strategy compares to both the VIX-adjusted strategy and the other two baseline strategies:

er mom table

er mom chart

The error-adjusted momentum strategy has the best returns and risk-adjusted returns, edging out the previous method that used the VIX. Both adjusted momentum strategies performed better than their standard counterparts. One point to note is that the edge of the adjusted momentum strategies tends to be more significant at longer trend-following lookbacks. This makes sense because a long stretch of time is likely to contain a wider range of variance regimes than a shorter lookback. Adjusting for these different variance regimes gives a clearer picture of the long-term trend. The historical standard deviation is also a viable alternative to the standard error or the VIX, and there are many other ways to measure variability/noise that can be used as well.

VIX-Adjusted Momentum

The addition of many small details can make a big difference in seemingly simple strategies. I often like to use cooking analogies, and so I like to think of tomato sauce as a classic example: it contains few ingredients and is simple to make but difficult to master without understanding the interaction between components. Trend-following strategies are no different: anyone can create a simple strategy, but few can master the nuances. One of the problems in measuring trends in financial market data is that the variance is not constant. In statistics we know that heteroscedasticity can render the use of traditional regression analysis meaningless. Therefore, using unadjusted price data in conjunction with a moving average strategy, or even taking the simple compound return or ROC (rate of change), can lead to poor timing decisions and increase the frequency of trading.

The good news is that it is well accepted that volatility is highly predictable in financial markets. Perhaps one of the best measures of volatility is the implied volatility reflected by market participants in the VIX. A simple idea would be to use the VIX to adjust daily returns in order to create a trend-following strategy that is more robust to non-constant variance. The method is very simple:

1) compute daily returns or log returns for the S&P500 time series
2) divide each daily return by the VIX level on the same day
3) take a lag of your choosing and compute the simple average–say 200-days in this example

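Strategy: Go LONG when the VIX-Adjusted Momentum>0, Go to cash if the VIX-Adjusted Momentum<0. Below, this strategy is compared with a 200-day sma strategy (go long when the price is above the sma, cash if not) and a 200-day traditional momentum strategy (go long when the ROC>0, cash if not).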
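The three steps and the trading rule above can be sketched as follows. This is a minimal illustration rather than the code used for the results: the function names are hypothetical, `closes` and `vix` are assumed to be day-aligned lists of S&P500 closing prices and VIX levels, and log returns are used for step 1.

```python
import math
from statistics import mean


def vix_adjusted_momentum(closes, vix, lookback=200):
    """Average of the last `lookback` daily log returns, each divided by that day's VIX."""
    if len(closes) != len(vix):
        raise ValueError("closes and vix must be aligned day by day")
    scaled = [
        math.log(closes[t] / closes[t - 1]) / vix[t]  # steps 1 and 2
        for t in range(1, len(closes))
    ]
    if len(scaled) < lookback:
        return None  # not enough history yet
    return mean(scaled[-lookback:])  # step 3


def position(momentum):
    """LONG when the VIX-adjusted momentum is above zero, otherwise CASH."""
    return "LONG" if momentum is not None and momentum > 0 else "CASH"
```

Days with a high VIX contribute less to the average than days with a low VIX, which is the whole point of the adjustment. Staying in cash when there is not yet enough history is an assumption of this sketch.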

vix mom table

Here is a graph comparing the strategies:

vix mom graph

Clearly the VIX-adjusted momentum is superior to the traditional trend-following strategies using this set of parameters. This concept can be extended in several different ways: for example, one could instead use historical volatility, or use the difference between historical and implied volatility in other creative ways. Hopefully readers will be inspired to take a fresh look at improving upon a simple and traditional strategy.

Part 2: What Factors Drive the Performance of Momentum Strategies?

mean dispersion
In part 1 of the series we introduced a three-factor model that decomposes momentum profitability and how that can be translated into a momentum score for an asset universe. In this post we will show how momentum strategies can be profitable even under the conditions where the market is efficient and time series performance is not predictable.

The momentum score we introduced in the last post has three components: 1) time series predictability (T), 2) dispersion in mean returns (D) and 3) the existence of lead/lag relationships (L). The score is computed by adding T and D and subtracting L. More formally, T is the average auto-covariance across the asset time series, D is the variance in cross-sectional mean returns and L is the average cross-serial auto-covariance between asset returns.
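To make the three components concrete, here is a rough sketch of how such a score could be computed from a set of return series. The details are assumptions on my part rather than the exact calculation behind the tables: a lag of one period, simple equal-weighted averages for T and L (the scaling constants in Lo and MacKinlay's original decomposition are ignored), and hypothetical function names.

```python
from statistics import mean, pvariance


def lagged_cov(x, y, lag=1):
    """Covariance between x at time t-lag and y at time t."""
    a, b = x[:-lag], y[lag:]
    mean_a, mean_b = mean(a), mean(b)
    return mean((p - mean_a) * (q - mean_b) for p, q in zip(a, b))


def momentum_score(returns):
    """returns: dict of ticker -> list of returns (equal lengths, two or more assets)."""
    names = list(returns)
    # T: average auto-covariance of each asset with its own lagged returns
    t = mean(lagged_cov(returns[n], returns[n]) for n in names)
    # D: cross-sectional variance of the assets' mean returns
    d = pvariance([mean(returns[n]) for n in names])
    # L: average cross-serial covariance (asset i lagged vs. asset j, i != j)
    lead_lag = mean(
        lagged_cov(returns[i], returns[j])
        for i in names for j in names if i != j
    )
    return {"T": t, "D": d, "L": lead_lag, "score": t + d - lead_lag}
```

Passing a dictionary with only two tickers gives a pairwise score, which is one way a pairwise matrix could be filled in under the same assumptions.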

One of the core predictions of a truly efficient market is that asset prices should follow a random walk and hence should not be predictable using past prices (or any form of technical analysis). In this context, the next period's price equals the current price plus a random draw that has a mean and an error term. Whether this theory is in fact true based upon the empirical evidence is not a subject that I will address in this article. Instead, what I personally found more interesting was to determine whether an efficient market would still permit a momentum strategy to be successful. The answer boils down to the formula that decomposes momentum profitability:

T+D-L= Momentum Score

In more formal technical terms, the equation breaks down to:

Momentum Profitability= average asset auto-covariance (T) + cross-sectional variance in asset means (D) – average asset cross-serial auto-covariance (L)

Returning to the concept of a random walk, this would imply that both the auto-correlations and the cross-serial auto-correlations are equal to zero (or close to zero). In that case the formula reduces to:

Momentum Profitability= cross-sectional variance in asset means (D)

Thus, even in the case of a true random walk or an efficient market, we can expect profits from a momentum strategy as long as there is dispersion in the asset means. In other words, the asset means must be heterogeneous to some degree to capture momentum profits. Technically, another requirement is that the asset means are fairly stationary: returns can fluctuate over time, but their means stay approximately the same. From a practical perspective, many risk premiums are fairly stable over long periods of time (for example, the return to investing in the stock market). Hence the existence of variation in asset mean returns alone can support the existence of momentum profits even if the market were considered to be efficient. This helps reconcile how Eugene Fama, the father of the Efficient Markets Hypothesis, can still claim that momentum is the “premier anomaly” and not technically be a hypocrite (even though it sounds that way to many industry practitioners).
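A small simulation can illustrate the point. The sketch below draws independent returns for three hypothetical assets (no auto-correlation and no lead/lag effects by construction) whose only difference is their mean return, and then measures the average profit of a simple relative-strength portfolio. The weighting scheme, in which each asset is weighted by last period's return minus the universe average, follows the style of Lo and MacKinlay; the means, volatility, number of periods and seed are arbitrary choices of mine.

```python
import random
from statistics import pvariance


def simulate_momentum_profit(means, vol=0.01, periods=50_000, seed=42):
    """Average per-period profit of a relative-strength portfolio on i.i.d. returns."""
    rng = random.Random(seed)
    n = len(means)
    prev = [rng.gauss(m, vol) for m in means]
    total = 0.0
    for _ in range(periods):
        curr = [rng.gauss(m, vol) for m in means]
        avg_prev = sum(prev) / n
        # Weight each asset by last period's return relative to the universe average
        total += sum((p - avg_prev) * c for p, c in zip(prev, curr)) / n
        prev = curr
    return total / periods


asset_means = [-0.002, 0.0, 0.002]           # hypothetical per-period mean returns
dispersion = pvariance(asset_means)          # D: cross-sectional variance of the means
avg_profit = simulate_momentum_profit(asset_means)
print(f"D = {dispersion:.3e}, average simulated profit = {avg_profit:.3e}")
```

With unpredictable returns, the average profit should settle near D as the number of periods grows. Setting all three means to the same value removes the dispersion, and the expected profit drops to zero.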

In the last post, we showed that a broad multi-asset class universe had a higher momentum score using the formula presented above than a sector equity universe. This was demonstrated to be primarily due to the fact that the dispersion in asset means is much higher in an asset class universe than a sector universe. To add further to this result, we would expect that the mean returns for asset classes will be more stationary than the means for sectors or individual stocks since they reflect broad risk premiums rather than idiosyncratic or specific risk. As markets become more efficient over time and all assets become more unpredictable, cross-sectional dispersion in the means (and also mean stationarity) becomes essential to preserving momentum profits. The implication for investors is that the safest way to profit from a momentum strategy is to employ tactical asset allocation on an asset class universe in order to achieve greater consistency in returns over time.

Momentum Score Matrices

In the previous post we introduced the momentum score as a measure of the potential for momentum profits for a given investment universe. Before proceeding to part 2 of the series, I thought it would be interesting for readers to see a pairwise matrix of momentum scores to get a better feel for how they work in practice. Note that higher scores indicate higher potential for momentum profits. Below are the pairwise momentum score matrices for both sectors and asset classes:

momentum score matrix sectors

Notice that sectors with similar macro-factor exposure have lower scores: for example energy and materials (XLE/XLB), which tend to thrive in cyclical upturns in the economy, or consumer staples and health care (XLP/XLV), which hold up in recessions or cyclical downturns. The highest scores accrue to sectors that are likely to do well at different times in the economic cycle, such as energy and utilities (XLE/XLU). This makes logical sense: momentum strategies require the ability to rotate to assets that are doing well at different times.

momentum score matrix asset classes

Notice that pairings of equity, real estate or commodity asset classes (e.g. SPY, EEM, IEV, DBC, RWX) with TLT and GLD tend to have the highest momentum scores. Combinations of near-substitute assets, such as intermediate bonds (IEF) and long-term bonds (TLT), or the S&P500 (SPY) and European stocks (IEV), tend to have very low scores by comparison. In general, most pairwise scores for asset classes are substantially higher than those in the sector momentum score matrix.

What Factors Drive the Performance of Momentum Strategies? (Part 1)

factors
Momentum strategies generate a lot of hype, and deservedly so: momentum is the “premier market anomaly,” praise bestowed by no less a skeptic than Eugene Fama himself. For those who do not know Fama, he happens to be both a founder and ardent proponent of the so-called “Efficient Markets Hypothesis.” The belief in momentum as a legitimate market anomaly has no less fervor in financial circles than organized religion. Doubt its existence and you are akin to a quack or relegated to amateur status among the experienced.

But any real scientist worth their salt should always question “why?” if only to gain a better understanding of the phenomenon. This is not just academic; it is also a practical matter for those who trade with real money. A deeper analysis of the drivers of momentum performance and the conditions in which it can exist can reveal the potential for superior strategies. There have been several landmark papers which shed light on this issue that have no doubt been forgotten or ignored due to their technical nature, for example Lo and MacKinlay (When Are Contrarian Profits Due to Stock Market Overreaction) and Conrad and Kaul (An Anatomy of Trading Strategies). The arguments and evidence put forth in these articles help to reconcile how Mr. Fama can both believe in Efficient Markets and still consider momentum to be a legitimate anomaly. This isn’t a quirk born of quantum physics, but rather the implication of some basic math, demonstrated conclusively using simulated financial data.

In a previous post, I presented some ideas and testing related to identifying superior universes for momentum strategies. A simple/naive method of finding the best performing universes through brute force shows promise, but there are pitfalls because that method does not capture the drivers of momentum performance. So let's begin by inverting the basic math introduced by Lo and MacKinlay that describes the favorability of a particular universe for contrarian or mean-reversion strategies. Since momentum is the polar opposite of contrarian, what is good for one is bad for the other. The table below shows the three ingredients that affect momentum performance:

momentum score 2

The first factor, time series predictability, relates to how predictable or “auto-correlated” an asset or group of assets is on the basis of whether high (low) past returns predict high (low) future returns. If a universe contains highly predictable assets then a momentum strategy will be better able to exploit measurements of past performance. The second factor, dispersion in mean returns, relates to whether a group of assets have average or mean returns that are likely to be materially different from one another. A heterogeneous universe of assets, such as one containing diverse asset classes, will have different sources of returns, and hence greater dispersion, than a homogeneous universe such as sectors within a stock index. The final factor, lead/lag relationships, is a measure of the strength of any tendency for certain assets or stocks to lead or lag one another. This tendency can occur for example between large liquid stocks and small illiquid stocks. In this case a positive relationship would imply that if, say, Coke went up today, then a smaller cola company would go up tomorrow. This is good for contrarian strategies that would buy the smaller cola company and short Coke, but obviously bad for momentum strategies, which is why this factor is negatively related to momentum profits. In summary, the equation shows that a “momentum score” can be formulated by adding the time series predictability factor and the dispersion in means factor, and subtracting the lead/lag relationship factor.

Let’s show a tangible example to demonstrate how the math matches up with intuition. I calculate a momentum score using the last five years of data for both a diverse asset class universe (SPY,DBC,GLD,TLT,IEF,RWX,IYR,EEM,EWJ,IEV) and also a sector universe (XLE,XLU,XLP,XLB,XLV,XLF,XLK,XLY,XLI). Note that the last five years covers a bull market which would easily obscure comparisons based on just back-testing momentum strategy performance on each universe. The momentum score (higher is better) is broken down by contribution in each table for the two different universes.

global asset class universe

sector universe

Clearly the asset class universe is considered to be superior to just using a sector universe for momentum strategies. This certainly jibes with intuition and also empirical performance. But what is more interesting is looking at the largest contribution to the difference between the two universes. We see that the dispersion in the means, or variation in cross-sectional average returns, is by far the biggest factor that separates an asset class universe from a sector universe. The other two factors practically cancel each other out. This makes sense since most sector returns share a dominant common factor: the return of the stock market, or say the return of the S&P500. When the market is up (or down), most sectors are up (or down) to a greater or lesser extent. In contrast, in an asset class universe you could have a lot more variation: stocks could be up, bonds could be down and commodities could be up. The variation in performance is far more substantial. Note that variation in performance or dispersion in means is not equivalent to correlation, which measures the scaled relationship between shorter-term returns. Having a universe with low cross-correlations is not a good proxy for this effect. To better demonstrate the effect of adding variation, let's look at how adding different assets to the sector universe individually can change the momentum score:

mom score sectors

Simply adding long-term bonds (TLT) nearly doubles the momentum score versus the baseline of using just the sectors. On the flip side, adding the dominant common factor, the S&P500 (SPY), reduces the momentum score versus the baseline. Adding Gold (GLD) is actually superior to adding 10-year/Intermediate Treasurys (IEF), which are typically used to proxy the bond component in most portfolios, despite the fact that the correlation of IEF is far more negative than that of GLD. This analysis can provide some very interesting and sometimes counter-intuitive insights (though most make intuitive sense). But more practically, it can be used to create good universes for momentum strategies or any other strategy that derives a large chunk of its returns from the momentum effect. In the next post we will clarify why Mr. Fama can both believe in efficient markets and in momentum as an anomaly, and also provide some interesting implications and further analysis.

Momentum Strategies and Universe Selection

momentum
It is well established that the momentum effect is robust across individual stocks and broad asset classes. However, one of the biggest issues for implementation at the strategy level is to choose a universe for trading. For example, one might choose a broad index such as the S&P500 for an individual stock momentum strategy, but is that the best choice to use to maximize returns? Or if we wanted to build an asset allocation strategy with momentum, which assets should we include/exclude and why? In general, these issues are rarely if ever addressed in either academic papers or in the blogosphere. The consequence is that the choice of universe can artificially inflate results due to data mining (finding the best universe in hindsight prior to presenting the final backtest), or the choice can be too arbitrary and hence sub-optimal from a strategy development standpoint.

There are good reasons to believe that certain asset universes are likely to be superior to others. In a subsequent post, I will attempt to decompose mathematically what makes a universe particularly well-suited for momentum strategies. But for now, let's discuss some obvious factors that may drive momentum strategy performance: 1) universe heterogeneity/homogeneity: it stands to reason that having an investment universe comprised of six different large cap ETFs will not lead to desirable results because the universe is too similar (homogeneous). In contrast, choosing different sectors or styles or even asset classes should provide opportunities to find good-performing assets when other assets in the universe are not doing as well. 2) the number of assets in the universe: fewer assets will lead to fewer opportunities, other things being equal. 3) co-integration/mean-reversion: choosing a universe comprised of co-integrated assets such as Coke and Pepsi, or Exxon Mobil and the Energy Sector ETF, will probably result in negative momentum performance since deviations from a common mean will eventually revert rather than continue. This is not a complete description of the factors that drive momentum performance but rather a list that is likely to make logical sense to most investment professionals.

Since there are good reasons to believe that some universes are simply better than others, it makes sense to determine some heuristic for universe selection to improve the performance of momentum strategies. One logical method to determine the universe for trading/backtesting is to select the best universes on a walk-forward basis rather than in hindsight. In other words, at each time step we take a chosen momentum strategy (for example, selecting the top asset by 60-day return) and backtest it over a much longer window (say 756 days or more) on each possible universe subset from a chosen universe, using a performance metric such as CAGR. We would then select the top n (or top x%) of universes by their performance, and apply the momentum strategy to these universes to determine the assets to trade at each re-balance.

A simple example would be to use the nine different sectors in the S&P500 (Sector SPDRs). Perhaps there are sectors that are better suited to a momentum strategy than using all nine? To test this assumption one might choose all universe subsets that contain two assets or more (between 2 and 9 in this case), which results in 502 different momentum portfolios. This highlights a key difficulty with this approach: the computational burden grows exponentially as a function of universe size. Suppose we used a 60-day momentum strategy where we chose the top sector by CAGR and re-balanced monthly. Looking back 756 trading days or 3 years, we test all 502 different universes and select the top 10% of universes by CAGR using the momentum strategy. Now at each re-balance, we choose the top asset using 60-day momentum from each of the universes that are in the top 10%. The purpose of this strategy, let's call it momentum with universe selection, is to hopefully enhance returns and risk-adjusted returns versus using all assets in the universe. The results of this walk-forward strategy are presented below:
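The count of 502 subsets, and the step of keeping the top 10% of universes, can be sketched as below. The enumeration is exact; the ranking function is only a skeleton, since `backtest_cagr` is a hypothetical callable that the reader would have to supply (for instance, the CAGR of the 60-day momentum strategy on that universe over the trailing 756 days).

```python
from itertools import combinations

# The nine sector ETFs used in this post
SECTORS = ["XLE", "XLU", "XLP", "XLB", "XLV", "XLF", "XLK", "XLY", "XLI"]


def candidate_universes(assets, min_size=2):
    """All subsets of `assets` with at least `min_size` members."""
    return [
        combo
        for size in range(min_size, len(assets) + 1)
        for combo in combinations(assets, size)
    ]


def top_universes(universes, backtest_cagr, top_fraction=0.10):
    """Rank universes by a caller-supplied backtest metric and keep the top fraction."""
    ranked = sorted(universes, key=backtest_cagr, reverse=True)
    keep = max(1, int(len(ranked) * top_fraction))
    return ranked[:keep]


universes = candidate_universes(SECTORS)
print(len(universes))  # 502 = 2**9 - 1 (empty set) - 9 (single-asset subsets)
```

For a 10-asset universe the same enumeration yields 1,013 subsets, and each additional asset roughly doubles the count, which is why brute force becomes impractical quickly.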

umass sectors

It appears that universe selection substantially enhances the performance of a basic momentum strategy. Both returns and risk-adjusted returns are improved by using rolling universe selection. There are clearly sectors that are better suited to a switching strategy than just using all of them at once. What about asset classes? Does the same effect hold true? We chose a 10-asset universe that we have used before for testing Adaptive Asset Allocation: S&P500/SPY, Real Estate/IYR, Gold/GLD, Long-Term Treasurys/TLT, Commodities/DBC, 10-year Treasurys/IEF, Emerging Markets/EEM, Europe/IEV, International Real Estate/RWX, Japan/EWJ. The results of this walk-forward strategy are presented below:

umass asset class

Once again, the returns and risk-adjusted returns are much higher when employing universe selection. The differences are highly significant in this case. Clearly there are subsets of asset classes that are superior to using the entire universe.

This approach to universe selection is not without flaws however, and the reasons why will be clarified in a subsequent post. However it is still reasonably practical as long as the backtest lookback window (756 in the above example) is much larger than the momentum lookback window (60 in the above example). Furthermore, the backtest lookback window would ideally cover a market cycle–using shorter lookback windows could end up choosing only the best performers during say a bull market–which would lead to a biased universe. In addition, it would be helpful to choose a reasonable number or % of the top universes such as the top 5 or top 10 or even the top 10% in the examples we used above. That helps to mitigate the effect of data-mining too many different combinations and ending up with a universe that simply performed well due to chance. It also improves the reliability of out-of-sample performance.

The Innovator’s Credo

Success in the quantitative field according to the best hedge funds is a “research war.” Building or maintaining an edge in the highly competitive world of financial markets requires constant innovation. The requirement for creativity and re-invention is equally important on the product and business side of finance. The late Ronald Reagan once said that the “status quo” is a sad reflection of our troubles. We must constantly strive to look at problems differently and accept that change is the only constant. The late Steve Jobs said it best:

the innovators credo