# Using Non-Parametric Statistics to Improve Portfolio Optimization

*David Varadi and Henry Bee*

Financial markets are noisy. Returns do not follow Gaussian distributions, nor do they conform neatly to analysis using linear regression. This is because the data are not stationary and because the noise is large relative to the underlying relationships. It is therefore strange that the conventional practice is to use classical statistics and linear optimization methods for portfolio allocation, or for indicators for that matter. This point has been stressed many times on the blog, and I cannot stress it enough: you need to use as few assumptions as possible when working with financial data in order to create useful tools or applications. This means that you should be using percentiles, histograms and other nonparametric statistics as the cornerstone of your analysis. It is ludicrous to draw up mathematical proofs as the basis for how you will allocate capital, as is the case with modern portfolio theory and CAPM. Yet this is somehow the standard in real-world asset management, so it is not surprising that financial companies underestimate risk and “blow up.”

In other industries a more practical approach is taken to solve complex problems. The military doesn’t depend on fixed equations to guide missiles to their targets; otherwise they would be highly inaccurate. There are too many variables to account for, such as wind, pressure and temperature, which makes a fixed solution impractical. Allowing for non-linear statistics with the use of recursion to improve estimation is absolutely essential for various forms of modern “rocket science”. Why should allocating capital be any different?

One of the weaknesses of portfolio optimization, apart from the process itself, is the set of inputs that go into the calculations. The correlation coefficient is one of the prime offenders because it is most effective when there is a linear relationship between two variables. Anyone with experience running regressions on financial market data will understand that strong linear relationships are few and far between. Often a cluster of dots expressing a financial relationship is denser in some areas than others, and the relationship tends to have non-linear qualities. The best way to measure this type of correlation is to use a non-parametric variant called the Spearman correlation (http://en.wikipedia.org/wiki/Spearman's_rank_correlation_coefficient). Effectively, the Spearman correlation uses ranks instead of the raw data to discern the relationship. This permits the actual relationship to take any monotonic shape, linear or not: if it is systematic, then the coefficient will be high, much like the regular correlation coefficient.
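To make the rank idea concrete, here is a minimal sketch in plain Python. The helper names (`pearson`, `ranks`, `spearman`) and the toy data are ours, chosen purely for illustration. The relationship `y = exp(x)` is perfectly systematic but not linear, so the Pearson coefficient comes out below 1 while the Spearman coefficient, which is simply Pearson applied to the ranks, is 1.

```python
import math

def pearson(x, y):
    """Ordinary (Pearson) correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Toy data (hypothetical): a monotonic but strongly non-linear relationship.
x = [i / 10 for i in range(1, 31)]
y = [math.exp(v) for v in x]

print("Pearson :", pearson(x, y))
print("Spearman:", spearman(x, y))
```

Real market data will of course be far noisier than this toy case; the point is only that the rank version does not penalize a relationship for being curved.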

To test this hypothesis we looked at a variety of different allocation exercises with different assets as well as individual trading strategies. In this case, we performed a Kelly portfolio optimization using the covariance matrix and returns of the 9 sector ETFs (SPYDERS) that comprise the S&P 500. As those who have worked with the SPYDERS know, they are very noisy and do not yield many systematic effects. We chose a 3-year (756-day) window for the correlation and CAGR inputs. This is long-term and thus reflective of classic portfolio allocation, but it is also useful for strategy allocation (think RSI2, moving averages, etc.). Across a variety of allocation exercises with different parameters, the Spearman correlation was consistently superior to the conventional Pearson correlation.
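For readers who want to experiment, the sketch below shows one way such an exercise could be wired together. It rests on several assumptions that the post does not spell out: the "Spearman covariance" is built as Spearman correlation times the two ordinary standard deviations, the Kelly weights use the common unconstrained form f = C⁻¹(M − R), and the mean return stands in for the CAGR input. All function names and the three made-up return series are hypothetical, and this is not the code behind the results reported here.

```python
import math

def mean(v):
    return sum(v) / len(v)

def stdev(v):
    """Sample standard deviation (n - 1 denominator)."""
    m = mean(v)
    return math.sqrt(sum((a - m) ** 2 for a in v) / (len(v) - 1))

def pearson(x, y):
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result

def spearman_cov(returns):
    """ASSUMPTION: cov_ij = spearman_ij * sd_i * sd_j."""
    n = len(returns)
    sd = [stdev(r) for r in returns]
    rk = [ranks(r) for r in returns]
    return [[pearson(rk[i], rk[j]) * sd[i] * sd[j] for j in range(n)]
            for i in range(n)]

def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def kelly_weights(returns, risk_free=0.0):
    """ASSUMPTION: unconstrained Kelly weights f = C^-1 (M - R)."""
    excess = [mean(r) - risk_free for r in returns]
    return solve(spearman_cov(returns), excess)

# Made-up daily returns for three hypothetical assets (not real ETF data).
returns = [
    [0.010, -0.004, 0.006, -0.012, 0.008, 0.002, -0.006, 0.004],
    [0.004, 0.002, -0.008, -0.002, 0.012, -0.004, 0.006, 0.001],
    [-0.002, 0.006, 0.003, 0.009, -0.005, 0.001, -0.003, 0.004],
]
weights = kelly_weights(returns)
print(weights)
```

Swapping `ranks(r)` for the raw series `r` inside `spearman_cov` gives the conventional Pearson covariance, which makes a side-by-side comparison straightforward. Note that raw Kelly weights computed on daily data are typically very large; in practice they would be scaled down or constrained.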

**Figure 1**: *the Spearman Covariance can improve results dramatically.*

Below is a performance summary that shows that the Spearman covariance was substantially more useful in the portfolio allocation exercise. Using shorter-term lookbacks, such as 252-day for CAGR and 60-day for correlation, yielded the same conclusion. The results were even more pronounced on other asset classes as well as in strategy allocation. While the Spearman itself can be substantially improved, even in its basic form it is a nice improvement. As they say in cooking: “all it takes to improve a classic recipe is superior ingredients.” CSS has been doing extensive research into creating superior versions of classic portfolio allocation inputs, so expect more to come.

Interesting approach. Did you try a similar thing with the Kendall tau coefficient? I never did myself, but I am wondering if the results would be similar, since they measure basically the same thing. Good post.

QF

Hi QF, indeed the Kendall is on the to-do list, as it looks at rank divergences in a different way. I assume it might be superior. Thanks, and good thinking.

best

dv

I would be curious to see which one performs better with financial time series. I think that, depending on the use, the best option might vary to a certain extent. The pairing used for the ranks in the Kendall tau coefficient seems appealing for some applications, looking at correlations between strategies for example. However, I can’t really conceptualize how or why it would be superior. I look forward to your findings!

QF

What is the formula for the Kelly optimal portfolio? Does it involve inverting a covariance matrix? What are the eigenvalues of the two covariance matrices?

Hi Nick, the answer to your question is yes, and a link to the paper most relevant to this method is: http://www.edwardothorp.com/sitebuildercontent/sitebuilderfiles/KellyCriterion2007.pdf

section 7.1

best

dv

Hi, could you recommend a good book on nonparametric statistics? Preferably practical, with some focus on financial data / trading? Regards, J.E.

Actually my question was mostly about the eigenvalues – the question about the Kelly was just confirmation