Note: A more complex method that applies to this article, mentioned by one of our readers, is the application of a regression, which is briefly touched upon at http://en.wikipedia.org/wiki/T_Test

One method of determining when a trading strategy is breaking down is to run a statistical test. Conceptually, when I use the term “breaking down” I am referring to the recent profitability of the strategy being significantly different from its historical average. It may also refer to a strategy becoming more volatile than average relative to its profit per trade. Both an analysis of the “mean” and the “variance” of a trading strategy in the most recent period versus its historical average are important in equity curve analysis. Another possible area of investigation is the recent trade win% versus the historical win%.

For this post we will look at how to address the first issue, which concerns the possibility that recent average profitability is significantly different from the historical average. We will therefore assume that the trading strategy still has the same variance or volatility as measured historically. Enter the two-sample t-test, which is used to determine whether two samples from the same trading strategy are statistically similar or different. In this case I have used the equation that assumes unequal sample sizes, because in most cases traders run a 5- or 10-year backtest and then want to evaluate a strategy that is currently undergoing some form of drawdown or deterioration (i.e., you aren’t going to wait 5 or 10 years before you test again!):

From wikipedia: http://en.wikipedia.org/wiki/T_Test

Unequal sample sizes, equal variance

This test is used only when it can be assumed that the two distributions have the same variance. (When this assumption is violated, see below.) The t statistic to test whether the means are different can be calculated as follows:

t = (x̄1 − x̄2) / ( s_p · √(1/n1 + 1/n2) )

where the pooled standard deviation is

s_p = √[ ((n1 − 1)·s1² + (n2 − 1)·s2²) / (n1 + n2 − 2) ]

Note that these formulae are generalizations of the equal-sample-size case (substitute n for n1 and n2 and you’ll see).

s_p is an estimator of the common standard deviation of the two samples: it is defined in this way so that its square is an unbiased estimator of the common variance whether or not the population means are the same. In these formulae, n is the number of observations in a group, subscript 1 denotes group one, and subscript 2 denotes group two. n − 1 is the number of degrees of freedom for either group, and the total sample size minus two (that is, n1 + n2 − 2) is the total number of degrees of freedom, which is used in significance testing.
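As a rough sketch, the pooled t statistic described above can be computed directly from two sets of trade profits. The trade numbers below are made-up illustrative values, not real strategy results; in practice the “historical” sample would be your full backtest and the “recent” sample the trades under suspicion.

```python
import math

def pooled_t(sample1, sample2):
    """t statistic for a difference in means, assuming equal variances."""
    n1, n2 = len(sample1), len(sample2)
    m1 = sum(sample1) / n1
    m2 = sum(sample2) / n2
    # Unbiased sample variances (n - 1 in the denominator).
    v1 = sum((x - m1) ** 2 for x in sample1) / (n1 - 1)
    v2 = sum((x - m2) ** 2 for x in sample2) / (n2 - 1)
    # Pooled estimate of the common standard deviation.
    sp = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    t = (m1 - m2) / (sp * math.sqrt(1 / n1 + 1 / n2))
    return t, n1 + n2 - 2  # t statistic and degrees of freedom

# Historical trades (long backtest) vs. recent trades (short window):
historical = [0.8, 1.2, -0.5, 0.9, 1.1, -0.3, 0.7, 1.0, 0.6, -0.2]
recent = [-0.6, -0.4, 0.1, -0.7, -0.35]
t, dof = pooled_t(historical, recent)
```

The resulting t would then be compared against the t distribution with n1 + n2 − 2 degrees of freedom to judge significance.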

Note that this can also be applied to determine whether, say, two mean-reversion systems like DV2 and RSI2 are significantly different from each other. There are some issues with the t-test, and to me the assumption of normality (i.e., that a given trading strategy’s returns are normally distributed) is certainly one of them. This flawed assumption has caused quants many problems in the past. The most hazardous to your account is the delay that may be introduced by waiting too long to turn a strategy off. One adjustment that can be made is to have some form of trailing stop, or a gradual reduction of exposure to a strategy, even before it is “statistically significantly different than average.”

Other issues concern forms of conditional bias introduced by backtesting during only one “regime,” such as an uptrend; this will naturally distort the test, as trading strategies often behave very differently across regimes. Knowing in advance that certain strategies perform poorly when a regime change occurs gives you “early warning” to stop using them and thus helps you avoid drawdowns. An example: shorting after two up days, which typically performs well, should be expected to perform worse during the onset of a new “uptrend,” perhaps defined by the 200-day MA or similar. Common sense in this regard can be a more proactive form of risk management than the reactive management achieved by trading the equity curve. It stands to reason, for example, that when the market is 3 standard deviations below its mean, eventually it will snap back, and thus reducing exposure to a shorting strategy at that point is probably better than waiting for a drawdown to occur before cutting bait. This is also why mean-reversion principles of overbought and oversold are valuable.

1. October 26, 2009 5:28 am

Thanks for this clear explanation of how we can apply stats analysis in trading.

I still have to learn much more about stats and how to apply them to my automated trading strategy analysis, but from my initial investigation I got the impression that standard “stats” are not very relevant for trading because of their assumptions about distribution types (by that I mean parametric statistics, which are confirmatory data analysis, i.e. they assume a hypothesis first and run a test to try to disprove it).

Student’s t-test assumes a normal distribution, which clearly seems inappropriate given the typical fat tails that markets throw at us.

Do you know if there are other types of stats that could give better results (i.e. robust stats, non-parametric stats, exploratory data analysis, etc.)?

October 26, 2009 9:57 am

hi Jez, you are correct, and thanks for the kind words. I mentioned this drawback at the end of the article; I personally do not use this specific test, but it is a good start. In the Wikipedia article they mention the following: “To relax the normality assumption, a non-parametric alternative to the t-test can be used, at a cost of lower statistical power. The usual choices for non-parametric location tests are the Mann–Whitney U test for independent samples, and the binomial test or the Wilcoxon signed-rank test for paired samples.”

cheers
dv
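As a minimal sketch of the non-parametric alternative quoted above, the Mann–Whitney U test from SciPy can be applied to the same kind of historical-vs-recent comparison; it makes no normality assumption, at the cost of some statistical power. The trade numbers here are illustrative, not real results.

```python
from scipy.stats import mannwhitneyu

# Historical trade profits (long backtest) vs. recent trade profits:
historical = [0.8, 1.2, -0.5, 0.9, 1.1, -0.3, 0.7, 1.0, 0.6, -0.2]
recent = [-0.6, -0.4, 0.1, -0.7, -0.35]

# Two-sided test: do recent profits come from a shifted distribution?
stat, p_value = mannwhitneyu(historical, recent, alternative="two-sided")
if p_value < 0.05:
    print("recent performance differs significantly from history")
```

A small p-value suggests the recent trades are not drawn from the same location as the historical sample, without assuming normality.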

• October 26, 2009 1:45 pm

That’s where I realise I need to buy a stats book! 😉

Do you use other – more appropriate tests?

October 26, 2009 2:03 pm

well, as we shall see in the next article, sometimes all you need is a simple technical indicator :o)
to be honest Jez, im sure you are more sophisticated than you are letting on, but i have found in the course of testing everything that some simple stats give you 80-90% of the results of very onerous and complicated procedures. Some simplifications do even better than sophisticated procedures. i think it is worthwhile to have a good understanding, but simplicity in trading is far more desirable—-and I always tend to look for the most robust and simplest approach whether for trading the equity curve or for trading strategies.
best
dv

• October 26, 2009 5:04 pm

David,
I DO feel way behind in my stats knowledge, even though I think I can grasp some concepts. So I went back and found your post (I think) that you mention in your comment, and I am going to start with this:
The Adaptive Time Machine: The Importance of Statistical Filters

Do you have any recommendations for learning stats applicable to trading strategies testing – from basic to more sophisticated? thanks

October 26, 2009 5:27 pm

that’s my favorite post :o) actually that method is fairly easy to apply and very durable. Try to ensure that you don’t count periods where the strategy is in cash; use trades only.

look for values >1.6 or <-1.6 as a basic guideline.

most basic econometrics books are suitable; if that is too heavy, then just get a basic stats book. use a program like MATLAB or SPSS to do the calculations so all you have to do is select and interpret.

cheers
dv
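The statistical-filter idea discussed above can be sketched very simply: compute a t statistic on a strategy’s recent trade returns (cash periods excluded) and only trust the strategy while it clears the ±1.6 guideline. The window size and trade numbers below are illustrative assumptions, not the method from the referenced post verbatim.

```python
import math

def t_stat(trades):
    """t statistic of the mean trade return against zero."""
    n = len(trades)
    mean = sum(trades) / n
    var = sum((x - mean) ** 2 for x in trades) / (n - 1)  # unbiased variance
    return mean / math.sqrt(var / n)

# Most recent trades of the strategy (cash periods already excluded):
trades = [0.4, -0.1, 0.6, 0.2, -0.3, 0.5, 0.3, -0.2, 0.4, 0.1]
score = t_stat(trades)
active = score > 1.6  # trade the strategy only while the filter is positive
```

A score below −1.6 would, symmetrically, flag significantly negative recent performance.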

October 26, 2009 8:39 am

I wonder if it is better to apply the t-test to trade profits or to the time series of daily profits. Maybe both are worth analyzing.

October 26, 2009 10:00 am

Yes, you are correct, and that is a good suggestion. A regression is a better application than this specific approach, though it is a little more complicated to explain on this blog. Your mention of “profits” is also important, because the equity curve is distorted by compounding; it is best to use a fixed-bet methodology when applying this approach.
cheers
dv
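The regression idea mentioned above can be sketched by regressing a fixed-bet cumulative profit curve on the trade number and testing whether the slope (average profit per trade) is significantly different from zero. The profit numbers are illustrative, and note this ignores the autocorrelation of equity-curve residuals, so treat it as a rough diagnostic rather than a rigorous test.

```python
import numpy as np
from scipy.stats import linregress

# Fixed-bet trade profits (no compounding), converted to an equity curve:
profits = np.array([0.8, 1.2, -0.5, 0.9, 1.1, -0.3, 0.7, 1.0, 0.6, -0.2])
equity = np.cumsum(profits)
trade_no = np.arange(1, len(equity) + 1)

result = linregress(trade_no, equity)
# result.slope estimates the average profit per trade; result.pvalue tests
# the null hypothesis that the true slope is zero.
```

Running the same regression over just the recent portion of the curve and comparing slopes is one way to formalize the “breaking down” question.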

3. November 11, 2009 5:03 am

Unless you are an existing user of MATLAB or SPSS, you might want to try R instead.