Ernie Chan proposes a method to calculate the speed of mean reversion. He proposes adjusting the ADF (Augmented Dickey-Fuller test, a more stringent test) formula from discrete time to differential form. This takes the shape of the Ornstein-Uhlenbeck formula for a mean-reverting process (see the Wikipedia article on the Ornstein-Uhlenbeck process).

**dy(t) = (λy(t − 1) + μ)dt + dε**

Where dε is some Gaussian noise. Chan goes on to mention that using the discrete ADF formula below:

**Δy(t) = λy(t − 1) + μ + βt + α₁Δy(t − 1) + … + αₖΔy(t − k) + εₜ**

and performing a linear regression of `Δy(t)` against `y(t − 1)` provides `λ`, which is then used in the first equation. However, the advantage of writing the formula in differential form is that it allows an analytical solution for the expected value of **y(t)**:
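To make the regression step concrete, here is a minimal sketch on simulated data. It assumes the simplest form of the discrete equation (no trend term `βt` and no lagged differences), and the parameter values and variable names are hypothetical, chosen only so that we know what `λ` the regression should recover.

```r
set.seed(123)

# Assumed "true" parameters for the simulation (hypothetical values)
lambda <- -0.1   # speed of mean reversion
mu     <- 1      # constant term; long-run mean is -mu / lambda = 10
n      <- 5000

# Simulate y(t) - y(t-1) = lambda * y(t-1) + mu + Gaussian noise
y <- numeric(n)
y[1] <- 10
for (t in 2:n) {
  y[t] <- y[t - 1] + lambda * y[t - 1] + mu + rnorm(1)
}

# Regress the change in y on the lagged level
y_lag   <- y[-n]     # y(t-1)
delta_y <- diff(y)   # y(t) - y(t-1)
fit <- lm(delta_y ~ y_lag)

lambda_hat <- unname(coef(fit)["y_lag"])        # estimate of lambda
mu_hat     <- unname(coef(fit)["(Intercept)"])  # estimate of mu
lambda_hat
mu_hat
```

On a long simulated series like this the estimated slope should land close to the assumed `λ`; on real prices the estimate will be much noisier.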

**E( y(t)) = y0exp(λt) − μ/λ(1 − exp(λt))**

Mean-reverting series exhibit negative `λ`. Conversely, positive `λ` means the series doesn't revert back to the mean.

When `λ` is negative, the value of the price decays exponentially to the value `−μ/λ`, with the half-life of decay equal to `−log(2)/λ`. *See references*.
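A quick numerical check of the half-life claim, as a sketch: the function below simply evaluates the expected-value formula from the text, and the parameter values (`lambda`, `mu`, `y0`) are hypothetical. If the formula is right, the deviation from the long-run mean `−μ/λ` should be exactly half its starting size at `t = −log(2)/λ`.

```r
# Expected value of y(t) from the closed-form solution in the text
expected_y <- function(t, y0, lambda, mu) {
  y0 * exp(lambda * t) - (mu / lambda) * (1 - exp(lambda * t))
}

# Hypothetical parameters chosen only for illustration
lambda <- -0.05
mu     <- 5
y0     <- 120

long_run_mean <- -mu / lambda       # level the series decays towards
half_life     <- -log(2) / lambda   # time for half the deviation to decay

dev_start <- y0 - long_run_mean
dev_half  <- expected_y(half_life, y0, lambda, mu) - long_run_mean

dev_half / dev_start   # ratio of remaining deviation to starting deviation
```

The ratio comes out at one half regardless of the starting value `y0`, which is why the half-life depends only on `λ`.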

We can perform the regression of y(t) − y(t − 1) against y(t − 1) with the R code below on the SPY price series. For this test we will use a look back period of 100 days rather than the entire price series (1993 inception to present). If we used all of the data, we would be including how long it takes to recover from bear markets. For trading purposes, we wish to use a shorter sample of data in order to produce a more meaningful statistical test.

The procedure:

1. Lag the SPY close by one day to obtain yesterday's close, y(t − 1)

2. Subtract yesterday's close from today's close: y(t) − y(t − 1)

3. Subtract the mean of yesterday's close from yesterday's close: y(t − 1) − mean(y(t − 1))

4. Perform a linear regression of (today's close − yesterday's close) ~ (yesterday's close − mean(yesterday's close))

5. On the regression output, compute -log(2)/λ

```r
# Calculate y(t-1) and y(t) - y(t-1).
# random.data is assumed to be a numeric vector of closing prices in
# chronological order (here, the last 100 SPY closes).
n <- length(random.data)
y.lag <- random.data[1:(n - 1)]      # y(t-1): yesterday's close
y.diff <- random.data[2:n] - y.lag   # y(t) - y(t-1): today's close minus yesterday's close
prev.y.mean <- y.lag - mean(y.lag)   # y(t-1) - mean(y(t-1)): demeaned lagged close
final.df <- data.frame(y.diff = y.diff, prev.y.mean = prev.y.mean) # Create final data frame

# Linear Regression With Intercept
result <- lm(y.diff ~ prev.y.mean, data = final.df)
half_life <- -log(2) / coef(result)[2]
half_life

# Linear Regression With No Intercept
result <- lm(y.diff ~ prev.y.mean + 0, data = final.df)
half_life1 <- -log(2) / coef(result)[1]
half_life1

# Print general linear regression statistics
summary(result)
```

Observing the output of the above regression, we see that the slope is negative, so the series is a mean-reverting process. We see from `summary(result)` that `λ` is -0.06165, and when we perform `-log(2)/λ` we obtain a mean reversion half life of **11.24267 days**.

11.24267 days is the half life of mean reversion, which means we expect half of any deviation from the mean to decay over that period. As a rule of thumb, we anticipate the series to have largely reverted to the mean by 2 * the half life, or 22.48534 days (under exponential decay, about 75% of the deviation has decayed by then). However, to trade mean reversion profitably we need not exit directly at the mean each time. Essentially, if a trade extended over 22 days we may suspect a short-term or permanent regime shift. One may insulate against such losses by setting a ‘time stop’.
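The arithmetic behind the half life and the ‘time stop’ can be sketched as follows. The slope is the rounded value reported in the regression summary, so the half life here only matches the figure above to a couple of decimal places. The `remaining()` helper and the choice of rounding the time stop up to a whole day are our own assumptions, not part of Chan's method.

```r
lambda <- -0.06165              # rounded slope reported in the regression summary
half_life <- -log(2) / lambda   # roughly 11.24 days

# Fraction of the initial deviation expected to remain after k half-lives,
# under the exponential-decay model
remaining <- function(k) exp(lambda * k * half_life)

remaining(1)   # half of the deviation is left
remaining(2)   # a quarter of the deviation is left

# Hypothetical time stop: exit any trade still open after two half-lives
time_stop <- ceiling(2 * half_life)
time_stop
```

Under this model, two half-lives is where three quarters of the expected reversion has already happened, which is one way to justify giving up on a trade that has run longer.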

The obtained 11.24267 day half life is short enough for an interday trading horizon. If we obtained a longer half life we may be waiting a long time for the series to revert back to the mean. Once we determine that the series is mean reverting we can trade this series profitably with a simple linear model using a look back period equal to the half life. In a previous post we explored a simple linear z-score model: https://flare9xblog.wordpress.com/2017/09/24/simple-linear-strategy-for-sp500/
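As a rough illustration of using the half life as a look back, here is a minimal rolling z-score sketch. It is not the code from the linked post: the `rolling_zscore()` helper, the `closes` vector and the "position = −z-score" rule are hypothetical stand-ins for a simple linear mean-reversion model.

```r
# Rolling z-score: how far the latest value sits from the mean of the
# trailing window, in units of that window's standard deviation
rolling_zscore <- function(x, lookback) {
  n <- length(x)
  z <- rep(NA_real_, n)
  for (i in seq_len(n)) {
    if (i >= lookback) {
      window <- x[(i - lookback + 1):i]
      z[i] <- (x[i] - mean(window)) / sd(window)
    }
  }
  z
}

half_life <- 11.24267          # half life obtained above
lookback  <- round(half_life)  # whole number of days for the window

# Hypothetical closing prices, for illustration only
closes <- c(100, 101, 103, 102, 104, 105, 103, 102, 101, 103, 104, 106, 105)

z <- rolling_zscore(closes, lookback)
position <- -z   # linear rule: short when above the rolling mean, long when below
position
```

The first `lookback − 1` values are `NA` because there is not yet a full window, and a window of identical prices would give a zero standard deviation, so real code would need to guard against that case.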

The look back period of 11 days was obtained using a ‘brute force’ approach (maybe luck). An optimal look back period of 11 days produced the best result for the SPY.

After brute forcing, it was noted during optimization of the above strategy that adjusting the look back period from 11 days to any number above or below decreased performance.

We illustrate the effect of moving the look back period shorter and longer than the obtained half life. For simplicity, we will use the total cumulative returns for comparison:

We see that a look back of 11 days produced the highest cumulative compounded returns.
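The brute-force comparison can be sketched as a loop over candidate look back periods. Everything here is an assumption made for illustration: the prices are simulated rather than SPY closes, the `cum_return()` helper is hypothetical, and the trading rule (hold −z-score units, decided on the previous close) is a stand-in for the strategy in the linked post. The output therefore says nothing about which look back is best on real data.

```r
set.seed(42)

# Simulated mean-reverting prices (hypothetical stand-in for the SPY closes)
n <- 1000
prices <- numeric(n)
prices[1] <- 100
for (t in 2:n) {
  prices[t] <- prices[t - 1] - 0.06 * (prices[t - 1] - 100) + rnorm(1, sd = 1)
}

# Total compounded return of a hypothetical linear z-score rule
cum_return <- function(prices, lookback) {
  n <- length(prices)
  z <- rep(NA_real_, n)
  for (i in lookback:n) {
    w <- prices[(i - lookback + 1):i]
    z[i] <- (prices[i] - mean(w)) / sd(w)
  }
  daily_ret <- c(NA, diff(prices) / prices[-n])  # simple daily returns
  position  <- c(NA, -z[-n])                     # yesterday's signal, today's return
  strat_ret <- position * daily_ret
  prod(1 + strat_ret[!is.na(strat_ret)]) - 1
}

# Sweep look back periods either side of the half life
lookbacks <- 5:20
results <- sapply(lookbacks, function(lb) cum_return(prices, lb))
names(results) <- lookbacks
results
lookbacks[which.max(results)]   # look back with the highest cumulative return
```

Picking the best look back this way is in-sample optimization, so the half life from the regression is useful as an independent sanity check on whatever the sweep returns.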

Ernie Chan goes on to ask ‘why bother with statistical testing?’. The answer lies in the fact that specific trading rules only trigger when their conditions are met and therefore tend to skip over data. Statistical testing includes data that a model may skip over and thus produces results with higher statistical significance.

Furthermore, once we confirm a series is mean reverting we can be assured of finding a profitable trading strategy, though not necessarily the strategy that we just backtested.

References

*Algorithmic Trading: Winning Strategies and Their Rationale – May 28, 2013, by Ernie Chan*

Hi,

I think your steps are not consistent with the code. Correct me if I am wrong.

#### DD Comment: ####

# The following line of code is doing:

# yesterday's close - mean of (YESTERDAY'S CLOSE)

# BUT NOT, Subtract the mean of lagged differences?

########

prev.y.mean <- y.lag - mean(y.lag) # Subtract yesterdays close from the mean of lagged differences

Hi Danny

You're correct, my description does not match this version of calculating the half life of mean reversion. This version regresses y(t) − y(t − 1) vs y(t − 1) – mean(y(t − 1)). I will update the post to add Ernie Chan's method from his book, Algorithmic Trading: Winning Strategies and Their Rationale, when I get a chance. In short, Ernie's looks like this:

Simply regress y(t) − y(t − 1) vs y(t − 1) or in English today close – yesterday close vs yesterdays close.

ylag=lag(y, 1) == y(t − 1)

deltaY=y-ylag == y(t) − y(t − 1)

regression = y(t) − y(t − 1) (y, dependent variable) and y(t − 1) (x, independent variable)

halflife=-log(2)/regression coefficient
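A runnable R sketch of the steps just listed, on a simulated series. The AR(1) test data is a hypothetical stand-in for a price series (its coefficient of 0.9 corresponds to a true `λ` of −0.1), and the variable names follow the pseudocode above. It also fits the demeaned version from the post, to show that when the regression includes an intercept, demeaning y(t − 1) leaves the slope, and therefore the half life, unchanged.

```r
set.seed(1)

# Hypothetical test series: AR(1) with coefficient 0.9, i.e. lambda = -0.1
y <- as.numeric(arima.sim(model = list(ar = 0.9), n = 5000))

n      <- length(y)
ylag   <- y[-n]          # y(t-1)
deltaY <- y[-1] - ylag   # y(t) - y(t-1)

# Chan's version: regress deltaY on the raw lagged level
fit_raw <- lm(deltaY ~ ylag)

# The post's version: regress deltaY on the demeaned lagged level
ylag_demeaned <- ylag - mean(ylag)
fit_demeaned  <- lm(deltaY ~ ylag_demeaned)

lambda_raw      <- unname(coef(fit_raw)[2])
lambda_demeaned <- unname(coef(fit_demeaned)[2])

halflife <- -log(2) / lambda_raw
c(lambda_raw = lambda_raw, lambda_demeaned = lambda_demeaned, halflife = halflife)
```

Only the intercept differs between the two fits. The no-intercept regression is a different matter: there, demeaning does change the estimate.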

I will get round to updating this post when I get a chance. Thanks for reading!

Thanks Andrew,

It might be obvious, but would like to confirm:

So in:

“This version regresses y(t) − y(t − 1) vs y(t − 1) – mean(y(t − 1))”

x is y(t) − y(t − 1)

y is y(t − 1) – mean(y(t − 1))

Is that correct? Cheers.

Thanks, Andrew.

Great read. Very helpful!

Happy new year

Happy New Year! The R lm() function takes y, x (through the formula y ~ x), y being the dependent variable and x the independent. lm.fit() is x, y!

In this case:

y is y(t) − y(t − 1)

x is y(t − 1) – mean(y(t − 1))

I had to check! Ernie's MATLAB ols function is y, x. I thought it was like Python's, x, y. The above is correct, and my R code is correct too for this method.
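A small sketch of the argument-order difference in R, using made-up data. `lm()` takes a formula with the dependent variable on the left, while `lm.fit()` takes the design matrix first and the response second, and expects the intercept column to be supplied by hand.

```r
set.seed(7)

# Made-up data for illustration
x <- rnorm(50)
y <- 2 - 0.5 * x + rnorm(50, sd = 0.1)

# lm(): formula interface, dependent variable on the left of ~
coef_lm <- coef(lm(y ~ x))

# lm.fit(): design matrix first, then the response; add the intercept column manually
coef_fit <- lm.fit(x = cbind(1, x), y = y)$coefficients

unname(coef_lm)
unname(coef_fit)
```

Both calls should return the same intercept and slope; swapping the roles of x and y in either one would fit a different regression.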

Cheers!

Andrew
