Model Fitting – Using Autocorrelation to Detect The Nature of a Time Series

I believe a solid approach in developing trading strategies is to fit the correct model to the under lying. In a previous post we studied the Hurst exponent and its ability to detect if a series was mean reverting, momentum or a random walk. In this post we will look at a 2 lag autocorrelation on the S&P500. The long and short of it is: if there is high degree of correlation from one point in time to a previous point in time then we have momentum. Conversely, if we have weak correlation between one point in time to a previous point in time, we can say we have mean reversion.

In order to test this we use the acf R function. We run this over daily S&P500 log returns using rollyapply on a (252 trading days * 3, 756 bars = 3 trading years) rolling window. Thus each data point is the correlation coefficient of the previous 3 year rolling window.

The code that achieves this:

# Rolling auto correlation 

# Required Packages
require(ggplot2)
require(TTR)

# SP500 
SP500 <- read.csv("C:/Stock Market Analysis/Norgate Market Data/MASTER_DATA_DUMP/$SPX.csv", header=TRUE,stringsAsFactors = FALSE)
SP500$Date <- ymd(SP500$Date)
SP500$log.close <- log(SP500$Close)

# rets 
SP500$rets <- ROC(SP500$log.close, type = c("discrete"))
SP500$rets[1] <-0

# Rolling acf width 
acf.width <- 756 # bars ,1008 bars =  4 years (252 * 4 = 1008)

# Dates 
SP500dates <- SP500[acf.width:nrow(SP500),]
dates.df <- data.frame(Date=SP500dates$Date)
str(dates.df)
head(dates.df )

# Rolling auto correlation 
result <- rollapply(SP500$rets, width = acf.width, FUN=acf, 
                    lag.max = 30,type = "correlation",plot = FALSE)

# Extract different correlations 
cor2 <- lapply(result[,1], function(x) x[2, drop=FALSE])
cor3 <- lapply(result[,1], function(x) x[3, drop=FALSE])
cor4 <- lapply(result[,1], function(x) x[4, drop=FALSE])
cor5 <- lapply(result[,1], function(x) x[5, drop=FALSE])
cor6 <- lapply(result[,1], function(x) x[6, drop=FALSE])
cor7 <- lapply(result[,1], function(x) x[7, drop=FALSE])
cor8 <- lapply(result[,1], function(x) x[8, drop=FALSE])
cor9 <- lapply(result[,1], function(x) x[9, drop=FALSE])
cor10 <- lapply(result[,1], function(x) x[10, drop=FALSE])
cor11 <- lapply(result[,1], function(x) x[11, drop=FALSE])
cor12 <- lapply(result[,1], function(x) x[12, drop=FALSE])
cor13 <- lapply(result[,1], function(x) x[13, drop=FALSE])
cor14 <- lapply(result[,1], function(x) x[14, drop=FALSE])
cor15 <- lapply(result[,1], function(x) x[15, drop=FALSE])
cor16 <- lapply(result[,1], function(x) x[16, drop=FALSE])
cor17 <- lapply(result[,1], function(x) x[17, drop=FALSE])
cor18 <- lapply(result[,1], function(x) x[18, drop=FALSE])
cor19 <- lapply(result[,1], function(x) x[19, drop=FALSE])
cor20 <- lapply(result[,1], function(x) x[20, drop=FALSE])
cor21 <- lapply(result[,1], function(x) x[21, drop=FALSE])
cor22 <- lapply(result[,1], function(x) x[22, drop=FALSE])
cor23 <- lapply(result[,1], function(x) x[23, drop=FALSE])
cor24 <- lapply(result[,1], function(x) x[24, drop=FALSE])
cor25 <- lapply(result[,1], function(x) x[25, drop=FALSE])
cor26 <- lapply(result[,1], function(x) x[26, drop=FALSE])
cor27 <- lapply(result[,1], function(x) x[27, drop=FALSE])
cor28 <- lapply(result[,1], function(x) x[28, drop=FALSE])
cor29 <- lapply(result[,1], function(x) x[29, drop=FALSE])
cor30 <- lapply(result[,1], function(x) x[30, drop=FALSE])

# cbind outputs
cor2.df <- as.data.frame(do.call(rbind, cor2))
cor2.df <- data.frame(Date=dates.df$Date,cor2.df)
cor3.df <- as.data.frame(do.call(rbind, cor3))
cor3.df <- data.frame(Date=dates.df$Date,cor3.df)
cor4.df <- as.data.frame(do.call(rbind, cor4))
cor4.df <- data.frame(Date=dates.df$Date,cor4.df)
cor5.df <- as.data.frame(do.call(rbind, cor5))
cor5.df <- data.frame(Date=dates.df$Date,cor5.df)
cor6.df <- as.data.frame(do.call(rbind, cor6))
cor6.df <- data.frame(Date=dates.df$Date,cor6.df)
cor7.df <- as.data.frame(do.call(rbind, cor7))
cor7.df <- data.frame(Date=dates.df$Date,cor7.df)
cor8.df <- as.data.frame(do.call(rbind, cor8))
cor8.df <- data.frame(Date=dates.df$Date,cor8.df)
cor9.df <- as.data.frame(do.call(rbind, cor9))
cor9.df <- data.frame(Date=dates.df$Date,cor9.df)
cor10.df <- as.data.frame(do.call(rbind, cor10))
cor10.df <- data.frame(Date=dates.df$Date,cor10.df)
cor11.df <- as.data.frame(do.call(rbind, cor11))
cor11.df <- data.frame(Date=dates.df$Date,cor11.df)
cor12.df <- as.data.frame(do.call(rbind, cor12))
cor12.df <- data.frame(Date=dates.df$Date,cor12.df)
cor13.df <- as.data.frame(do.call(rbind, cor13))
cor13.df <- data.frame(Date=dates.df$Date,cor13.df)
cor14.df <- as.data.frame(do.call(rbind, cor14))
cor14.df <- data.frame(Date=dates.df$Date,cor14.df)
cor15.df <- as.data.frame(do.call(rbind, cor15))
cor15.df <- data.frame(Date=dates.df$Date,cor15.df)
cor16.df <- as.data.frame(do.call(rbind, cor16))
cor16.df <- data.frame(Date=dates.df$Date,cor16.df)
cor17.df <- as.data.frame(do.call(rbind, cor17))
cor17.df <- data.frame(Date=dates.df$Date,cor17.df)
cor18.df <- as.data.frame(do.call(rbind, cor18))
cor18.df <- data.frame(Date=dates.df$Date,cor18.df)
cor19.df <- as.data.frame(do.call(rbind, cor19))
cor19.df <- data.frame(Date=dates.df$Date,cor19.df)
cor20.df <- as.data.frame(do.call(rbind, cor20))
cor20.df <- data.frame(Date=dates.df$Date,cor20.df)
cor21.df <- as.data.frame(do.call(rbind, cor21))
cor21.df <- data.frame(Date=dates.df$Date,cor21.df)
cor22.df <- as.data.frame(do.call(rbind, cor22))
cor22.df <- data.frame(Date=dates.df$Date,cor22.df)
cor23.df <- as.data.frame(do.call(rbind, cor23))
cor23.df <- data.frame(Date=dates.df$Date,cor23.df)
cor24.df <- as.data.frame(do.call(rbind, cor24))
cor24.df <- data.frame(Date=dates.df$Date,cor24.df)
cor25.df <- as.data.frame(do.call(rbind, cor25))
cor25.df <- data.frame(Date=dates.df$Date,cor25.df)
cor26.df <- as.data.frame(do.call(rbind, cor26))
cor26.df <- data.frame(Date=dates.df$Date,cor26.df)
cor27.df <- as.data.frame(do.call(rbind, cor27))
cor27.df <- data.frame(Date=dates.df$Date,cor27.df)
cor28.df <- as.data.frame(do.call(rbind, cor28))
cor28.df <- data.frame(Date=dates.df$Date,cor28.df)
cor29.df <- as.data.frame(do.call(rbind, cor29))
cor29.df <- data.frame(Date=dates.df$Date,cor29.df)
cor30.df <- as.data.frame(do.call(rbind, cor30))
cor30.df <- data.frame(Date=dates.df$Date,cor30.df)

# smooth 
#cor2.df$sma <- SMA(cor2.df$V1,100)

# Plot ACF 
ggplot(cor2.df,aes(Date,V1))+geom_line(aes(color="Lag 2"))+
    geom_line(data=cor3.df,aes(color="Lag 3"))+
  geom_line(data=cor4.df,aes(color="Lag 4"))+
  geom_line(data=cor5.df,aes(color="Lag 5"))+
  geom_line(data=cor6.df,aes(color="Lag 6"))+
  geom_line(data=cor7.df,aes(color="Lag 7"))+
  geom_line(data=cor8.df,aes(color="Lag 8"))+
  geom_line(data=cor9.df,aes(color="Lag 9"))+
  geom_line(data=cor10.df,aes(color="Lag 10"))+
  geom_line(data=cor11.df,aes(color="Lag 11"))+
  geom_line(data=cor12.df,aes(color="Lag 12"))+
  geom_line(data=cor13.df,aes(color="Lag 13"))+
  geom_line(data=cor14.df,aes(color="Lag 14"))+
  geom_line(data=cor15.df,aes(color="Lag 15"))+
  geom_line(data=cor16.df,aes(color="Lag 16"))+
  geom_line(data=cor17.df,aes(color="Lag 17"))+
  geom_line(data=cor18.df,aes(color="Lag 18"))+
  geom_line(data=cor19.df,aes(color="Lag 19"))+
  geom_line(data=cor20.df,aes(color="Lag 20"))+
  geom_line(data=cor21.df,aes(color="Lag 21"))+
  geom_line(data=cor22.df,aes(color="Lag 22"))+
  geom_line(data=cor23.df,aes(color="Lag 23"))+
  geom_line(data=cor24.df,aes(color="Lag 24"))+
  geom_line(data=cor25.df,aes(color="Lag 25"))+
  geom_line(data=cor26.df,aes(color="Lag 26"))+
  geom_line(data=cor27.df,aes(color="Lag 27"))+
  geom_line(data=cor28.df,aes(color="Lag 28"))+
  geom_line(data=cor29.df,aes(color="Lag 29"))+
  geom_line(data=cor30.df,aes(color="Lag 30"))+
labs(color="Legend text")


# Plot with shade
ggplot(cor2.df,aes(Date,V1))+geom_line(aes(colour="V1"))+
  geom_area(aes(y = V1, fill="V1"))+
  ggtitle("Rolling Auto Correlation, 2 lag, rolling window = 3 years", subtitle = "S&P500 Log Returns") +
  labs(x="Date",y="Correlation")+
  theme(plot.title = element_text(hjust=0.5),plot.subtitle =element_text(hjust=0.5))+
  theme(legend.position="none")+
  scale_y_continuous(breaks = seq(-.5, .50, by = 0.05))

# Write 
write.csv(cor2.df,"C:/R Projects/lag2.autocorr.csv")

Plot the 2 lag autocorrelation:

Rplot172

A few things are obvious… but first… lets plot a simple strategy to test for buying high and selling higher (momentum), buying low and selling higher (mean reversion). We want to simply test what happens if we executed the above.

The rules:
1. Take a rolling z-score of S&P500 close prices
2. Momentum buy and exit: Buy when rolling z-score is over 0 and sell when its below 0.
3. Mean reversion buy and exit: Buy when rolling z-score is below 0 and sell when its above 0.

The code that achieves this:

# Mean Reversion @ Momentum S&P500 
# Andrew Bannerman 1.17.2018

######################################
# Required Packages 
#####################################
require(xts)
require(data.table)
require(ggplot2)
require(lubridate)
require(magrittr)
require(scales)
require(reshape2)
require(PerformanceAnalytics)
require(dplyr)
require(TTR)
require(gridExtra)

## Load Data / Format handling
SP500 <- read.csv("C:/Stock Market Analysis/Norgate Market Data/MASTER_DATA_DUMP/$SPX.csv", header=TRUE,stringsAsFactors = FALSE)
SP500$Date <- ymd(SP500$Date)

# Use TTR package to create rolling SMA n day moving average 
# Create function and loop in order to repeat the desired number of SMAs for example 2:30
getSMA <- function(numdays) {
  function(SP500) {
    SMA(SP500[,"Close"], numdays)    # Calls TTR package to create SMA
  }
}
# Create a matrix to put the SMAs in
sma.matrix <- matrix(nrow=nrow(SP500), ncol=0)

# Loop for filling it
for (i in 2:60) {
  sma.matrix <- cbind(sma.matrix, getSMA(i)(SP500))
}

# Rename columns
colnames(sma.matrix) <- sapply(2:60, function(n)paste("close.sma.n", n, sep=""))

# Bind to existing dataframe
SP500 <-  cbind(SP500, sma.matrix)

# Use TTR package to create rolling Standard Deviation
# Create function and loop in order to repeat the desired number of Stdev for example 2:30
getSD <- function(numdays) {
  function(SP500) {
    runSD(SP500$Close, numdays, cumulative = FALSE)    # Calls TTR package to create SMA
  }
}
# Create a matrix to put the SMAs in
sd.matrix <- matrix(nrow=nrow(SP500), ncol=0)

# Loop for filling it
for (i in 2:60) {
  sd.matrix <- cbind(sd.matrix, getSD(i)(SP500))
}

# Rename columns
colnames(sd.matrix) <- sapply(2:60, function(n)paste("close.sd.n", n, sep=""))

# Bind to existing dataframe
SP500 <-  cbind(SP500, sd.matrix)


# Use base R to work out the rolling z-score (Close - roll mean) / stdev
SP500$close.zscore.n2 <- apply(SP500[,c('Close','close.sma.n2', 'close.sd.n2')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n3 <- apply(SP500[,c('Close','close.sma.n3', 'close.sd.n3')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n4 <- apply(SP500[,c('Close','close.sma.n4', 'close.sd.n4')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n5 <- apply(SP500[,c('Close','close.sma.n5', 'close.sd.n5')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n6 <- apply(SP500[,c('Close','close.sma.n6', 'close.sd.n6')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n7 <- apply(SP500[,c('Close','close.sma.n7', 'close.sd.n7')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n8 <- apply(SP500[,c('Close','close.sma.n8', 'close.sd.n8')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n9 <- apply(SP500[,c('Close','close.sma.n9', 'close.sd.n9')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n10 <- apply(SP500[,c('Close','close.sma.n10', 'close.sd.n10')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n11 <- apply(SP500[,c('Close','close.sma.n11', 'close.sd.n11')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n12 <- apply(SP500[,c('Close','close.sma.n12', 'close.sd.n12')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n13 <- apply(SP500[,c('Close','close.sma.n13', 'close.sd.n13')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n14 <- apply(SP500[,c('Close','close.sma.n14', 'close.sd.n14')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n15 <- apply(SP500[,c('Close','close.sma.n15', 'close.sd.n15')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n16 <- apply(SP500[,c('Close','close.sma.n16', 'close.sd.n16')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n17 <- apply(SP500[,c('Close','close.sma.n17', 'close.sd.n17')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n18 <- apply(SP500[,c('Close','close.sma.n18', 'close.sd.n18')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n19 <- apply(SP500[,c('Close','close.sma.n19', 'close.sd.n19')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n20 <- apply(SP500[,c('Close','close.sma.n20', 'close.sd.n20')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n21 <- apply(SP500[,c('Close','close.sma.n21', 'close.sd.n21')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n22 <- apply(SP500[,c('Close','close.sma.n22', 'close.sd.n22')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n23 <- apply(SP500[,c('Close','close.sma.n23', 'close.sd.n23')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n24 <- apply(SP500[,c('Close','close.sma.n24', 'close.sd.n24')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n25 <- apply(SP500[,c('Close','close.sma.n25', 'close.sd.n25')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n26 <- apply(SP500[,c('Close','close.sma.n26', 'close.sd.n26')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n27 <- apply(SP500[,c('Close','close.sma.n27', 'close.sd.n27')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n28 <- apply(SP500[,c('Close','close.sma.n28', 'close.sd.n28')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n29 <- apply(SP500[,c('Close','close.sma.n29', 'close.sd.n29')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n30 <- apply(SP500[,c('Close','close.sma.n30', 'close.sd.n30')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n31 <- apply(SP500[,c('Close','close.sma.n31', 'close.sd.n31')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n32 <- apply(SP500[,c('Close','close.sma.n32', 'close.sd.n32')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n33 <- apply(SP500[,c('Close','close.sma.n33', 'close.sd.n33')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n34 <- apply(SP500[,c('Close','close.sma.n34', 'close.sd.n34')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n35 <- apply(SP500[,c('Close','close.sma.n35', 'close.sd.n35')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n36 <- apply(SP500[,c('Close','close.sma.n36', 'close.sd.n36')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n37 <- apply(SP500[,c('Close','close.sma.n37', 'close.sd.n37')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n38 <- apply(SP500[,c('Close','close.sma.n38', 'close.sd.n38')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n39 <- apply(SP500[,c('Close','close.sma.n39', 'close.sd.n39')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n40 <- apply(SP500[,c('Close','close.sma.n40', 'close.sd.n40')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n41 <- apply(SP500[,c('Close','close.sma.n41', 'close.sd.n41')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n42 <- apply(SP500[,c('Close','close.sma.n42', 'close.sd.n42')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n43 <- apply(SP500[,c('Close','close.sma.n43', 'close.sd.n43')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n44 <- apply(SP500[,c('Close','close.sma.n44', 'close.sd.n44')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n45 <- apply(SP500[,c('Close','close.sma.n45', 'close.sd.n45')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n46 <- apply(SP500[,c('Close','close.sma.n46', 'close.sd.n46')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n47 <- apply(SP500[,c('Close','close.sma.n47', 'close.sd.n47')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n48 <- apply(SP500[,c('Close','close.sma.n48', 'close.sd.n48')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n49 <- apply(SP500[,c('Close','close.sma.n49', 'close.sd.n49')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n50 <- apply(SP500[,c('Close','close.sma.n50', 'close.sd.n50')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n51 <- apply(SP500[,c('Close','close.sma.n51', 'close.sd.n51')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n52 <- apply(SP500[,c('Close','close.sma.n52', 'close.sd.n52')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n53 <- apply(SP500[,c('Close','close.sma.n53', 'close.sd.n53')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n54 <- apply(SP500[,c('Close','close.sma.n54', 'close.sd.n54')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n55 <- apply(SP500[,c('Close','close.sma.n55', 'close.sd.n55')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n56 <- apply(SP500[,c('Close','close.sma.n56', 'close.sd.n56')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n57 <- apply(SP500[,c('Close','close.sma.n57', 'close.sd.n57')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n58 <- apply(SP500[,c('Close','close.sma.n58', 'close.sd.n58')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n59 <- apply(SP500[,c('Close','close.sma.n59', 'close.sd.n59')], 1, function(x) { (x[1]-x[2])/x[3] } )
SP500$close.zscore.n60 <- apply(SP500[,c('Close','close.sma.n60', 'close.sd.n60')], 1, function(x) { (x[1]-x[2])/x[3] } )

# Convert all NA to 0
SP500[is.na(SP500)] <- 0

# Calculate quartiles, where close is relation to range (Close - High) / (High - Low)
#SP500$quartile <- apply(SP500[,c('Close', 'Low', 'High')], 1, function(x) { (x[1]-x[2])/(x[3]-x[2])} )

# Calculate Returns from open to close 
SP500$ocret <- apply(SP500[,c('Open', 'Close')], 1, function(x) { (x[2]-x[1])/x[1]} )

# Calculate Close-to-Close returns
SP500$clret <- ROC(SP500$Close, type = c("discrete"))
SP500$clret[1] <- 0

#####################################################################
# Split Data To Train and Test Set
#####################################################################
#SP500 <- subset(SP500, Date >= as.Date("1993-01-01") )             #Input for subsetting by date versus splitting, works with train set only
train.index <- 1:(nrow(SP500)*.570) # Add *.666 for 2/3rd split
train.set <- SP500[train.index, ]
test.set <- SP500[-train.index, ]

#####################################################################
# Assign train and test set indicactors 
#####################################################################
# Name indicators #
train.indicator <- train.set$close.zscore.n11
test.indicator <- test.set$close.zscore.n11

######################################################################
# Develop Training Set Paramaters
# ####################################################################
# Enter buy / sell rules
train.set$enter.mean.rev <- ifelse(train.indicator < 0, 1, 0)
train.set$exit.mean.rev <- ifelse(train.indicator > 0, 1, 0)
train.set$enter.momo <- ifelse(train.indicator > 0, 1, 0)
train.set$exit.momo <- ifelse(train.indicator < 0, 1, 0)

# Mean Rev
train.set <- train.set %>%
  dplyr::mutate(sig.mean.rev = ifelse(enter.mean.rev == 1, 1,
                                      ifelse(exit.mean.rev == 1, 0, 0)))

# Momentum
train.set <- train.set %>%
  dplyr::mutate(sig.momo = ifelse(enter.momo == 1, 1,
                                      ifelse(exit.momo == 1, 0, 0)))


# lag signal by one forward day to signal entry next day 
train.set$sig.mean.rev <- lag(train.set$sig.mean.rev,1) # Note k=1 implies a move *forward*
train.set$sig.momo <- lag(train.set$sig.momo,1) # Note k=1 implies a move *forward*
train.set[is.na(train.set)] <- 0  # Set NA to 0

# Calculate equity curves
# Mean rev
train.set <- train.set %>%
  dplyr::mutate(RunID = rleid(sig.mean.rev)) %>%
  group_by(RunID) %>%
  dplyr::mutate(mean.rev.equity = ifelse(sig.mean.rev == 0, 0,
                                         ifelse(row_number() == 1, ocret, clret))) %>%
  ungroup() %>%
  select(-RunID)

# Momentum
train.set <- train.set %>%
  dplyr::mutate(RunID = rleid(sig.momo)) %>%
  group_by(RunID) %>%
  dplyr::mutate(momo.equity = ifelse(sig.momo == 0, 0,
                                         ifelse(row_number() == 1, ocret, clret))) %>%
  ungroup() %>%
  select(-RunID)

# Pull select columns from data frame to make XTS whilst retaining formats 
xts1 = xts(train.set$mean.rev.equity, order.by=as.POSIXct(train.set$Date, format="%Y-%m-%d")) 
xts2 = xts(train.set$momo.equity, order.by=as.POSIXct(train.set$Date, format="%Y-%m-%d")) 
xts3 = xts(train.set$clret, order.by=as.POSIXct(train.set$Date, format="%Y-%m-%d")) 

# Join XTS together 
train.compare <- cbind(xts1,xts2,xts3)

# Use the PerformanceAnalytics package for trade statistics
colnames(train.compare) <- c("Mean Reversion","Momentum","Buy And Hold")
charts.PerformanceSummary(train.compare,main="Cumulative Returns", wealth.index=TRUE, colorset=rainbow12equal)
#png(filename="20090606_rsi2_performance_updated.png", 720, 720)
#performance.table <- rbind(table.AnnualizedReturns(compare), maxDrawdown(compare), CalmarRatio(compare),table.DownsideRisk(compare))
#drawdown.table <- rbind(table.Drawdowns(compare))
#dev.off()

# Train Set Log Returns 
train.logRets <- log(cumprod(1+train.compare))
chart.TimeSeries(train.logRets, legend.loc='topleft', colorset=rainbow12equal)

print(performance.table)
print(drawdown.table)

######################################################
# Develop Test Set Paramaters (Unseen Data)
######################################################

# Enter buy / sell rules
test.set$enter.mean.rev <- ifelse(test.indicator < 0, 1, 0)
test.set$exit.mean.rev <- ifelse(test.indicator > 0, 1, 0)
test.set$enter.momo <- ifelse(test.indicator > 0, 1, 0)
test.set$exit.momo <- ifelse(test.indicator < 0, 1, 0)

# Mean Rev
test.set <- test.set %>%
  dplyr::mutate(sig.mean.rev = ifelse(enter.mean.rev == 1, 1,
                                      ifelse(exit.mean.rev == 1, 0, 0)))

# Momentum
test.set <- test.set %>%
  dplyr::mutate(sig.momo = ifelse(enter.momo == 1, 1,
                                  ifelse(exit.momo == 1, 0, 0)))


# lag signal by one forward day to signal entry next day 
test.set$sig.mean.rev <- lag(test.set$sig.mean.rev,1) # Note k=1 implies a move *forward*
test.set$sig.momo <- lag(test.set$sig.momo,1) # Note k=1 implies a move *forward*
test.set[is.na(test.set)] <- 0  # Set NA to 0

# Calculate equity curves
# Mean rev
test.set <- test.set %>%
  dplyr::mutate(RunID = rleid(sig.mean.rev)) %>%
  group_by(RunID) %>%
  dplyr::mutate(mean.rev.equity = ifelse(sig.mean.rev == 0, 0,
                                         ifelse(row_number() == 1, ocret, clret))) %>%
  ungroup() %>%
  select(-RunID)

# Momentum
test.set <- test.set %>%
  dplyr::mutate(RunID = rleid(sig.momo)) %>%
  group_by(RunID) %>%
  dplyr::mutate(momo.equity = ifelse(sig.momo == 0, 0,
                                     ifelse(row_number() == 1, ocret, clret))) %>%
  ungroup() %>%
  select(-RunID)

# Pull select columns from data frame to make XTS whilst retaining formats 
xts1 = xts(test.set$mean.rev.equity, order.by=as.POSIXct(test.set$Date, format="%Y-%m-%d")) 
xts2 = xts(test.set$momo.equity, order.by=as.POSIXct(test.set$Date, format="%Y-%m-%d")) 
xts3 = xts(test.set$clret, order.by=as.POSIXct(test.set$Date, format="%Y-%m-%d")) 

# Join XTS together 
test.compare <- cbind(xts1,xts2,xts3)

# Use the PerformanceAnalytics package for trade statistics
colnames(test.compare) <- c("Mean Reversion","Momentum","Buy And Hold")
charts.PerformanceSummary(test.compare,main="Cumulative Returns", wealth.index=TRUE, colorset=rainbow12equal)
#png(filename="20090606_rsi2_performance_updated.png", 720, 720)
#performance.table <- rbind(table.AnnualizedReturns(compare), maxDrawdown(compare), CalmarRatio(compare),table.DownsideRisk(compare))
#drawdown.table <- rbind(table.Drawdowns(compare))
#dev.off()

# Test Set Log Returns 
test.logRets <- log(cumprod(1+test.compare))
chart.TimeSeries(test.logRets, legend.loc='topleft', colorset=rainbow12equal)

print(performance.table)
print(drawdown.table)
head(train.logRets)

tail(train.logRets)

# Prepare data for plotting 
train.reps <- rep(1,nrow(train.logRets))
test.reps <- rep(2,nrow(test.logRets))
id <- c(train.reps,test.reps)
final <- rbind(test.compare,train.compare)
final <-log(cumprod(1+final))
final.df <- data.frame(final,id=id)
final.df <- setDT(final.df, keep.rownames = TRUE)[] # Set row names
colnames(final.df)[1] <- "Date"
final.df$Date <- ymd_hms(final.df$Date)

# negative 2 lag auto correlation dates 
start_one <- as.POSIXct("1931-9-25")
end_one <- as.POSIXct("1932-7-21")
start_two <- as.POSIXct("1936-1-13")
end_two <- as.POSIXct("1940-5-13")
start_three <- as.POSIXct("1940-12-26")
end_three <- as.POSIXct("1941-1-11")
start_four <- as.POSIXct("1941-4-4")
end_four <- as.POSIXct("1941-7-29")
start_five <- as.POSIXct("1965-9-23")
end_five <- as.POSIXct("1965-10-22")
start_six <- as.POSIXct("1990-10-12")
end_six <- as.POSIXct("1990-10-19")
start_seven <- as.POSIXct("1994-12-23")
end_seven <- as.POSIXct("1995-1-16")
start_eight <- as.POSIXct("1998-9-8")
end_eight <- as.POSIXct("2017-10-20")

# Plot train and test set equity curves with ggplot2 
p1<- ggplot(data = final.df, aes(x =Date, y = Mean.Reversion,colour=id))+
  geom_line(data = final.df, aes(x =Date, y = Mean.Reversion,colour=id))+
  geom_line(data = final.df, aes(x =Date, y = Momentum,colour=id))+
  geom_line(data = final.df, aes(x =Date, y = Buy.And.Hold,colour=id))+
  geom_line(data = final.df, aes(x =Date, y = Mean.Reversion,colour=id))+
  geom_line(data = final.df, aes(x =Date, y = Momentum,colour=id))+
  geom_line(data = final.df, aes(x =Date, y = Buy.And.Hold,colour=id))+
  theme_bw()+
  geom_vline(xintercept=as.numeric(as.POSIXct("1977-08-18 19:00:00"),linetype="dashed"))+
  theme(legend.position = "none")+
  scale_y_continuous(breaks = seq(-7, 7, by = .5))+
  ggtitle("Mean Reversion, Momentum, Buy And Hold - S&P500 Index",subtitle="1928 to Present") +
  labs(x="Date",y="Cumulative Log Return")+
  theme(plot.title = element_text(hjust=0.5),plot.subtitle =element_text(hjust=0.5))+
  #geom_rect(aes(xmin=start_one,xmax=end_one,ymin=-Inf,ymax=Inf),alpha=0.01,fill="#CC6666")+
  #geom_rect(aes(xmin=start_two,xmax=end_two,ymin=-Inf,ymax=Inf),alpha=0.01,fill="#CC6666")+
  #geom_rect(aes(xmin=start_three,xmax=end_three,ymin=-Inf,ymax=Inf),alpha=0.01,fill="#CC6666")+
  #geom_rect(aes(xmin=start_four,xmax=end_four,ymin=-Inf,ymax=Inf),alpha=0.01,fill="#CC6666")+
  #geom_rect(aes(xmin=start_five,xmax=end_five,ymin=-Inf,ymax=Inf),alpha=0.01,fill="#CC6666")+
  #geom_rect(aes(xmin=start_six,xmax=end_six,ymin=-Inf,ymax=Inf),alpha=0.01,fill="#CC6666")+
  #geom_rect(aes(xmin=start_seven,xmax=end_seven,ymin=-Inf,ymax=Inf),alpha=0.01,fill="#CC6666")+
  #geom_rect(aes(xmin=start_eight,xmax=end_eight,ymin=-Inf,ymax=Inf),alpha=0.01,fill="#CC6666")+
  annotate("text", label = "Mean Reversion", x = as.POSIXct("2010-01-01"), y = -2.0, color = "#003A90")+
  annotate("text", label = "Momentum", x = as.POSIXct("2010-01-01"), y = 6.5, color = "#003A90")+
  annotate("text", label = "Buy and Hold", x = as.POSIXct("2010-01-01"), y = 3.5, color = "#003A90")+
annotate("text", label = "Out of Sample", x = as.POSIXct("2000-01-01"), y = .5, color = "#56B1F7")+
  annotate("text", label = "In Sample", x = as.POSIXct("1940-01-01"), y = 4.0, color = "#132B43")
  
  

p2 <- ggplot(cor2.df,aes(Date,V1))+geom_line(aes(colour="V1"))+
  geom_area(aes(y = V1, fill="V1"))+
  ggtitle("Rolling Auto Correlation, 2 lag, rolling window = 3 years", subtitle = "S&P500 Log Returns") +
  labs(x="Date",y="Correlation")+
  theme(plot.title = element_text(hjust=0.5),plot.subtitle =element_text(hjust=0.5))+
  theme(legend.position="none")+
  scale_y_continuous(breaks = seq(-.5, .50, by = 0.05))
gridExtra::grid.arrange(p1, p2, ncol = 1)

Now lets plot the strategies against the lag 2 autocorrelation:

Rplot173

The success of a momentum strategy relies (in this short term momentum rule) on a positive 2 lag autocorrelation. We see that the positive 2 lag autocorrelation peaked in 5/23/1973 and declined into negative territory around 1998 where it remains negative at present day. The success of a mean reversion strategy is largely dependent on a negative 2 lag autocorrelation, notably beginning post dot com bubble. This marked a new regime shift from momentum to mean reversion (interday basis).

To conclude, the 2 lag autocorrelation may be a useful tool for detecting which type of model to fit to the underlying series.

Author: Andrew Bannerman

Integrity Inspector. Quantitative Analysis is a favorite past time.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

w

Connecting to %s