XIV | VXX – Testing Mean Reversion, Momentum, Random Walk – Rolling Hurst Exponent

In addition to the previous posts studying volatility strategies here and here, we aim to study the nature of the XIV and VXX series. We subject each series to the Hurst exponent, following the same procedure as Ernie Chan (2013): take the lagged price differences and regress the log variance of the lagged differences against the log time lags (an example of this here).

For this post we will download VIX futures data from the CBOE website and join synthetic XIV and VXX data to make the largest data set we possibly can, per this previous post. After we have our XIV and VXX data we then compute the Hurst exponent of each series. The procedure is as follows:

1. Compute lagged differences over varying time lags. For example, lag 2 = today's close price minus the close price 2 days ago; lag 3 = today's close price minus the close price 3 days ago.
2. Next, compute the variance of each set of lagged differences. Ernie Chan recommends at least 100 days of data for this, so we compute the variance over a period of 100 days for each lagged difference.
3. Perform a linear regression of y ~ x, log(variance_lagged_differences) ~ log(time_lags), and divide the slope by 2 to obtain the Hurst exponent (a minimal sketch of these steps follows this list).
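
Below is a minimal, non-rolling sketch of steps 1 to 3 on a single numeric price vector; the function name and the plain prices argument are ours for illustration, not part of the rolling code that follows:

# Minimal non-rolling Hurst sketch: steps 1-3 on a numeric price vector
hurst_estimate <- function(prices, lags = 2:30) {
  # Steps 1-2: variance of the k-day price differences for each lag k
  lagged_var <- sapply(lags, function(k) var(diff(prices, lag = k), na.rm = TRUE))
  # Step 3: slope of log(variance) on log(lag), divided by 2
  fit <- lm(log(lagged_var) ~ log(lags))
  unname(coef(fit)[2] / 2)
}

A value near 0.5 indicates a random walk, above 0.5 momentum, and below 0.5 mean reversion.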

We apply the procedure above on a rolling basis, with a look back of 100 trading days for the variance. We also use the R package RcppEigen and its fastLm function to perform a rolling linear regression. The code that achieves this:

###############################
# Hurst Exponent (varying lags)
###############################

require(magrittr)
require(zoo)
require(TTR)      # runVar
require(ggplot2)
require(scales)   # date_breaks / date_format

## Set lags
lags <- 2:(252*6)

# Function for finding lagged differences: today's close minus the close 'n' days ago
getLAG.DIFF <- function(lagdays) {
  function(term.structure.df) {
    c(rep(NA, lagdays), diff(term.structure.df$vxx_close, lag = lagdays, differences = 1))
  }
}
# Create a matrix to put the lagged differences in
lag.diff.matrix <- matrix(nrow=nrow(term.structure.df), ncol=0)

# Loop for filling it
for (i in lags) {
  lag.diff.matrix <- cbind(lag.diff.matrix, getLAG.DIFF(i)(term.structure.df))
}

# Rename columns
colnames(lag.diff.matrix) <- sapply(lags, function(n)paste("lagged.diff.n", n, sep=""))

# Bind to existing dataframe
term.structure.df <-  cbind(term.structure.df, lag.diff.matrix)
head(term.structure.df,25)

############################################################
# Calculate rolling variances of 'n period' differences
# Set variance look back to 100 days
############################################################
# Convert NA to 0
term.structure.df[is.na(term.structure.df)] <- 0

get.VAR <- function(varlag) {
  function(term.structure.df) {
    runVar(term.structure.df[, paste0("lagged.diff.n", lags[varlag])], y = NULL, n = 100, sample = TRUE, cumulative = FALSE)
  }
}
# Create a matrix to put the variances in
lag.var.matrix <- matrix(nrow=nrow(term.structure.df), ncol=0)

# Loop for filling it
for (i in 1:length(lags)) {
  lag.var.matrix <- cbind(lag.var.matrix, get.VAR(i)(term.structure.df))
}

# Rename columns
colnames(lag.var.matrix) <- sapply(lags, function(n)paste("roll.var.diff.", n, sep=""))

# Bind to existing dataframe
term.structure.df <-  cbind(term.structure.df, lag.var.matrix)

########################################
# Subset to remove all leading NA
########################################
#NonNAindex <- which(!is.na(term.structure.df))
#set_lag_threshold <- 50 # Set Variance
#na <- which(!is.na(term.structure.df[,paste0("roll.var.diff.", set_lag_threshold)]))
#firstNonNA <- min(na)
#term.structure.df<-term.structure.df[firstNonNA:nrow(term.structure.df),]

########################################
# Rolling linear regression to compute hurst exponent
########################################
variance <- list()
lag.vec <- 2:30  # Select short term lags
# Collect the rolling variances for lags 2 through 30, row by row
for (i in 1:nrow(term.structure.df)) {
  variance[[i]] <- sapply(paste0("roll.var.diff.", lag.vec), function(cn) term.structure.df[[cn]][i])
}

# Initialize lists, pre-allocate memory
results <- vector("list", length(variance))
hurst <- vector("list", length(variance))
library(RcppEigen)
for (i in 1:length(variance)) {
  ptm0 <- proc.time()
  # Regress log(variance) on log(lag); slope / 2 = Hurst exponent
  results[[i]] <- fastLm(log(variance[[i]]) ~ log(lag.vec))
  hurst[[i]] <- coef(results[[i]])[2] / 2
  ptm1 <- proc.time() - ptm0
  time <- as.numeric(ptm1[3])
  cat('\n', 'Iteration', i, 'took', time, "seconds to complete")
}

# Join results to data frame
hurst <- do.call(rbind, hurst)
hurst.df <- as.data.frame(hurst)
hurst.df <- data.frame(hurst.df,Date=term.structure.df$Date)
colnames(hurst.df)[1] <- "Hurst"
hurst.df <- subset(hurst.df, Date >= as.Date("2008-04-28")) # subset data to remove leading NA (VXX only)
# Plot Data
ggplot() +
  geom_line(data=hurst.df ,aes(x=Date,y=Hurst), colour="black") +
  theme_classic()+
  scale_y_continuous(breaks = round(seq(min(hurst.df$Hurst), max(hurst.df$Hurst), by = 0.2),2))+
  scale_x_date(breaks = date_breaks("years"), labels = date_format("%Y"))+
  ggtitle("VXX Hurst Exponent - Daily Bars - Lags 2:30", subtitle = "Regression log(variances) ~ log(time_lags) - Hurst = Coef/2") +
  labs(x="Date",y="Hurst")+
  theme(plot.title = element_text(hjust=0.5),plot.subtitle =element_text(hjust=0.5))+
  #geom_hline(yintercept = 0.5, color = "red", size=0.5,linetype="dashed")+
  geom_rect(aes(xmin=as.Date(head(hurst.df$Date,1)),xmax=as.Date(Inf),ymin=0.5,ymax=Inf),alpha=0.1,fill="green")+
  geom_rect(aes(xmin=as.Date(head(hurst.df$Date,1)),xmax=as.Date(Inf),ymin=-Inf,ymax=0.5),alpha=0.1,fill="orange")+
  geom_rect(aes(xmin=as.Date(head(hurst.df$Date,1)),xmax=as.Date(Inf),ymin=0.48,ymax=0.52),alpha=.7,fill="red")

The output:

Rplot140
Random Walk Band = Red, between 0.48 and 0.52. Momentum = Green, > 0.52. Mean Reversion = Orange, < 0.48.
Rplot142
Random Walk Band = Red, between 0.48 and 0.52. Momentum = Green, > 0.52. Mean Reversion = Orange, < 0.48.

This is for lagged differences of 2 to 30 days with a rolling variance of 100 days. We chose the smaller lagged differences because the strategies going forward will likely hold no longer than 30 days, so it makes sense to see the nature of the series over this time period.

We can compute how often XIV and VXX are in momentum, mean reversion and random walk regimes. The code for this:

# Count how often in each regime
momo <- sum(hurst.df$Hurst > 0.52, na.rm=TRUE)
mean.rev <- sum(hurst.df$Hurst < 0.48 , na.rm=TRUE)
random <- sum(hurst.df$Hurst >= 0.48 & hurst.df$Hurst <=.52, na.rm=TRUE)
exact.random <- sum(hurst.df$Hurst >= 0.50 & hurst.df$Hurst <.51, na.rm=TRUE)
total.rows <- NROW(hurst.df)

# Percentage of time in momentum, mean reversion, random walk
momo.perc <- momo / total.rows
mean.rev.perc <- mean.rev / total.rows
random.perc <- random / total.rows
exact.random.perc <- exact.random / total.rows

VXX:

vxx.percs.df <- data.frame("Momentum, Over 0.52" = momo.perc, "Mean Reversion, Less than 0.48" = mean.rev.perc, "Random Walk Band, 0.48 to 0.52" = random.perc, "Exact Random Walk, 0.50" = exact.random.perc)
> vxx.percs.df
  Momentum..Over.0.52 Mean.Reversion..Less.than.0.48 Random.Walk.Band..0.48.to.0.52 Exact.Random.Walk..0.50
1           0.7471264                      0.1395731                      0.1133005              0.02791461

XIV:

xiv.percs.df <- data.frame("Momentum, Over 0.52" = momo.perc, "Mean Reversion, Less than 0.48" = mean.rev.perc, "Random Walk Band, 0.48 to 0.52" = random.perc, "Exact Random Walk, 0.50" = exact.random.perc)
> xiv.percs.df
  Momentum..Over.0.52 Mean.Reversion..Less.than.0.48 Random.Walk.Band..0.48.to.0.52 Exact.Random.Walk..0.50
1           0.7081281                      0.2085386                     0.08333333                0.022578

What we see is that VXX is in a momentum phase 75% of the time and XIV 71% of the time. Momentum is the dominating theme, with mean reversion at 14% (VXX) and 21% (XIV), and random walk at 11% (VXX) and 8% (XIV).

If fitting a model to the series itself, without using the volatility risk premium / roll yield as entry signals, one may try fitting models based on the theme of momentum.

In subsequent posts we will be applying models based on extracting profits when the market is in contango/backwardation, as well as applying models to the series itself. We will also look at the autocorrelation of the series, which will serve as a primer for testing a strategy for robustness.

Thanks for reading!


Creating Constant 30 Day Maturity From VIX1 | VIX2 Futures Contracts

In a previous post, Creating synthetic VXX and XIV data and plotting the VIX term structure using R, we created the VIX term structure from CBOE data.

In this post we will create a 30 day constant maturity contract for the VIX1 | VIX2 futures contracts. This code is an extension and total rework of Ilya Kipnis' term structure code.

The methodology for this procedure is to weight VIX1 and VIX2 based on the number of days remaining to expiration. The math for this is taken from this prospectus, PS-32.

CRW(1,t) = dr / dt
CRW(2,t) = (dt - dr) / dt

where:
dt = The total number of business days in the current Roll Period beginning with, and including, the starting CBOE VIX Futures Settlement Date and ending with, but excluding, the following CBOE VIX Futures Settlement Date. The number of business days will not change for purposes of this calculation in cases of a new holiday introduced intra-month or an unscheduled market closure.
dr = The total number of business days within a roll period beginning with, and including, the following business day and ending with, but excluding, the following CBOE VIX Futures Settlement Date. The number of business days includes a new holiday introduced intra-month up to the business day preceding such a holiday.

At the close on the Tuesday corresponding to the start of the Roll Period, all of the weight is allocated to the first month contract. On each subsequent business day a fraction of the first month VIX futures contracts are sold and an equal notional amount of the second month VIX futures contracts is bought. The fraction, or quantity, is proportional to the number of first month VIX futures contracts as of the previous index roll day, and inversely proportional to the length of the current Roll Period. In this way the initial position in the first month contract is progressively moved to the second month over the course of the month, until the next following Roll Period starts when the old second month VIX futures contract becomes the new first month VIX futures contract. In addition to the transactions described above, the weight of each index component is also adjusted every day to ensure that the change in total dollar exposure for the index is only due to the price change of each contract and not due to using a different weight for a contract trading at a higher price.

In short, on the Tuesday before VIX Wednesday expiration, 100% of the weight will be in the VIX1 contract, the former VIX2 contract. As a short example, let's say there are 25 days in the roll period (dt). With 10 days remaining (dr = 10), the weight in the front month (CRW1,t) would be 100*dr/dt = 40%. On the Monday before Wednesday VIX expiration, dt = 25, dr = 1, so CRW1,t = 100*dr/dt = 4%. As the prospectus describes, "At the close on the Tuesday corresponding to the start of the Roll Period, all of the weight is allocated to the first month contract." Thus on the Tuesday before Wednesday expiration, if dt = 20 then dr = 20 as well, and the weight at the start of a new roll period would be CRW1,t = 100*dr/dt = 100%, as per the prospectus. Remember that on Tuesday VIX2 becomes VIX1, because this is when the roll period for the next expiration date begins.

One note here: in this example we apply no additional dollar weight adjustments.
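
As a quick check of the arithmetic above, a short sketch of the weighting; dt and dr are the business day counts defined in the prospectus excerpt, and the function is ours for illustration:

# Roll weights from the worked example above
crw <- function(dr, dt) c(front = dr / dt, second = (dt - dr) / dt)
crw(10, 25)  # front month weight 0.40 (40%), as in the example
crw(20, 20)  # start of a new roll period: 100% front month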

With that out of the way, here is the R code to perform the above procedure.

# VIX1|VIX2 30 Day Constant Maturity
# Andrew Bannerman 12.10.2017 

require(xts)
require(data.table)
require(ggplot2)
require(lubridate)
require(magrittr)
require(scales)
require(reshape2)
require(dplyr)
require(PerformanceAnalytics)
require(quantmod)

# 06 through 18
years <- c(paste0("0", c(6:9)), as.character(c(10:18)))

# futures months
futMonths <- c("F", "G", "H", "J", "K", "M",
               "N", "Q", "U", "V", "X", "Z")

# expiries come from http://www.macroption.com/vix-expiration-calendar/
expiries <- read.table("D:/R Projects/Final Scripts/VIX_term_structure/expiries.txt", header=FALSE)

# convert expiries into dates in R
dateString <- paste(expiries$V3, expiries$V2, expiries$V1, sep = "-")
dates <- as.Date(dateString, format = "%Y-%B-%d")

# map futures months to numbers for dates
monthMaps <- cbind(futMonths, c("01", "02", "03", "04", "05", "06",
                                "07", "08", "09", "10", "11", "12"))
monthMaps <- data.frame(monthMaps)
colnames(monthMaps) <- c("futureStem", "monthNum")

settlement.dates <- data.frame(Date=dates, Expiration.Date=2)
dates <- data.frame(dates)
dates$dateMon <- substr(dates$dates, 1, 7) # Extract year month only

contracts <- expand.grid(futMonths, years)
contracts <- paste0(contracts[,1], contracts[,2])
#contracts <- c(contracts, "F18")
stem <- "https://cfe.cboe.com/Publish/ScheduledTask/MktData/datahouse/CFE_"
#contracts <- paste0(stem, contracts, "_VX.csv")

masterlist <- list()
timesToExpiry <- list()
for(i in 1:length(contracts)) {

  # obtain data
  contract <- contracts[i]
  dataFile <- paste0(stem, contract, "_VX.csv")
  expiryYear <- paste0("20",substr(contract, 2, 3)) # Paste element 2 and 3 from the contract xYY
  expiryMonth <- monthMaps$monthNum[monthMaps$futureStem == substr(contract,1,1)]
  expiryDate <- dates$dates[dates$dateMon == paste(expiryYear, expiryMonth, sep="-")]
  data <- suppressWarnings(fread(dataFile))
  # create dates
  dataDates <- as.Date(data$`Trade Date`, format = '%m/%d/%Y')

  # create time to expiration xts
  toExpiry <- xts(expiryDate - dataDates, order.by=dataDates)
  colnames(toExpiry) <- contract
  timesToExpiry[[i]] <- toExpiry

  # get settlements
  settlement <- xts(data$Settle, order.by=dataDates)
  colnames(settlement) <- contract
  masterlist[[i]] <- settlement
}
# cbind outputs
masterlist <- do.call(cbind, masterlist)
timesToExpiry <- do.call(cbind, timesToExpiry)

# NA out zeroes in settlements
masterlist[masterlist==0] <- NA

sumNonNA <- function(row) {
  return(sum(!is.na(row)))
}

simultaneousContracts <- xts(apply(masterlist, 1, sumNonNA), order.by=index(masterlist))
chart.TimeSeries(simultaneousContracts)

dim(masterlist)
termStructure <- list()
expiryStructure <- list()
masterDates <- unique(c(first(index(masterlist)), dates$dates[dates$dates %in% index(masterlist)], Sys.Date()-1)) # %in% operator matches dates, sys.date-1 to include final range date
for(i in 1:(length(masterDates)-1)) {
  subsetDates <- masterDates[c(i, i+1)]
  dateRange <- paste(subsetDates[1], subsetDates[2], sep="::")
  subset <- masterlist[dateRange,c(i:(i+7))]
  subset <- subset[-1,]
  expirySubset <- timesToExpiry[index(subset), c(i:(i+7))]
  colnames(subset) <- colnames(expirySubset) <- paste0("C", c(1:8))
  termStructure[[i]] <- subset
  expiryStructure[[i]] <- expirySubset
}

termStructure <- do.call(rbind, termStructure)
expiryStructure <- do.call(rbind, expiryStructure)

simultaneousContracts <- xts(apply(termStructure, 1, sumNonNA), order.by=index(termStructure))
chart.TimeSeries(simultaneousContracts)

plot(t(coredata(last(termStructure))), type = 'b')
# Plot specific date term structure
backwardation <- termStructure["2017-11-30"] # Subset specific date
back.df <- as.data.frame(backwardation)
back.df <- setDT(back.df, keep.rownames = TRUE)[] # Set row names
colnames(back.df)[1] <- "Date"
back.df$Date <- ymd(back.df$Date)
back.df <- melt(data = back.df,id.vars = 'Date')   # melt df for plotting with ggplot2
colnames(back.df)[2] <- "Contract"

# plot
ggplot(data=back.df,aes(x=Contract,y=value,group = 1))+
  geom_point()+  geom_line()+
  theme_classic()+
  ggtitle("VIX Term Structure for Date 2017-11-30",subtitle="Example of Contango")+
  labs(x="Contract",y="Settlement")+
  theme(plot.title = element_text(hjust=0.5),plot.subtitle =element_text(hjust=0.5))

# Save Term Structure to data frame
term.structure.df <- as.data.frame(termStructure)
term.structure.df <- setDT(term.structure.df, keep.rownames = TRUE)[] # Set row names
colnames(term.structure.df)[1] <- "Date"
term.structure.df$Date <- ymd(term.structure.df$Date)

# Adjust for 10:1 split pre 03-26-2007
# Split information here: http://cfe.cboe.com/publish/CFEinfocirc/CFEIC07-003%20.pdf
df.10.split <- subset(term.structure.df, Date < as.Date("2007-03-26")) # subset data prior to split
library(magrittr)
df.10.split[,2:9] %<>% lapply(function(x) x / 10) # apply split to all columns excluding Date
df.post.split <- subset(term.structure.df, Date >= as.Date("2007-03-26"))  # subset post split
term.structure.df <- rbind(df.10.split, df.post.split) # rbind pre and post split data frames
tail(term.structure.df,50)
length(term.structure.df)

# Find last dates
last <- tail(term.structure.df$Date,1)
last.settle <- tail(settlement.dates$Date,1)
date.fill <- as.data.frame(seq(as.Date(last), as.Date(last.settle), "days"))
colnames(date.fill)[1] = "Date"
date.excl.wknds <- date.fill[which(weekdays(as.Date(date.fill$Date, format = "%Y/%m/%d"))
                              %in% c('Monday','Tuesday', 'Wednesday', 'Thursday', 'Friday')), ]

# Dummy data in order to fill out to future expiration dates
dates.df <- data.frame(date.excl.wknds,NA,NA,NA,NA,NA,NA,NA,NA)
colnames(dates.df)[1] <- "Date"
colnames(dates.df)[2] <- "C1"
colnames(dates.df)[3] <- "C2"
colnames(dates.df)[4] <- "C3"
colnames(dates.df)[5] <- "C4"
colnames(dates.df)[6] <- "C5"
colnames(dates.df)[7] <- "C6"
colnames(dates.df)[8] <- "C7"
colnames(dates.df)[9] <- "C8"
head(dates.df)
term.structure.df <- rbind(term.structure.df,dates.df)

# Calculate constant 30 day VIX1|VIX2 maturity
# Check weights http://www.ipathetn.com/US/16/en/details.app?instrumentId=259118
# USe method: http://app.velocitysharesetns.com/files/prospectus/PRICING_SUPPLEMENT_No__VLS_ETN-1_A36_long_form_2.pdf
# Join settlement dates
term.structure.df <- full_join(term.structure.df, settlement.dates, by = c("Date" = "Date"))
term.structure.df$Expiration.Date[is.na(term.structure.df$Expiration.Date)] <- 1  # Turn all NA to 1
# Calculate days in roll period
#term.structure.df <- subset(term.structure.df, Date >= as.POSIXct("2014-09-16") & Date <= as.POSIXct("2014-11-21")) # Subset between matches
term.structure.df$Day.in.roll.period <- NA  # Initialize column (assigning NULL would drop it)
for(i in 1:nrow(term.structure.df)) {
  term.structure.df$Day.in.roll.period[i] <- ifelse(term.structure.df$Expiration.Date[i + 1]==2,1,term.structure.df$Day.in.roll.period[i - 1]+1)
}

# Calculate dt = Total days in roll period
term.structure.df$dt <- NA  # Initialize column
for(i in nrow(term.structure.df):1) {
  term.structure.df$dt[i] <- ifelse(term.structure.df$Expiration.Date[i+2]==2,term.structure.df$Day.in.roll.period[i],NA)
}
library(zoo)
dt.total.days.in.roll.period  <- na.locf(term.structure.df$dt, fromLast = TRUE)
length(dt.total.days.in.roll.period)
diff <- nrow(term.structure.df) - length(dt.total.days.in.roll.period)  # Find difference in lengths
dt.total.days.in.roll.period<- c(dt.total.days.in.roll.period ,rep(NA,diff)) # Pad out with NA to match
term.structure.df <- data.frame(term.structure.df,dt.total.days.in.roll.period)

# Calculate dr = Days remaining in roll period
term.structure.df$dr <- NA  # Initialize column
for(i in nrow(term.structure.df):1) {
  term.structure.df$dr[i] <- ifelse(term.structure.df$Expiration.Date[i+2]==2,1,term.structure.df$dr[i + 1]+1)
}

# Calculate CRW(1,t) Front month weight at close of day
term.structure.df$CRW <- term.structure.df$dr / term.structure.df$dt.total.days.in.roll.period

# Adjust front and 2nd month contracts on the Tuesday prior to expiration
# Shift VIX2 on the Tuesday before Wednesday expiration
# Pass 1
term.structure.df$C.1.roll.prices.pass.1 <- NA  # Initialize column
for(i in nrow(term.structure.df):1) {
  term.structure.df$C.1.roll.prices.pass.1[i] <- ifelse(term.structure.df$Expiration.Date[i+1]==2,term.structure.df$C2[i],term.structure.df$C1[i])
}
# Pass 2
term.structure.df$C.1.roll.prices.pass.2 <- NA  # Initialize column
for(i in nrow(term.structure.df):1) {
  term.structure.df$C.1.roll.prices.pass.2[i] <- ifelse(term.structure.df$Expiration.Date[i]==2,term.structure.df$C2[i],term.structure.df$C.1.roll.prices.pass.1[i])
}
colnames(term.structure.df)[17] = "C1.Rolled"

# Shift VIX3 on the Tuesday before Wednesday expiration
# Pass 1
term.structure.df$C.2.roll.prices.pass.1 <- NA  # Initialize column
for(i in nrow(term.structure.df):1) {
  term.structure.df$C.2.roll.prices.pass.1[i] <- ifelse(term.structure.df$Expiration.Date[i+1]==2,term.structure.df$C3[i],term.structure.df$C2[i])
}
# Pass 2
term.structure.df$C.2.roll.prices.pass.2 <- NA  # Initialize column
for(i in nrow(term.structure.df):1) {
  term.structure.df$C.2.roll.prices.pass.2[i] <- ifelse(term.structure.df$Expiration.Date[i]==2,term.structure.df$C3[i],term.structure.df$C.2.roll.prices.pass.1[i])
}
colnames(term.structure.df)[19] = "C2.Rolled"

# 30 Day Constant Maturity futures price VIX1 | VIX2
# CRW(1,t) * VIX1 + (1 - CRW(1,t)) * VIX2
term.structure.df$vix1.2.constant <- (term.structure.df$CRW * term.structure.df$C1.Rolled) + ((1 - term.structure.df$CRW) * term.structure.df$C2.Rolled)
# Download spot VIX from CBOE
VIX <- fread("http://www.cboe.com/publish/scheduledtask/mktdata/datahouse/vixcurrent.csv", skip = 1)
VIX <- as.data.frame(VIX)
VIX$Date <- mdy(VIX$Date)
colnames(VIX)[5] <- "Spot_VIX"
head(VIX)
#VIX <- xts(VIX$vix_close, order.by=as.Date(VIX$Date, format = '%Y/%m/%d'))
#VIX1VIX2 = xts(term.structure.df$vix1.2.constant, order.by=as.Date(term.structure.df$Date, format="%Y-%m-%d"))
tail(term.structure.df$vix1.2.constant)
# Merge VIX1|VIX2 by common date
vix12.df <- data.frame(Date=term.structure.df$Date,VIX1VIX2=term.structure.df$vix1.2.constant)
plot.df <- full_join(vix12.df, VIX, by = c("Date" = "Date"))

# Plot Spot Vix vs VIX1|VIX2 30 day constant maturity
ggplot() +
  geom_line(data=plot.df ,aes(x=Date,y=VIX1VIX2), colour="red3") +
  geom_line(data=plot.df,aes(x=Date,y=Spot_VIX), colour="#0072B2")+
  theme_classic()+
  scale_x_date(breaks = date_breaks("years"), labels = date_format("%Y"))+
  ggtitle("VIX Spot Vs VIX1|VIX2 30 Day Constant Maturity", subtitle = "") +
  labs(x="Date",y="Settlement")+
  theme(plot.title = element_text(hjust=0.5),plot.subtitle =element_text(hjust=0.5))+
  annotate("text", label = "Spot VIX", x = as.Date("2005-04-26"), y = 30, color = "#0072B2")+
  annotate("text", label = "VIX1|VIX2", x = as.Date("2011-10-26"), y = 60, color = "red3")

And this is the net result, plotting spot VIX against the VIX1|VIX2 constant 30 day maturity:

Rplot136_constant

All code can be found on my github.

Thanks for reading, any questions comments or concerns please let me know!

Creating synthetic VXX and XIV data and plotting the VIX term structure using R

This is the first of a collection of volatility postings. Ilya Kipnis did a good job of writing an R script to download VIX settlement data directly from the CBOE website. Furthermore, we can use this data to re-create synthetic VXX and XIV series, and kudos to Quant Bear for extending Ilya's code; the post on how to create synthetic VXX data can be found here.

To follow this post you will need to use Ilya’s script and have the term structure data downloaded within R.

Here we plot the VIX term structure from 2005 to present:

Rplot90

Note the large change in price in 2007. This is due to a 10:1 split, more information here. We adjust all prices prior to the split date 2007-03-26 by dividing by 10. Next we re-plot the data. The code to do this:

# Prepare Data for plotting varying contract expirations
# Adjust for 10:1 split pre 03-26-2007
# Split information here: http://cfe.cboe.com/publish/CFEinfocirc/CFEIC07-003%20.pdf
# Change termStructure column names
colnames(termStructure)[1] = "F"
colnames(termStructure)[2] = "G"
colnames(termStructure)[3] = "H"
colnames(termStructure)[4] = "J"
colnames(termStructure)[5] = "K"
colnames(termStructure)[6] = "M"
colnames(termStructure)[7] = "N"
colnames(termStructure)[8] = "Q"

# Save xts series to date frame in order to melt
df <- data.frame(termStructure)
df <- setDT(df, keep.rownames = TRUE)[] # Set row names
colnames(df)[1] = "Date"
df.10.split <- subset(df, Date < as.Date("2007-03-26")) # subset data prior to split
library(magrittr)
df.10.split[,2:9] %<>% lapply(function(x) x / 10) # apply split to all columns excluding Date
df.post.split <- subset(df, Date >= as.Date("2007-03-26")) # subset post split
df <- rbind(df.10.split,df.post.split) # rbind pre and post split data frames
df <- melt(data = df,id.vars = 'Date') # melt df for plotting with ggplot2
library(lubridate)
df$Date <- ymd(df$Date) # Convert date column to Date format
colnames(df)[2] = "Contract" # Rename

# Plot term structure
ggplot(data=df,aes(x=Date,y=value,colour=Contract))+
geom_line()+
scale_x_date(breaks = date_breaks("years"), labels = date_format("%Y"))+
ggtitle("VIX Term Structure",subtitle="Adjusted for 10:1 split pre 2007-03-26")+
labs(x="Year",y="Settlement")+
theme(plot.title = element_text(hjust=0.5),plot.subtitle =element_text(hjust=0.5))

Rplot93

If we want to hone in on any one time period we can subset between dates and replot by:

# Subet by date (for plotting any window for further examination)
df <- subset(df, Date >= as.Date("2007-06-01") & Date <= as.Date("2010-01-01"))

# Plot term structure
ggplot(data=df,aes(x=Date,y=value,colour=Contract))+
geom_line()+
scale_x_date(breaks = date_breaks("years"), labels = date_format("%Y"))+
ggtitle("VIX Term Structure",subtitle="Adjusted for 10:1 split pre 2007-03-26")+
labs(x="Year",y="Settlement")+
theme(plot.title = element_text(hjust=0.5),plot.subtitle =element_text(hjust=0.5))

Rplot94

This is a clear view of a market in backwardation during the financial crisis. More information on backwardation here. And a plot for a backwardation period in 2008:

Rplot108

An example of a market in contango:

Rplot110

Let's plot XIV and VXX data so we can get a feel for what happens to these products during contango and times of backwardation. We will use the synthetic data created from Quant Bear's R code, an extension of Ilya's code. Also, please follow Quant Bear's post in order to correctly download the risk free rate data to .csv format. The code to create synthetic VXX and XIV data based on the CBOE data:

#This script allows you to calculate proxy values for the VXX ETF before its inception on the 29th of January 2009
#Prerequisites:

#obtain historical CBOE VIX Futures data following the instructions outlined in this post by Ilya Kipnis (QuantStratTradeR)
#https://quantstrattrader.wordpress.com/2017/04/27/creating-a-vix-futures-term-structure-in-r-from-official-cboe-settlement-data/

#have the variables expiries and masterlist defined exactly as described in the post above

#have obtained TBill Rates from https://www.treasurydirect.gov/instit/annceresult/annceresult_query.htm and manipulated as described in my post

#Loading required packages
require(xts)
require(bizdays)
require(timeDate)
load_rmetrics_calendars(2004:2018)

# expiries come from http://www.macroption.com/vix-expiration-calendar/
expiries <- read.table("D:/R Projects/Final Scripts/VIX_term_structure/expiries_1.txt", header=FALSE)

#Transforming the expiries
expiries = as.Date(apply(expiries,1,function(x){paste(x,collapse=" ")}),format="%d %b %Y")

#preparing the tbillrates
tbillrates <- read.csv("D:/R Projects/Final Scripts/VIX_term_structure/tbills.2.csv", header=TRUE, stringsAsFactors = FALSE)
tbillrates[is.na(tbillrates)] <- 0
tbillrates$Date <- mdy(tbillrates$Date)
str(tbillrates)
tbillrates = xts(tbillrates[,"Rate"],as.Date(tbillrates[,"Date"]))

#defining function to calculate the contract roll weights
getCRW <- function(today){
  today = as.Date(today)
  periodstart = expiries[max(which(expiries<=today))]
  periodend = expiries[min(which(expiries>today))]
  dt = bizdays(periodstart,periodend,"Rmetrics/NYSE")
  dr = bizdays(today,periodend,"Rmetrics/NYSE")-1
  return(c(dr/dt,(dt-dr)/dt))
}

#defining function to calculate TBR
getTBR <- function(today,lastday){
  today = as.Date(today)
  lastday = as.Date(lastday)
  delta = as.numeric(today-lastday)
  rate = tbillrates[max(which(index(tbillrates)<today))]
  tbr = (1/(1-91/360*rate))^(delta/91)-1
  return(tbr)
}

#calculating the index values
days = index(masterlist["2005-12-20/2009-01-29"])
indx = 100000
for(i in 2:length(days)){
  crw = getCRW(days[i-1])
  tbr = getTBR(days[i],days[i-1])
  fut1 = masterlist[days[i-1],which(!is.na(masterlist[days[i-1]]))[1:2]]
  fut2 = masterlist[days[i],which(!is.na(masterlist[days[i]]))[1:2]]
  if(!names(fut1)[1]%in%names(fut2)){
    fut1 = masterlist[days[i-1],which(!is.na(masterlist[days[i-1]]))[2:3]]
  }
  twdi = sum(crw*as.numeric(fut1))
  twdo = sum(crw*as.numeric(fut2))
  cdr = twdo/twdi-1
  #cdr = -cdr
  val = indx[length(indx)]*(1+cdr+tbr)
  indx = c(indx,val)
}
indx = xts(indx,days)

#adjusting for 10:1 split
indx["2007-03-26/"] = 10*indx["2007-03-26/"]
indxret = (indx/lag(indx)-1)[-1]

#calculating VXX values
vxxvals = 107089.921875
for(i in nrow(indxret):1){
  tmp = vxxvals[length(vxxvals)]/(1+indxret[i,])
  vxxvals = c(vxxvals,tmp)
}
vxxvals = rev(vxxvals)
vxxvals = xts(vxxvals,index(indx))
plot(vxxvals,type="l")
head(vxxvals)

We plot synthetic VXX data:

Rplot95

Next, using the Alpha Vantage API (see my post here for details on downloading Alpha Vantage data with R), we will join the synthetic VXX data to Alpha Vantage VXX data. The code to do this; note you will need to insert your API key:

# Join synthetic VXX data to Alpha Vantage VXX data
# Plot with ggplot2
# Download data
require(lubridate)
require(data.table)
require(dplyr)
# Note you need to place your API key ... your_key_here
VXX <- fread("https://www.alphavantage.co/query?function=TIME_SERIES_DAILY_ADJUSTED&symbol=VXX&outputsize=full&apikey=your_key_here&datatype=csv") #fread() data.table for downloading directly to a data frame
VXX$timestamp <- ymd(VXX$timestamp) #Lubridate to change character date to date format
VXX <- arrange(VXX,timestamp) #dplyr to sort data frame by date ascending order
colnames(VXX)[1] <- "Date"
VXX <- as.data.frame(VXX)
head(VXX)

# store synthetic VXX vxxvals price data to data.frame
df <- data.frame(vxxvals)
df <- setDT(df, keep.rownames = TRUE)[] # Set row names
colnames(df)[1] = "Date"
colnames(df)[2] = "close"
df$Date <- ymd(df$Date) # Convert to date format
df <- as.data.frame(df)

# Retain only Date and close columns for alphavantage VXX data
VXX <- VXX[ -c(2:5, 7:9) ]
colnames(VXX)[2] <- "close"

# ggplot2 to plot synthetic VXX and alphavantage VXX
require(ggplot2)
require(scales)
ggplot() +
geom_line(data=df,aes(x=Date,y=close), colour="red") +
geom_line(data=VXX,aes(x=Date,y=close), colour="black")+
theme_classic()+
scale_x_date(breaks = date_breaks("years"), labels = date_format("%Y"))+
ggtitle("VXX", subtitle = "Synthetic prior to 2009-01-29") +
labs(x="Date",y="VXX")+
theme(plot.title = element_text(hjust=0.5),plot.subtitle =element_text(hjust=0.5))+
annotate("text", label = "Synthetic VXX", x = as.Date("2007-04-26"), y = 100000, color = "red")+
annotate("text", label = "Alphavantage VXX", x = as.Date("2015-04-26"), y = 100000, color = "black")

Rplot97

For XIV, per Quant Bear's post, we need to invert the sign of the synthetic VXX returns. We achieve this with the following code, which is the very same code as for VXX except that we flip the sign of the returns and tack the Alpha Vantage adjusted close prices onto the synthetic data:

#adjusting for 10:1 split
indx["2007-03-26/"] = 10*indx["2007-03-26/"]
indxret = (indx/lag(indx)-1)[-1]
indxret = -indxret # Flip the sign of the returns ####

#calculating XIV values
xivvals = 9.557 # Note: place the first XIV price here (we use the first Alpha Vantage adjusted close)
for(i in nrow(indxret):1){
  tmp = xivvals[length(xivvals)]/(1+indxret[i,])
  xivvals = c(xivvals,tmp)
}
xivvals = rev(xivvals)
xivvals = xts(xivvals,index(indx))
plot(xivvals,type="l")

# Join synthetic XIV data to Alpha Vantage XIV data
# Plot with ggplot2
# Download data
require(lubridate)
require(data.table)
require(dplyr)
# Note you need to place your API key ... your_key_here
XIV <- fread("https://www.alphavantage.co/query?function=TIME_SERIES_DAILY_ADJUSTED&symbol=XIV&outputsize=full&apikey=your_key_here&datatype=csv") #fread() data.table for downloading directly to a data frame
XIV$timestamp <- ymd(XIV$timestamp) #Lubridate to change character date to date format
XIV <- arrange(XIV,timestamp) #dplyr to sort data frame by date ascending order
colnames(XIV)[1] <- "Date"
XIV <- as.data.frame(XIV)

# store synthetic XIV XIVvals price data to data.frame
df <- data.frame(xivvals)
df <- setDT(df, keep.rownames = TRUE)[] # Set row names
colnames(df)[1] = "Date"
colnames(df)[2] = "close"
df$Date <- ymd(df$Date) # Convert to date format
df <- as.data.frame(df)

# Retain only Date and close columns for alphavantage XIV data
XIV <- XIV[ -c(2:5, 7:9) ]
colnames(XIV)[2] <- "close"

# ggplot2 to plot synthetic XIV and alphavantage XIV
require(ggplot2)
require(scales)
ggplot() +
geom_line(data=df,aes(x=Date,y=close), colour="red") +
geom_line(data=XIV,aes(x=Date,y=close), colour="black")+
theme_classic()+
scale_x_date(breaks = date_breaks("years"), labels = date_format("%Y"))+
ggtitle("XIV", subtitle = "Synthetic prior to 2010-11-30") +
labs(x="Date",y="XIV")+
theme(plot.title = element_text(hjust=0.5),plot.subtitle =element_text(hjust=0.5))+
annotate("text", label = "Synthetic XIV", x = as.Date("2007-04-26"), y = 85, color = "red")+
annotate("text", label = "Alphavantage XIV", x = as.Date("2015-04-26"), y = 85, color = "black")

Rplot98

Let's plot times when the term structure was in backwardation. This code prepares the backwardation dates:

# Change termStructure column names
colnames(termStructure)[1] = "F"
colnames(termStructure)[2] = "G"
colnames(termStructure)[3] = "H"
colnames(termStructure)[4] = "J"
colnames(termStructure)[5] = "K"
colnames(termStructure)[6] = "M"
colnames(termStructure)[7] = "N"
colnames(termStructure)[8] = "Q"

# Prepare Data For Backwardation Plot
# Obtain backwardation dates
backwardation <- data.frame(termStructure)
backwardation <- setDT(backwardation, keep.rownames = TRUE)[] # Set row names
colnames(backwardation)[1] <- "Date"
backwardation.G <- ifelse(termStructure$F > termStructure$G,1,0)
backwardation.H <- ifelse(termStructure$G > termStructure$H,1,0)
backwardation.J <- ifelse(termStructure$H > termStructure$J,1,0)
backwardation.K <- ifelse(termStructure$J > termStructure$K,1,0)
backwardation.M <- ifelse(termStructure$K > termStructure$M,1,0)
backwardation.N <- ifelse(termStructure$M > termStructure$N,1,0)
backwardation.vec <- backwardation.G + backwardation.H + backwardation.J + backwardation.K + backwardation.M + backwardation.N
dates <- c(backwardation$Date)
backwardation.df <- data.frame(dates,backwardation.vec)
colnames(backwardation.df)[1] <- "Date"
colnames(backwardation.df)[2] <- "backwardation"
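
The shading step itself is not shown above; one hedged way to mark the backwardation dates with ggplot2 (pink bands wherever the backwardation count is positive, with the count drawn as a line) would be:

library(ggplot2)
backwardation.df$Date <- as.Date(backwardation.df$Date)
back.days <- subset(backwardation.df, backwardation > 0)  # days in backwardation
ggplot() +
  geom_rect(data = back.days,
            aes(xmin = Date - 1, xmax = Date + 1, ymin = -Inf, ymax = Inf),
            fill = "pink", alpha = 0.4) +
  geom_line(data = backwardation.df, aes(x = Date, y = backwardation)) +
  theme_classic() +
  labs(x = "Date", y = "Backwardation count")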

Rplot105.png

Times in pink are when the market was in backwardation. Next we can see how VXX and XIV acted during times of backwardation. We plot the non-adjusted VXX for clarity.

Rplot103

Rplot106

Times of backwardation essentially cause XIV to go down in price and VXX to go up in price.

As we have organized the backwardation dates, let's view how often the VIX term structure is in backwardation / contango. We can pull the data using the code below:

# Count how often VIX in backwardation / contango
total.backwardation.days <- sum(backwardation.df$backwardation > 0, na.rm=TRUE)
total.days <- nrow(backwardation.df)
backwardation.perc <- total.backwardation.days/total.days *100
contango = 100-backwardation.perc
print(contango)
print(backwardation.perc)

The VIX term structure is in backwardation 29.08% of the time and in contango 70.92% of the time.

Let's see what the beta is for XIV and VXX. Code courtesy of Ilya Kipnis (for the blog post see here), with the original post by Dr. Jonathan Kinlay on beta convexity. At the center of the idea is that an asset exhibits different betas relative to its benchmark when the benchmark is up versus down in price.

# Beta Convexity
# Download SPY data
require(data.table)            # fread
require(lubridate)
require(dplyr)
require(xts)
require(PerformanceAnalytics)  # Return.calculate, CAPM.beta
# Note you need to place your API key ... your_key_here
SPY <- fread("https://www.alphavantage.co/query?function=TIME_SERIES_DAILY_ADJUSTED&symbol=SPY&outputsize=full&apikey=your_key_here&datatype=csv") #fread() data.table for downloading directly to a data frame
SPY$timestamp <- ymd(SPY$timestamp)   #Lubridate to change character date to date format
SPY <- arrange(SPY,timestamp)   #dplyr to sort data frame by date ascending order
colnames(SPY)[1] <- "Date"
SPY$Date <- ymd(SPY$Date)
SPY <- as.data.frame(SPY)
SPY <- subset(SPY, Date >= as.Date("2009-01-30")) # VXX start
SPY <- subset(SPY, Date >= as.Date("2000-01-03")) # XIV start

VXX <- fread("https://www.alphavantage.co/query?function=TIME_SERIES_DAILY_ADJUSTED&symbol=VXX&outputsize=full&apikey=your_key_here&datatype=csv") #fread() data.table for downloading directly to a data frame
VXX$timestamp <- ymd(VXX$timestamp)   #Lubridate to change character date to date format
VXX <- arrange(VXX,timestamp)   #dplyr to sort data frame by date ascending order
colnames(VXX)[1] <- "Date"
VXX$Date <- ymd(VXX$Date)
VXX <- as.data.frame(VXX)

XIV <- fread("https://www.alphavantage.co/query?function=TIME_SERIES_DAILY_ADJUSTED&symbol=XIV&outputsize=full&apikey=your_key_here&datatype=csv") #fread() data.table for downloading directly to a data frame
XIV$timestamp <- ymd(XIV$timestamp)   #Lubridate to change character date to date format
XIV <- arrange(XIV,timestamp)   #dplyr to sort data frame by date ascending order
colnames(XIV)[1] <- "Date"
XIV$Date <- ymd(XIV$Date)
XIV <- as.data.frame(XIV)
XIV <- subset(XIV, Date >= as.Date("2000-01-03"))

# Calculate returns 
spy.xts = xts(SPY$adjusted_close, order.by=as.Date(SPY$Date, format="%Y/%m/%d"))
vxx.xts = xts(VXX$adjusted_close, order.by=as.Date(VXX$Date, format="%Y/%m/%d"))
xiv.xts = xts(XIV$adjusted_close, order.by=as.Date(XIV$Date, format="%Y/%m/%d"))
spy.ret <- Return.calculate(spy.xts)
vxx.ret <- Return.calculate(vxx.xts)
xiv.ret <- Return.calculate(xiv.xts)

rets <- na.omit(cbind(xiv.ret, spy.ret))
colnames(rets) <- c("XIV", "SPY")

betaConvexity <- function(Ra, Rb) {
  positiveBench <- Rb[Rb > 0]
  assetPositiveBench <- Ra[index(positiveBench)]
  positiveBeta <- CAPM.beta(Ra = assetPositiveBench, Rb = positiveBench)
  
  negativeBench <- Rb[Rb < 0]
  assetNegativeBench <- Ra[index(negativeBench)]
  negativeBeta <- CAPM.beta(Ra = assetNegativeBench, Rb = negativeBench)
  
  out <- (positiveBeta - negativeBeta) ^ 2
  return(out)
}

betaConvexity(rets$XIV, rets$SPY)

> betaConvexity(rets$XIV, rets$SPY)
[1] 2.066219

> betaConvexity(rets$VXX, rets$SPY)
[1] 1.931566

For the entire XIV and VXX sample the beta convexity is approximately 2 for each product. Note that this statistic is the squared difference between the up-benchmark and down-benchmark betas, not a beta itself, so a value near 2 indicates the two betas differ materially between up and down days in SPY.

Traditionally we might run a regression fit between the daily returns of SPY and XIV or VXX. This, however, doesn't capture the different beta of an asset when the benchmark is up versus down. We can see how the beta has changed over time with a rolling linear regression. The code to do this:

# Run expanding window linear regression for beta
# SPY as the independent variable
rolling_lms <- lapply(seq(20, nrow(rets)), function(x) lm(XIV ~ SPY, data = rets[1:x, ]))
length(rolling_lms)
nrow(rets)
all_slopes <-unlist(sapply(1:length(rolling_lms),function(j) rolling_lms[[j]]$coefficients[2]))
all_slopes<- unlist(all_slopes)
plot(all_slopes,type="l")
# Join regression output to original data frame 
row.diff <- rep(NA, nrow(rets) - length(rolling_lms)) # make leading NAs to position data correctly with original
beta <- c(row.diff,all_slopes)
rets <- data.frame(rets,beta)
rets <- setDT(rets, keep.rownames = TRUE)[] # Set row names
colnames(rets)[1] <- "Date"
rets$Date <- ymd(rets$Date)

# Plot beta 
ggplot() +
  geom_line(data=rets,aes(x=Date,y=beta), colour="red") +
  theme_classic()+
  scale_x_date(breaks = date_breaks("years"), labels = date_format("%Y"))+
  ggtitle("Rolling Linear Regression - Plot XIV Beta", subtitle = "XIV Inception") +
  labs(x="Date",y="VXX")+
  theme(plot.title = element_text(hjust=0.5),plot.subtitle =element_text(hjust=0.5))

Rplot122

And for a visual of the regression fit over the full XIV sample period, the code to plot this:

# Plot Linear Regression
ggplot(rets, aes(x=SPY, y=XIV))+ 
  geom_point(shape=1)+
  geom_smooth(method=lm)+
  ggtitle("Linear Regression y~x XIV~SPY", subtitle = "Start date = XIV inception") +
  labs(x="SPY",y="XIV")+
  theme(plot.title = element_text(hjust=0.5),plot.subtitle =element_text(hjust=0.5))

Rplot121

We repeat the process for VXX.

Rplot118

Rplot119

Both the XIV and VXX betas have increased over time. Essentially a 1x move in SPY leads to an approximately -3x move in VXX and a 3.66x move in XIV.

This will be a primer for upcoming posts where we will experiment with different volatility strategies.

All code can be found on my github.

Thanks again Ilya and QUANTBEAR for sharing your code on your blogs.

Testing Cryptocurrency Coin Price Series for Momentum, Mean Reversion, Random Walk

In this post we will use a modified version of the relative strength index to test for the presence of persistent momentum, mean reversion and random walk properties in coin price series.

The Relative Strength Indicator (RSI) was developed by J. Welles Wilder and published in a 1978 book, New Concepts in Technical Trading Systems, and in Commodities magazine (now Futures magazine) in the June 1978 issue (wikipedia).

The RSI is calculated from the daily price changes:
U = today's close − yesterday's close (upward change)
D = yesterday's close − today's close (downward change)

On an up day D = 0; on a down day U = 0.

An (n) period exponential moving average is calculated over the U and D values.

The ratio of these averages is the relative strength or relative strength factor:
RS = EMA(U,n) / EMA(D,n)
with n denoting the look back period.

The final RS ratio is converted into an index which oscillates between 0 and 100:

RSI = 100 − 100 / (1 + RS)
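
A small sketch of the formulas above, using TTR's EMA for the smoothing (Wilder's original smoothing corresponds to EMA(..., wilder = TRUE)); close is any numeric price vector:

library(TTR)
rsi_sketch <- function(close, n = 14) {
  delta <- diff(close)
  U <- pmax(delta, 0)           # upward changes
  D <- pmax(-delta, 0)          # downward changes
  rs <- EMA(U, n) / EMA(D, n)   # RS = EMA(U,n) / EMA(D,n)
  c(NA, 100 - 100 / (1 + rs))   # RSI = 100 - 100/(1+RS); NA pads the diff
}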

In its classic form Wilder recommended a smoothing period of 14. A 14 day period has been tested on the S&P 500 (Adrian Ţăran-Moroşan, 2011). The author of the paper selects RSI extreme signal levels of 30 and 70: sell the market short over 70 and buy the market below 30. The test yielded poor results; however, there may be room for improvement in the chosen look back period. There is no consideration of the S&P 500's half life of mean reversion. For example, if the S&P 500 typically mean reverts over a period of less than 14 days, the classic 14 period look back would mask most signals. An optimal look back period equal to the half life of mean reversion may have yielded better results.

In this post we will examine a short term RSI with an arbitrarily chosen look back period of 2 days (RSI2). We select RSI extreme signal levels of 20 and 70 respectively. The momentum strategy will buy the market when RSI2 is over 70 and exit the trade when RSI2 drops below 85. Conversely, the mean reversion strategy will buy the market when RSI2 is below 20 and sell when RSI2 is over 70; one reading of this rule is sketched below.
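
A minimal sketch of the mean reversion rule, assuming TTR's RSI and a numeric close vector (1 = long, 0 = flat, with the position held between triggers):

library(TTR)
library(zoo)
mean_rev_signal <- function(close) {
  rsi2 <- RSI(close, n = 2)
  sig <- rep(NA_real_, length(rsi2))
  sig[which(rsi2 < 20)] <- 1    # buy trigger
  sig[which(rsi2 > 70)] <- 0    # sell trigger
  na.locf(sig, na.rm = FALSE)   # hold until the opposite trigger fires
}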

In essence, the two RSI models above test for momentum (buy high and sell higher) and for mean reversion (buy low and sell high). Series where neither the momentum nor the mean reversion model yields good results may be considered random walks, and random walk series are difficult to predict.

Persistent price anomalies may exist in price series. If this holds true then the models above may show medium / long term stability of predictable returns.

Next, we download 1568 daily coin price series to .csv format using the Crypto Compare API. Using R we achieve this with the following code:

# Crypto Compare Coin Data
# Download using R
# Andrew Bannerman 11.29.2017

library(jsonlite)
library(data.table)
library(ggplot2)

# Obtain coin list
response = fromJSON("https://www.cryptocompare.com/api/data/coinlist")
df = as.data.frame(data.table::rbindlist(response$Data, fill=TRUE))
# Write coin list to .csv
write.csv(df,file="D:/R Projects/Final Scripts/BitCoin Scripts/coin_list.csv")

# Load Tickets From Coin List
read.data <- read.csv("D:/R Projects/Final Scripts/BitCoin Scripts/coin_list.csv", header=TRUE, stringsAsFactors = FALSE)
tickers <- c(read.data$Symbol)
# Remove all * at end of file names
tickers <- gsub("[*]", "", tickers)

# Obtain API (information only)
#cc <- fromJSON("https://min-api.cryptocompare.com/")
#str(cc)

# Download Daily Data
# Loop for downloading all .csvs
# Save each coin close plot
# Add error catching to loop
# Add completion time of each iteration
for (i in 1:length(tickers)) {
  tryCatch({
  next.coin <- tickers[i] # next coin in ticker vector
coin_daily <- fromJSON(paste0("https://min-api.cryptocompare.com/data/histoday?fsym=",next.coin,"&tsym=USD&allData=true&e=CCCAGG"))
df <- data.frame("time" = coin_daily$Data$time, "close" = coin_daily$Data$close,"open" = coin_daily$Data$open,"high" = coin_daily$Data$high,"low" = coin_daily$Data$low,"volumefrom" = coin_daily$Data$volumefrom,"volumeto" = coin_daily$Data$volumeto)
# Save plot output to folder
# Set path, set png plot size
# ggplot 2 to create line plot
mytitle = paste0(next.coin)
graphics.off() # Close graphic device before next plot
p <- ggplot(df, aes(time)) +
  theme_classic()+
  geom_line(aes(y=close), colour="red") +
  ggtitle(mytitle, subtitle = "") +
  labs(x="Time",y="Close")+
  theme(plot.title = element_text(hjust=0.5),plot.subtitle =element_text(hjust=0.5))
ggsave(path="D:/R Projects/Final Scripts/BitCoin Scripts/daily_plot",paste(next.coin,".png"))
# Save each file to .csv
write.csv(df,paste0(file="D:/R Projects/Final Scripts/BitCoin Scripts/data/",next.coin,".csv"))
ptm0 <- proc.time()
Sys.sleep(0.1)
ptm1=proc.time() - ptm0
time=as.numeric(ptm1[3])
cat('\n','Iteration',i,'took', time, "seconds to complete")
  }, error = function(e) { print(paste("i =", i, "failed:")) })
}

This coin universe will be used to test for mean reversion, momentum and random walk using RSI2.

An R script was set up to run the RSI2 momentum and mean reversion models on each coin series. No effort was made to restrict the starting date of each series; rather, the entire price series was tested in each instance.

End of day closing prices were used for testing. To avoid look ahead bias, when an RSI2 buy signal was triggered the entry was placed at the next day's market open. A sell signal, however, was closed out on the same day it was generated. In real life the script would run 15 minutes before the close, and any conditions met would be closed out prior to the end of the day. In this back test, due to my absence of higher frequency coin data, this has not been included, so the results are for guidance rather than exact specifics. Despite the 15 minute discrepancy, it should still serve as a basis for revealing mean reversion, momentum and random walk series. A sketch of the accounting is below.
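
A sketch of that no look ahead accounting, reusing the sig vector from the signal sketch earlier; a signal computed on today's close only earns from the next bar:

ret <- c(NA, diff(close) / head(close, -1))   # simple daily close to close returns
sig_lag <- c(NA, head(sig, -1))               # yesterday's signal applied to today's return
strat_ret <- sig_lag * ret
cum_ret <- prod(1 + na.omit(strat_ret)) - 1   # cumulative strategy return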

No effort has been made to optimize model parameters over in and out of sample data, and there is no statistical testing for mean reversion or momentum and no half life of mean reversion calibration. These steps may aid in selecting optimal look back periods after the initial RSI2 mean reversion / momentum screen, and would also help avoid the effects of look ahead bias and over optimization.

First, the top 50 momentum price series results are provided in Table 1.0. To refresh: a momentum buy signal was generated when RSI2 closed over 70 and a sell signal when RSI2 crossed below 85. Note these parameters have been arbitrarily chosen. Data is sorted by top momentum cumulative returns.

momo_coin
Table 1.0 – Table sorted by highest momentum cumulative return

Next, the top 50 mean reversion price series results are provided in Table 1.1. To refresh: a mean reversion buy signal was generated when RSI2 closed below 20 and a sell signal when RSI2 crossed above 70. Data is sorted by top mean reversion cumulative returns.

mean_rev_coin
Table 1.1 – Table sorted by highest mean reversion cumulative return

The RSI2 momentum, mean reversion and random walk screening revealed a total of 202 momentum series, where momentum outperformed buy and hold; a total of 1086 mean reversion series, where mean reversion outperformed buy and hold; and a total of 340 random walk series, where buy and hold outperformed both momentum and mean reversion RSI2 models. Based on this test, it would appear that mean reversion is the dominating anomaly within the coin price series.

Rplot161

The series obtained from the above RSI2 screening should be selected for further in-sample optimization and out of sample testing. The below figure shows an example of how this process may be conducted; it is typically referred to as the hold out method:

optimize
Example of hold out method for conducting in-sample optimization and out-of-sample testing

For each in-sample period (orange), the model parameters are optimized and then selected for forward out of sample testing (green). At the end of the first out of sample test, the in-sample period is reset using the fixed in-sample look back period. The model parameters are once again optimized and used for subsequent out of sample testing. This process is repeated over the entire sample period, as sketched below.
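
Hypothetical index bookkeeping for the hold out method above, with illustrative window lengths (500 bar in-sample windows rolled forward in 250 bar out-of-sample blocks):

n <- 2000; is_len <- 500; oos_len <- 250
starts <- seq(1, n - is_len - oos_len + 1, by = oos_len)
windows <- lapply(starts, function(s) list(
  in_sample  = s:(s + is_len - 1),
  out_sample = (s + is_len):(s + is_len + oos_len - 1)
))
length(windows)  # number of optimize-then-test cycles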

To conclude, the RSI2 may be a simple method to test for momentum (buy high and sell higher) and mean reversion (buy low and sell high). These attributes may reveal an underlying anomaly which can be exploited using simple models. The RSI look back period of 2 was arbitrarily chosen; it serves the purpose of revealing momentum or mean reversion series.

Furthermore, RSI2 may not be the correct look back for the specific series under examination, because each series displays a different rate of diffusion from the mean. Series with momentum properties move away from the mean at a higher rate than mean reversion series or a random walk. Mean reversion series exhibit different variances around the mean and different speeds of reversion to the mean: the former relates directly to how much profit can be achieved from a mean reverting series, the latter to how long it takes the series to revert to its mean. A mean reversion speed of 9 months may not be suitable for a shorter term trader, not to mention it may not be the best deployment of capital.

For further research, the speed of mean reversion may be calculated using the Ornstein-Uhlenbeck formula, which is presented in a previous post. An optimal look back period for a mean reverting series would be equal to the half life of mean reversion (Ernest P. Chan, 2013). A sketch is below.
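
A sketch of the half life calculation via an AR(1) style regression of price changes on lagged price levels, in the spirit of Chan (2013); the function name is ours:

half_life <- function(prices) {
  dy <- diff(prices)
  y_lag <- head(prices, -1)
  lambda <- coef(lm(dy ~ y_lag))[2]
  unname(-log(2) / lambda)   # half life in bars; meaningful when lambda < 0
}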

In closing, it should also be noted that there is a large benefit to using statistical testing to confirm the presence of momentum, mean reversion and random walk. This is because statistical tests cover all data points in a sample, whereas a pre-selected model like the above RSI2 only considers days when the RSI2 trigger was active, eliminating many days from the sample (Ernest P. Chan, 2013).

A statistical test for confirming momentum, mean reversion and random walk is the Hurst exponent. This is presented in two previous posts:
Hurst exponent in R
Modelling the Hurst exponent in R

The augmented Dickey-Fuller test may be used to confirm the presence of mean reversion, where one may reject the null hypothesis of a unit root.
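
A minimal ADF check with the tseries package; a small p value rejects the unit root null in favour of stationarity:

library(tseries)
adf.test(na.omit(close), alternative = "stationary")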

Good trading!
Andrew

Free Coin Data: Batch download coin data using R

We will use the Crypto Compare API. The data is free. We will download daily data.

Required packages

library(jsonlite)
library(data.table)
library(ggplot2)

Next, download the full coin list from Crypto Compare and save the list to a data frame:

# Obtain coin list
response = fromJSON("https://www.cryptocompare.com/api/data/coinlist")
df = as.data.frame(data.table::rbindlist(response$Data, fill=TRUE))

The response comes back as a list; for ease we will save it to a .csv:

# Write coin list to .csv
write.csv(df,file="D:/R Projects/Final Scripts/BitCoin Scripts/coin_list.csv")

Load the list of symbols from the .csv file and save the symbols in a vector

# Load Tickets From Coin List
read.data <- read.csv("D:/R Projects/Final Scripts/BitCoin Scripts/coin_list.csv", header=TRUE, stringsAsFactors = FALSE)
tickers <- c(read.data$Symbol)

Clean the list

# Those symbols with * at the end of their name
# Remove *
tickers <- gsub("[*]", "", tickers)

Next we write a loop over the ticker symbol vector, downloading each symbol to .csv format. As an extra bonus, we also plot the close price and save the plot to our output folder.

# Download Daily Data 
# Loop for downloading all .csvs 
# Save each coin close plot
# Add error catching to loop
# Add completion time of each iteration
for (i in 1:length(tickers)) {
  tryCatch({
  next.coin <- tickers[i] # next coin in ticker vector
coin_daily <- fromJSON(paste0("https://min-api.cryptocompare.com/data/histoday?fsym=",next.coin,"&tsym=USD&allData=true&e=CCCAGG"))
df <- data.frame("time" = coin_daily$Data$time, "close" = coin_daily$Data$close,"open" = coin_daily$Data$open,"high" = coin_daily$Data$high,"low" = coin_daily$Data$low,"volumefrom" = coin_daily$Data$volumefrom,"volumeto" = coin_daily$Data$volumeto)
# Save plot output to folder
# Set path, set png plot size
# ggplot 2 to create line plot 
mytitle = paste0(next.coin)
graphics.off() # Close graphic device before next plot
p <- ggplot(df, aes(time)) +
  theme_classic()+
  geom_line(aes(y=close), colour="red") +
  ggtitle(mytitle, subtitle = "") +
  labs(x="Time",y="Close")+
  theme(plot.title = element_text(hjust=0.5),plot.subtitle =element_text(hjust=0.5))
ggsave(path="D:/R Projects/Final Scripts/BitCoin Scripts/daily_plot",paste(next.coin,".png"))
# Save each file to .csv
write.csv(df,paste0(file="D:/R Projects/Final Scripts/BitCoin Scripts/data/",next.coin,".csv"))
ptm0 <- proc.time()
Sys.sleep(0.1)  
ptm1=proc.time() - ptm0
time=as.numeric(ptm1[3])
cat('\n','Iteration',i,'took', time, "seconds to complete")
  }, error = function(e) { print(paste("i =", i, "failed:")) })
}

Tail Risk – Downside Analysis (S&P500)

Tail Risk. The enemy for long term investing with leverage.

Let's express the 1 day returns in a box plot.

[Box plot: S&P500 1 day returns]
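
For reference, a one line sketch of the box plot (assuming df$spx.clret holds the S&P500 1 day close-to-close returns used throughout this post):

# Box plot of S&P500 1 day returns
# Assumes df$spx.clret holds daily close-to-close returns
boxplot(df$spx.clret, main = "S&P500 1 Day Returns", ylab = "1 Day Return")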

# Maximum Loss and Gain
min(df$spx.clret)
max(df$spx.clret)
# Location of min/max
min.loc <- which.min(df$spx.clret)
max.loc <- which.max(df$spx.clret)
# Dates of min/max (avoid masking base min/max)
min.date <- df$Date[min.loc]
max.date <- df$Date[max.loc]
min.date
max.date
> min.date
[1] "1987-10-19"
> max.date
[1] "1933-03-15"
> min(df$spx.clret)
[1] -0.2289973
> max(df$spx.clret)
[1] 0.1536613

The maximum one day loss was -22.90% (-0.2289973), which occurred on October 19th 1987, the day of the stock crash. The largest 1 day gain was +15.37% (0.1536613), which occurred on March 15th 1933.

If using leverage we obtain a 1 day loss of -45.80% at 2x leverage, or -68.70% at 3x leverage. These one day 'shocks' are what is known as tail risk: one could be down by this magnitude in a single trading day.

If we check the frequency of 1 day losses beyond 'n' loss %:

[Bar chart: frequency of 1 day losses beyond given thresholds]

Again, if using leverage these losses multiply by the leverage factor. Since 1928 there were, for example, 8 occurrences of a one day loss greater than 20%; at 3x leverage such a day becomes a loss of more than 60%.
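
As a rough sketch, the counts behind a chart like this can be tabulated directly from the return series (the thresholds here are illustrative, and df$spx.clret is the return column used above):

# Frequency of 1 day losses beyond given thresholds
# Assumes df$spx.clret holds S&P500 1 day returns, as above
thresholds <- c(-0.05, -0.10, -0.15, -0.20)
loss.counts <- sapply(thresholds, function(th) sum(df$spx.clret < th, na.rm = TRUE))
names(loss.counts) <- paste0(thresholds * 100, "%")
print(loss.counts)

# A daily rebalanced 3x product roughly triples each 1 day return
# (an approximation that ignores fees and compounding)
lev.counts <- sapply(thresholds, function(th) sum(3 * df$spx.clret < th, na.rm = TRUE))
names(lev.counts) <- paste0(thresholds * 100, "%")
print(lev.counts)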

Furthermore, we also face bear market declines, which pose another issue when holding long term with leverage.

[Plot: S&P500 peak to trough draw downs]

Inclusive of the 1929 bear market, peak to trough draw downs can be in excess of 50%. The top of 2008 to the bottom in 2009 is around 56%. Again, if we buy and hold a 2x or 3x leveraged product, we will see draw downs of roughly 2x or 3x this amount.
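
Peak to trough draw down can be computed from the cumulative equity curve with a running maximum, along these lines (a sketch, again assuming df$spx.clret):

# Peak to trough draw down from the cumulative equity curve
# Assumes df$spx.clret holds 1 day returns, as above
equity <- cumprod(1 + ifelse(is.na(df$spx.clret), 0, df$spx.clret))
drawdown <- equity / cummax(equity) - 1
cat("Maximum draw down:", round(min(drawdown) * 100, 2), "%\n")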

So these are the 'issues' with buying and holding with leverage. Is there a solution?

One method of exploration is to use a simple moving average as a low pass filter. Using monthly bars, holding only when price is above its simple moving average tends to reduce overall volatility; a minimal sketch follows the plot below.

[Plot: S&P500 with 12 month SMA filter applied]

Areas in red are cash periods, i.e. periods where price is below the 12 month SMA.
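
A minimal sketch of the filter itself (monthly.close and monthly.ret are assumed month-end close and return series, not objects from the earlier code):

# Monthly simple moving average 'low pass filter' - a minimal sketch
# monthly.close / monthly.ret are assumed month-end closes and returns
require(TTR)
sma.n <- 12
sma <- SMA(monthly.close, n = sma.n)
signal <- ifelse(monthly.close > sma, 1, 0) # Long above the SMA, else cash
position <- c(NA, signal[-length(signal)]) # Lag the signal to avoid look-ahead bias
strategy.ret <- position * monthly.ret
equity <- cumprod(1 + ifelse(is.na(strategy.ret), 0, strategy.ret))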

The plot below shows the maximum draw down for any given simple moving average look back from 1928 to present day, where the simple moving average moves the investor to cash when price is below it:

[Plot: maximum draw down by monthly SMA look back, 1928 to present]

Over the same period, buy and hold draw down is 89.78%. Depending on the SMA used, we reduce draw downs by approximately 40 to 50%.

The optimal monthly simple moving average, ranked by cumulative returns, 1928 to present day:

[Plot: cumulative returns by monthly SMA look back, 1928 to present]

If we instead consider a 'modern market', 1950 onwards:

Overall draw down reduces from the buy and hold figure of 56.88% to the 25-30% area, depending on the simple moving average used.

[Plot: maximum draw down by monthly SMA look back, 1950 to present]

And the optimal look back for the monthly simple moving average:

[Plot: cumulative returns by monthly SMA look back, 1950 to present]

The message is pretty clear: using a simple moving average as a 'low pass filter' often reduces the volatility of simple buy and hold. This strategy does well to avoid both the post-1999 and post-2007 bear market declines.

Next, consider the overall results of the monthly 12 SMA, 1988 to present.

One day shocks like the 1987 stock crash can, at 2x or 3x leverage, yield a 1 day draw down of roughly 46% or 69%.

Post the 1987 stock crash, we see maximum draw downs for varying simple moving averages. Note that short term simple moving averages remain invested during bear market declines:

[Plot: maximum draw down by monthly SMA look back, post-1987]

The questions going forward: are one day shocks like the 1987 stock crash possible again in the future? Is there anything structural that can change this? Does HFT provide liquidity, or will it evaporate during times of distress?

A maximum draw down of 19% since 1987 using the low pass filter approach seems easy enough to bear. From 1987 to present, one could use a 2x or 3x leverage factor and increase gains substantially whilst matching a typical buy and hold draw down of 56%.

Leverage and risk is the trade off. If one believes the next 40 years will be as rosy as the last, one may wish to take this bet. Not to mention, will the S&P500 even remain the 'king' market and continue to provide the same level of returns as the past 100 years?

No one knows 🙂 If the risk profile suits, then that's up to you. Otherwise, a normal buy and hold investing approach using a simple moving average will avoid large negative compounding periods.

Long when above the monthly 12 SMA:

[Plot: equity curve, long above the monthly 12 SMA]

The frequency of heavy 1 day losses reduces relative to buy and hold. This is partly due to avoiding the majority of bear market declines.

[Bar chart: frequency of heavy 1 day losses, SMA filter vs buy and hold]

I believe the message is this: regardless of what you google about holding 3x leveraged products, it should be studied whether the ETF is built soundly, i.e. has minimal tracking error, for there are many ETFs with poor design.

Next: are you comfortable with, and can you stomach, the sickening draw downs? There will no doubt be spells of volatility, and this is likely to continue.

Perhaps there are less volatile instruments with smoother equity curves, or less volatile strategies with smoother equity curves. With little volatility, one may add a leverage multiplier at will.

Basic Data Handling With Python

I have recently been looking for live execution platforms for placing a strategy into production. I found zipline for Python, and with the intention of using zipline as a live execution platform I figured it would be prudent to pick up some Python.

The learning curve moving from R to Python doesn't look that steep. In this post I will cover some basic data handling using Python. We will:

1. Load Data, view data, check formats, convert integer date to Date format
2. Plot price series
3. Calculate cumulative returns, annualized return and annualized Sharpe ratio
4. Calculate rolling mean, standard deviation and z-score
5. Perform linear regression by preparing two series, joining two data sets by common date and running the ordinary least squares regression

# Python Code For General Data Handling
# Andrew Bannerman 11.4.2017

# Import library
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import statsmodels.api as sm
from statsmodels.graphics.regressionplots import abline_plot # For plotting a regression line

# Read Data
df = pd.read_csv("Your Data Dir/MASTER_DATA_DUMP/SPY.csv")

# Print First 10 Lines Of Data
print(df.head(10))

# Print First 10 Rows Of Specific Data Frame Column
print(df.Close.head(10))

# Print Last 10 Lines of Data
print(df.tail(10))

# Print Last 10 Lines of Specific Data Frame Column
print(df.Close.tail(10))

# Check Data Frame Types
print(type(df))

# Check Type Of 1 Column
print(df['Close'].dtype)
print(df['Date'].dtype)

# Convert yyyymmdd to date format
df['Date'] = pd.to_datetime(df['Date'].astype(str), format ='%Y%m%d')
print(df['Date'].dtype) # Check date is in date format
print(df.Date.head(10)) # Print head of date column

# Create Plot Of SPY Close Prices
df.plot(x='Date',y='Close')
plt.xlabel('Date')
plt.ylabel('Close Price')
plt.title('SPY Close 1993 To Present')
plt.legend(loc='upper left')
plt.show()

[Plot: SPY close prices, 1993 to present]

# Calculate % Differences Of Close Column
# Calculate Cumulative Return
# Use DataFrame.shift(1) to reference the prior day's value
df['ret'] = (df.Close - df.Close.shift(1)) / df.Close.shift(1)
df['cum_ret'] = (1 + df.ret).cumprod() # Cumulative return
print(df.cum_ret.iloc[-1]) # Print last row of cumulative returns (use iloc[-1])

# Plot Cumulative Returns
df.plot(x='Date',y='cum_ret')
plt.xlabel('Date')
plt.ylabel('Cumulative Returns')
plt.title('SPY Cumulative Returns')
plt.legend(loc='upper left')
plt.show()

[Plot: SPY cumulative returns]

# Calculate Annualized Sharpe Ratio
sharpe_ratio = np.sqrt(252) * (df.ret.mean() / df.ret.std())
print("Annualized Share Ratio is",sharpe_ratio)

# Calculate Annualized Return
time_between = (df.Date.iloc[-1] - df.Date.iloc[0]).days
print(time_between)
cagr = (df.Close.iloc[-1] / df.Close.iloc[0]) ** (365/time_between) - 1
print("Annualized Return is",cagr * 100, "Percent")

# Calculate Maximum Draw Down
lookback = 252
rolling_max = df['Close'].rolling(lookback, min_periods=1).max()
daily_drawdown = df['Close']/rolling_max - 1
daily_drawdown.plot()
plt.ylabel('Draw Down %')
plt.title('252 Day Rolling Draw Down')
plt.legend(loc='upper left')
plt.show()

[Plot: 252 day rolling draw down]

# Calculate Rolling Statistics
# Calculate Rolling Z-Score
lookback = 12
df['moving_average'] = df.Close.rolling(lookback).mean() # Rolling moving average
df['rolling_std_dev'] = df.Close.rolling(lookback).std() # Rolling standard deviation
df['rolling_zscore'] = (df.Close - df.moving_average) / df.rolling_std_dev # Z-score

# Plot Rolling Mean And Stdev
df.plot(x='Date', y=['Close','moving_average'])
plt.title('SPY Close With Moving Average')
plt.show()
df.plot(x='Date', y=['rolling_zscore'])
plt.title('N Period Look back Rolling Zscore')
plt.show()

[Plot: SPY close with moving average]

[Plot: rolling z-score]

# Linear Regression Two Series
SPX = pd.read_csv("Your Data Dir/MASTER_DATA_DUMP/$SPX.csv")
VIX = pd.read_csv("Your Data Dir/MASTER_DATA_DUMP/$VIX.csv")

# Prepare For Join By Common Date
# Convert yyyymmdd to date format
SPX['Date'] = pd.to_datetime(SPX['Date'].astype(str), format ='%Y%m%d')
VIX['Date'] = pd.to_datetime(VIX['Date'].astype(str), format ='%Y%m%d')

# Join Two Data Series By Common Date
merge_df = pd.merge(SPX, VIX, on='Date') # Overlapping columns get _x/_y suffixes
print(merge_df.tail(10))

# Scatter Plot of SPX and VIX
merge_df.plot(x='Close_x',y='Close_y', kind='scatter')
plt.title('Scatter Plot VIX and SPX')
plt.show()

[Scatter plot: SPX close vs VIX close]

# Perform Linear Regression On Close Prices
model = sm.OLS(merge_df.Close_y, merge_df.Close_x)
results = model.fit()
print(results.summary())
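
Note that sm.OLS does not add an intercept by default, so the fit above is a regression through the origin; to include an intercept, pass sm.add_constant(merge_df.Close_x) as the regressor.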

[Output: OLS regression summary table]