In addition to the previous posts studying volatility strategies here and here, we aim to study the nature of the XIV and VXX series. We subject each series to a Hurst exponent estimate, following the same procedure as Ernie Chan (2013): take the lagged price differences and regress the log variance of the lagged differences against the log time lags. There is an example of this here.

For this post we will download VIX futures data from the CBOE website and join synthetic and actual XIV and VXX data to build the largest data set we can, per this previous post. Once we have our XIV and VXX data, we compute the Hurst exponent of each series. The procedure is as follows:
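The joining step above can be sketched as follows. This is a minimal illustration with made-up numbers, not the actual CBOE download: the data frame names (`synthetic.df`, `live.df`) and column layout are assumptions, and the real post stitches far longer histories.

```r
# Hypothetical sketch: stitch a synthetic early history onto the live series.
# synthetic.df covers the period before the ETP existed, live.df the traded period.
synthetic.df <- data.frame(Date = as.Date("2005-01-03") + 0:4,
                           close = c(100, 101, 99, 98, 97))
live.df <- data.frame(Date = as.Date("2005-01-08") + 0:4,
                      close = c(96, 95, 97, 98, 96))

# Keep synthetic rows only up to the first live date, then append the live data
xiv.df <- rbind(subset(synthetic.df, Date < min(live.df$Date)), live.df)
nrow(xiv.df)  # 10 rows: synthetic head + live tail
```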

1. Compute lagged differences over varying time lags. For example, lag 2 = today's close price minus the close price 2 days ago; lag 3 = today's close price minus the close price 3 days ago.

2. Next, compute the variance of the lagged differences. Ernie Chan recommends at least 100 days of data, so we compute the variance over a rolling 100-day period for each lagged difference.

3. Perform a linear regression of log(variance_lagged_differences) against log(time_lags) and divide the slope by 2 to obtain the Hurst exponent.
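The three steps above can be sketched on simulated data. This is a toy check rather than the post's pipeline: the price series is a simulated random walk, for which the estimate should land near H = 0.5.

```r
# Minimal sketch of steps 1-3 on a simulated random walk
set.seed(42)
price <- cumsum(rnorm(2000))  # simulated close prices

lag.vec <- 2:30

# Steps 1 & 2: variance of the lagged differences for each lag
variances <- sapply(lag.vec, function(k) var(diff(price, lag = k)))

# Step 3: regress log(variance) on log(lag); the slope equals 2H
fit <- lm(log(variances) ~ log(lag.vec))
hurst <- unname(coef(fit)[2] / 2)
hurst  # close to 0.5 for a random walk
```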

We run the procedure above on a rolling basis, with a 100-trading-day lookback for the variance. We use the R package RcppEigen and its fastLm function to perform the rolling linear regressions. The code that achieves this:
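As a quick sanity check on the choice of fastLm, the toy fit below (arbitrary simulated data) confirms that RcppEigen's fastLm returns the same coefficients as base lm; we only need the slope, so the two are interchangeable and fastLm is simply faster over thousands of rolling fits.

```r
# Toy comparison: fastLm vs base lm on the same small regression
library(RcppEigen)
set.seed(1)
x <- log(2:30)
y <- 1.0 * x + rnorm(29, sd = 0.01)

fast.fit <- fastLm(y ~ x)
base.fit <- lm(y ~ x)

# Coefficients agree to numerical precision
all.equal(unname(coef(fast.fit)), unname(coef(base.fit)))
```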

```r
###############################
# Hurst Exponent (varying lags)
###############################
require(magrittr)
require(zoo)
require(lattice)
require(TTR)      # for runVar
require(ggplot2)
require(scales)   # for date_breaks / date_format

## Set lags
lags <- 2:(252 * 6)

# Function for finding lagged differences: today's close - 'n'-lag close
getLAG.DIFF <- function(lagdays) {
  function(term.structure.df) {
    c(rep(NA, lagdays),
      diff(term.structure.df$vxx_close, lag = lagdays, differences = 1))
  }
}

# Create a matrix to hold the lagged differences and loop to fill it
lag.diff.matrix <- matrix(nrow = nrow(term.structure.df), ncol = 0)
for (i in lags) {
  lag.diff.matrix <- cbind(lag.diff.matrix, getLAG.DIFF(i)(term.structure.df))
}

# Rename columns and bind to the existing data frame
colnames(lag.diff.matrix) <- paste0("lagged.diff.n", lags)
term.structure.df <- cbind(term.structure.df, lag.diff.matrix)
head(term.structure.df, 25)

############################################################
# Calculate rolling variances of 'n'-period differences
# Variance lookback = 100 days
############################################################
# Convert NA to 0
term.structure.df[is.na(term.structure.df)] <- 0

get.VAR <- function(varlag) {
  function(term.structure.df) {
    runVar(term.structure.df[, paste0("lagged.diff.n", lags[varlag])],
           y = NULL, n = 100, sample = TRUE, cumulative = FALSE)
  }
}

# Create a matrix to hold the variances and loop to fill it
lag.var.matrix <- matrix(nrow = nrow(term.structure.df), ncol = 0)
for (i in 1:length(lags)) {
  lag.var.matrix <- cbind(lag.var.matrix, get.VAR(i)(term.structure.df))
}

colnames(lag.var.matrix) <- paste0("roll.var.diff.", lags)
term.structure.df <- cbind(term.structure.df, lag.var.matrix)

########################################
# Subset to remove all leading NA
########################################
#set_lag_threshold <- 50 # set variance
#na <- which(!is.na(term.structure.df[, paste0("roll.var.diff.", set_lag_threshold)]))
#firstNonNA <- min(na)
#term.structure.df <- term.structure.df[firstNonNA:nrow(term.structure.df), ]

########################################
# Rolling linear regression to compute the Hurst exponent
########################################
lag.vec <- 2:30 # select short-term lags
var.cols <- paste0("roll.var.diff.", lag.vec)

# Collect the 29 rolling variances for each row
variance <- lapply(1:nrow(term.structure.df),
                   function(i) as.numeric(term.structure.df[i, var.cols]))

# Initialize lists, pre-allocate memory
results <- vector("list", length(variance))
hurst <- vector("list", length(variance))
library(RcppEigen)

for (i in 1:length(variance)) {
  ptm0 <- proc.time()
  logs <- log(variance[[i]])
  if (all(is.finite(logs))) {
    # Regress log(variance) on log(lag); the slope equals 2H
    results[[i]] <- fastLm(logs ~ log(lag.vec))
    hurst[[i]] <- coef(results[[i]])[2] / 2
  } else {
    hurst[[i]] <- NA # not enough history yet
  }
  elapsed <- as.numeric((proc.time() - ptm0)[3])
  cat('\n', 'Iteration', i, 'took', elapsed, 'seconds to complete')
}

# Join results to a data frame
hurst <- do.call(rbind, hurst)
hurst.df <- as.data.frame(hurst)
hurst.df <- data.frame(hurst.df, Date = term.structure.df$Date)
colnames(hurst.df)[1] <- "Hurst"
hurst.df <- subset(hurst.df, Date >= as.Date("2008-04-28")) # remove leading NA, VXX only

# Plot data
ggplot() +
  geom_line(data = hurst.df, aes(x = Date, y = Hurst), colour = "black") +
  theme_classic() +
  scale_y_continuous(breaks = round(seq(min(hurst.df$Hurst), max(hurst.df$Hurst), by = 0.2), 2)) +
  scale_x_date(breaks = date_breaks("years"), labels = date_format("%Y")) +
  ggtitle("VXX Hurst Exponent - Daily Bars - Lags 2:30",
          subtitle = "Regression log(variances) ~ log(time_lags) - Hurst = Coef/2") +
  labs(x = "Date", y = "Hurst") +
  theme(plot.title = element_text(hjust = 0.5), plot.subtitle = element_text(hjust = 0.5)) +
  geom_rect(aes(xmin = min(hurst.df$Date), xmax = max(hurst.df$Date), ymin = 0.5, ymax = Inf),
            alpha = 0.1, fill = "green") +
  geom_rect(aes(xmin = min(hurst.df$Date), xmax = max(hurst.df$Date), ymin = -Inf, ymax = 0.5),
            alpha = 0.1, fill = "orange") +
  geom_rect(aes(xmin = min(hurst.df$Date), xmax = max(hurst.df$Date), ymin = 0.48, ymax = 0.52),
            alpha = 0.7, fill = "red")
```

The output:

This is for lagged differences of 2 to 30 days with a rolling variance of 100 days. We chose the smaller lags because the strategies going forward will likely hold no longer than 30 days, so it makes sense to examine the nature of the series over this horizon.
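The rolling computation can also be expressed more compactly with zoo::rollapply. This is an alternative sketch on simulated prices, not the post's code: the 130-day window (100-day variance lookback plus the longest 30-day lag) and base lm in place of fastLm are assumptions made for brevity.

```r
# Compact alternative: rolling Hurst estimate via zoo::rollapply
library(zoo)
set.seed(7)
price <- cumsum(rnorm(600))  # simulated close prices
lag.vec <- 2:30

roll.hurst <- rollapply(price, width = 130, align = "right",
  FUN = function(w) {
    w <- as.numeric(w)
    # variance of each lagged difference within the window
    v <- sapply(lag.vec, function(k) var(diff(w, lag = k)))
    # slope of log(variance) ~ log(lag) equals 2H
    unname(coef(lm(log(v) ~ log(lag.vec)))[2] / 2)
  })

length(roll.hurst)  # one estimate per complete 130-day window
```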

We can compute how often XIV and VXX are in momentum, mean-reversion, and random-walk regimes. The code for this:

```r
# Count how often the series is in each regime
momo <- sum(hurst.df$Hurst > 0.52, na.rm = TRUE)
mean.rev <- sum(hurst.df$Hurst < 0.48, na.rm = TRUE)
random <- sum(hurst.df$Hurst >= 0.48 & hurst.df$Hurst <= 0.52, na.rm = TRUE)
exact.random <- sum(hurst.df$Hurst >= 0.50 & hurst.df$Hurst < 0.51, na.rm = TRUE)
total.rows <- NROW(hurst.df)

# Percentage of time in momentum, mean reversion, random walk
momo.perc <- momo / total.rows
mean.rev.perc <- mean.rev / total.rows
random.perc <- random / total.rows
exact.random.perc <- exact.random / total.rows
```

VXX:

```r
vxx.percs.df <- data.frame("Momentum, Over 0.50" = momo.perc,
                           "Mean Reversion, Less than 0.5" = mean.rev.perc,
                           "Random Walk Band, 0.48 to 0.52" = random.perc,
                           "Exact Random Walk, 0.50" = exact.random.perc)

> vxx.percs.df
  Momentum..Over.0.50 Mean.Reversion..Less.than.0.5 Random.Walk.Band..0.48.to.0.52 Exact.Random.Walk..0.50
1           0.7471264                     0.1395731                      0.1133005              0.02791461
```

XIV:

```r
xiv.percs.df <- data.frame("Momentum, Over 0.50" = momo.perc,
                           "Mean Reversion, Less than 0.5" = mean.rev.perc,
                           "Random Walk Band, 0.48 to 0.52" = random.perc,
                           "Exact Random Walk, 0.50" = exact.random.perc)

> xiv.percs.df
  Momentum..Over.0.50 Mean.Reversion..Less.than.0.5 Random.Walk.Band..0.48.to.0.52 Exact.Random.Walk..0.50
1           0.7081281                     0.2085386                     0.08333333                0.022578
```

What we see is that VXX is in a momentum regime about 75% of the time and XIV about 71% of the time. That is the dominant theme: mean reversion accounts for 14% (VXX) and 21% (XIV), and random walk for 11% (VXX) and 8% (XIV).

If fitting a model to the series itself, without using the volatility risk premium / roll yield as entry signals, one may try models based on the theme of momentum.

In subsequent posts we will apply models that extract profit when the market is in contango or backwardation, as well as models fitted to the series itself. We will also look at the autocorrelation of the series, which will serve as a primer for testing a strategy for robustness.

Thanks for reading!