Creating synthetic VXX and XIV data and plotting the VIX term structure using R

This is the first of a collection of volatility postings. Ilya Kipnis did a good job of writing an R script to download VIX settlement data direct from the CBOE website. Furthermore, we can use this data to re-create synthetic VXX, XIV series and kudos to Quant Bear for extending Ilya’s code, the post on how to create synthetic VXX data can be found here.

To follow this post you will need to use Ilya’s script and have the term structure data downloaded within R.

Here we plot the VIX term structure from 2005 to present:

Rplot90

Note the large change in price in 2007. This is due to a 10:1 split, more information here. We adjust all prices prior to the split date 2007-03-26 by dividing by 10. Next we re-plot the data. The code to do this:

# Prepare Data for plotting varying contract expirations
# Adjust for 10:1 split pre 03-26-2007
# Split information here: http://cfe.cboe.com/publish/CFEinfocirc/CFEIC07-003%20.pdf
# Change termStucture column names
colnames(termStructure)[1] = "F"
colnames(termStructure)[2] = "G"
colnames(termStructure)[3] = "H"
colnames(termStructure)[4] = "J"
colnames(termStructure)[5] = "K"
colnames(termStructure)[6] = "M"
colnames(termStructure)[7] = "N"
colnames(termStructure)[8] = "Q"

# Save xts series to date frame in order to melt
df <- data.frame(termStructure)
df <- setDT(df, keep.rownames = TRUE)[] # Set row names
colnames(df)[1] = "Date"
df.10.split <- subset(df, Date < as.POSIXct("2007-03-26/") ) # subset data prior to split
library(magrittr)
df.10.split[,2:9] %<>% lapply(function(x) x / 10) # appply split to all columns excluding date 1
df.post.split <- subset(df, Date >= as.POSIXct("2007-03-26/") ) # subset post split
df <- rbind(df.10.split,df.post.split) # rbind pre and post split data frames
df <- melt(data = df,id.vars = 'Date') # melt df for plotting with ggplot2
library(lubridate)
df$Date <- ymd(df$Date) # Convert date column to Date format
colnames(df)[2] = "Contract" # Rename

# Plot term structure
ggplot(data=df,aes(x=Date,y=value,colour=Contract))+
geom_line()+
scale_x_date(breaks = date_breaks("years"), labels = date_format("%Y"))+
ggtitle("VIX Term Structure",subtitle="Adjusted for 10:1 split pre 2007-03-26")+
labs(x="Year",y="Settlement")+
theme(plot.title = element_text(hjust=0.5),plot.subtitle =element_text(hjust=0.5)

Rplot93

If we want to hone in on any one time period we can subset between dates and replot by:

# Subet by date (for plotting any window for further examination)
df <- subset(df, Date >= as.POSIXct("2007-06-01") & Date <= as.POSIXct("2010-01-01") )

# Plot term structure
ggplot(data=df,aes(x=Date,y=value,colour=Contract))+
geom_line()+
scale_x_date(breaks = date_breaks("years"), labels = date_format("%Y"))+
ggtitle("VIX Term Structure",subtitle="Adjusted for 10:1 split pre 2007-03-26")+
labs(x="Year",y="Settlement")+
theme(plot.title = element_text(hjust=0.5),plot.subtitle =element_text(hjust=0.5))

Rplot94

This is a clear view of a market in backwardation during the financial crsis. More information on backwardation here. And a plot for a backwardation period in 2008:

Rplot108

A an example for a market in contango:

Rplot110

Lets plot XIV and VXX data. That way we can get a feel for what happens to these products during contango and times of backwardation. We will use the syntethic data created from quant bears R code, an extensions to Ilyna’s code. Also, please follow quant bears post in order to correctly download the risk free rate data to .csv format. The code to create synthetic VXX and XIV data based on the CBOE data:

#This script allows you to calculate proxy values for the VXX ETF before its inception on the 29t of January 2009
#Prerequisites:

#obtain historical CBOE VIX Futures data following the instructions outlined in this post by Ilya Kipnis (QuantStratTradeR)
#https://quantstrattrader.wordpress.com/2017/04/27/creating-a-vix-futures-term-structure-in-r-from-official-cboe-settlement-data/

#have the variables expiries and masterlist defined exactly as described in the post above

#have obtained TBill Rates from https://www.treasurydirect.gov/instit/annceresult/annceresult_query.htm and manipulated as described in my post

#Loading required packages
require(xts)
require(bizdays)
require(timeDate)
load_rmetrics_calendars(2004:2018)

# expiries come from http://www.macroption.com/vix-expiration-calendar/
expiries <- read.table("D:/R Projects/Final Scripts/VIX_term_structure/expiries_1.txt", header=FALSE)

#Transforming the expiries
expiries = as.Date(apply(expiries,1,function(x){paste(x,collapse=" ")}),format="%d %b %Y")

#preparing the tbillrates
tbillrates <- read.csv("D:/R Projects/Final Scripts/VIX_term_structure/tbills.2.csv", header=TRUE, stringsAsFactors = FALSE)
tbillrates[is.na(tbillrates)] <- 0
tbillrates$Date <- mdy(tbillrates$Date)
str(tbillrates)
tbillrates = xts(tbillrates[,"Rate"],as.Date(tbillrates[,"Date"]))

#defining function to calculate the contract roll weights
getCRW <- function(today){
today = as.Date(today)
periodstart = expiries[max(which(expiries<=today))] periodend = expiries[min(which(expiries>today))]
dt = bizdays(periodstart,periodend,"Rmetrics/NYSE")
dr = bizdays(today,periodend,"Rmetrics/NYSE")-1
return(c(dr/dt,(dt-dr)/dt))
}

#defining function to calculate TBR
getTBR <- function(today,lastday){
today = as.Date(today)
lastday = as.Date(lastday)
delta = as.numeric(today-lastday)
rate = tbillrates[max(which(index(tbillrates)<today))]
tbr = (1/(1-91/360*rate))^(delta/91)-1
return(tbr)
}

i=1
#calculating the index values
days = index(masterlist["2005-12-20/2009-01-29"])
indx = 100000
for(i in 2:length(days)){
crw = getCRW(days[i-1])
tbr = getTBR(days[i],days[i-1])
fut1 = masterlist[days[i-1],which(!is.na(masterlist[days[i-1]]))[1:2]]
fut2 = masterlist[days[i],which(!is.na(masterlist[days[i]]))[1:2]]
if(!names(fut1)[1]%in%names(fut2)){
fut1 = masterlist[days[i-1],which(!is.na(masterlist[days[i-1]]))[2:3]]
}
twdi = sum(crw*as.numeric(fut1))
twdo = sum(crw*as.numeric(fut2))
cdr = twdo/twdi-1
#cdr = -cdr
val = indx[length(indx)]*(1+cdr+tbr)
indx = c(indx,val)
}
indx = xts(indx,days)

#adjusting for 10:1 split
indx["2007-03-26/"] = 10*indx["2007-03-26/"]
indxret = (indx/lag(indx)-1)[-1]

#calculating VXX values
vxxvals = 107089.921875
for(i in nrow(indxret):1){
tmp = vxxvals[length(vxxvals)]/(1+indxret[i,])
vxxvals = c(vxxvals,tmp)
}
vxxvals = rev(vxxvals)
vxxvals = xts(vxxvals,index(indx))
plot(vxxvals,type="l")
head(vxxvals)

We plot synthetic VXX data:

Rplot95

Next using the alphavantage API, See my post here for details to download alphavantage data using R, we will join synthetic VXX data to Alpha Vantage VXX data. The code to do this, note you will need to insert your API key:

# Join synthetic VXX data to Alpha Vantage VXX data
# Plot with ggplot2
# Download data
require(lubridate)
require(data.table)
require(dplyr)
# Note you need tyo place your API key...your_key_here
VXX <- fread("https://www.alphavantage.co/query?function=TIME_SERIES_DAILY_ADJUSTED&symbol=VXX&outputsize=full&apikey=.your_key_here&datatype=csv") #fread() data.table for downloading directly to a data frame
VXX$timestamp <- ymd(VXX$timestamp) #Lubridate to change character date to date format
VXX <- arrange(VXX,timestamp) #dplyr to sort data frame by date ascending order
colnames(VXX)[1] <- "Date"
VXX <- as.data.frame(VXX)
head(VXX)

# store synthetic VXX vxxvals price data to data.frame
df <- data.frame(vxxvals)
df <- setDT(df, keep.rownames = TRUE)[] # Set row names
colnames(df)[1] = "Date"
colnames(df)[2] = "close"
df$Date <- ymd(df$Date) # Convert to date format
df <- as.data.frame(df)

# Retain only Date and close columns for alphavantage VXX data
VXX <- VXX[ -c(2:5, 7:9) ]
colnames(VXX)[2] <- "close"

# ggplot2 to plot synthetic VXX and alphavantage VXX
require(ggplot2)
require(scales)
ggplot() +
geom_line(data=df,aes(x=Date,y=close), colour="red") +
geom_line(data=VXX,aes(x=Date,y=close), colour="black")+
theme_classic()+
scale_x_date(breaks = date_breaks("years"), labels = date_format("%Y"))+
ggtitle("VXX", subtitle = "Synthetic prior to 2009-01-29") +
labs(x="Date",y="VXX")+
theme(plot.title = element_text(hjust=0.5),plot.subtitle =element_text(hjust=0.5))+
annotate("text", label = "Synthetic VXX", x = as.Date("2007-04-26"), y = 100000, color = "red")+
annotate("text", label = "Alphavantage VXX", x = as.Date("2015-04-26"), y = 100000, color = "black")

Rplot97

For XIV, per QUANTBEAR post, we need to inverse the sign of the synthetic VXX returns. We achieve this with the following code which is the very same code as VXX, except we flip the sign of the returns and we have tacked on the alpha vantage adjusted close prices to the synthetic data:

#adjusting for 10:1 split
indx["2007-03-26/"] = 10*indx["2007-03-26/"]
indxret = (indx/lag(indx)-1)[-1]
indxret = -indxret # Flip the sign of the returns ####

#calculating XIV values
xivvals = 9.557 ####### Note place first XIV price here (we are using first adjusted close aklpha vantage close)
for(i in nrow(indxret):1){
tmp = xivvals[length(xivvals)]/(1+indxret[i,])
xivvals = c(xivvals,tmp)
}
xivvals = rev(xivvals)
xivvals = xts(xivvals,index(indx))
plot(xivvals,type="l")

# Join synthetic XIV data to Alpha Vantage XIV data
# Plot with ggplot2
# Download data
require(lubridate)
require(data.table)
require(dplyr)
# Note you need tyo place your API key...your_key_here
XIV <- fread("https://www.alphavantage.co/query?function=TIME_SERIES_DAILY_ADJUSTED&symbol=XIV&outputsize=full&apikey=6RSYX9BPXKZVXUS9&datatype=csv") #fread() data.table for downloading directly to a data frame
XIV$timestamp <- ymd(XIV$timestamp) #Lubridate to change character date to date format
XIV <- arrange(XIV,timestamp) #dplyr to sort data frame by date ascending order
colnames(XIV)[1] <- "Date"
XIV <- as.data.frame(XIV)

# store synthetic XIV XIVvals price data to data.frame
df <- data.frame(xivvals)
df <- setDT(df, keep.rownames = TRUE)[] # Set row names
colnames(df)[1] = "Date"
colnames(df)[2] = "close"
df$Date <- ymd(df$Date) # Convert to date format
df <- as.data.frame(df)

# Retain only Date and close columns for alphavantage XIV data
XIV <- XIV[ -c(2:5, 7:9) ]
colnames(XIV)[2] <- "close"

# ggplot2 to plot synthetic XIV and alphavantage XIV
require(ggplot2)
require(scales)
ggplot() +
geom_line(data=df,aes(x=Date,y=close), colour="red") +
geom_line(data=XIV,aes(x=Date,y=close), colour="black")+
theme_classic()+
scale_x_date(breaks = date_breaks("years"), labels = date_format("%Y"))+
ggtitle("XIV", subtitle = "Synthetic prior to 2010-11-30") +
labs(x="Date",y="XIV")+
theme(plot.title = element_text(hjust=0.5),plot.subtitle =element_text(hjust=0.5))+
annotate("text", label = "Synthetic XIV", x = as.Date("2007-04-26"), y = 85, color = "red")+
annotate("text", label = "Alphavantage XIV", x = as.Date("2015-04-26"), y = 85, color = "black")

Rplot98

Lets plot times when the term structure was in backwardation. This code prepares and plots backwardation:

# Change termStucture column names
colnames(termStructure)[1] = "F"
colnames(termStructure)[2] = "G"
colnames(termStructure)[3] = "H"
colnames(termStructure)[4] = "J"
colnames(termStructure)[5] = "K"
colnames(termStructure)[6] = "M"
colnames(termStructure)[7] = "N"
colnames(termStructure)[8] = "Q"

# Prepare Data For backwardation<span 				data-mce-type="bookmark" 				id="mce_SELREST_start" 				data-mce-style="overflow:hidden;line-height:0" 				style="overflow:hidden;line-height:0" 			></span> Plot
# Obtain backwardation dates
backwardation <- data.frame(termStructure)
backwardation <- setDT(backwardation, keep.rownames = TRUE)[] # Set row names
colnames(backwardation)[1] <- "Date"
backwardation.G <- ifelse(termStructure$F > termStructure$G,1,0)
backwardation.H <- ifelse(termStructure$G > termStructure$H,1,0)
backwardation.J <- ifelse(termStructure$H > termStructure$J,1,0)
backwardation.K <- ifelse(termStructure$J > termStructure$K,1,0)
backwardation.M <- ifelse(termStructure$K > termStructure$M,1,0)
backwardation.N <- ifelse(termStructure$M > termStructure$N,1,0)
backwardation.vec <- backwardation.G + backwardation.H + backwardation.J + backwardation.K + backwardation.M + backwardation.N
dates <- c(backwardation$Date)
backwardation.df <- data.frame(dates,backwardation.vec)
colnames(backwardation.df)[1] <- "Date"
colnames(backwardation.df)[2] <- "backwardation"

Rplot105.png

Times in pink are when the market was in backwardation. Next we may see how VXX and XIV acted during times of backwardation. Plot the non adjusted VXX for clarity.

Rplot103

Rplot106

Times of backwardation essentially cause XIV to go down in price and VXX to go up in price.

As we organized the backwardation dates, lets view how often the VIX is in backwardation / contango. We can pull the data using the code below:

# Count how often VIX in backwardation / contango
total.backwardation.days <- sum(backwardation.df$backwardation > 0, na.rm=TRUE)
total.days <- nrow(backwardation.df)
backwardation.perc <- total.backwardation.days/total.days *100
contango = 100-backwardation.perc
print(contango)
print(backwardation.perc)

VIX term structure is in backwardation 29.07% of the time and in contango 70.92% of the time.

Let see what the beta is for XIV, VXX. Code courtesy of Ilya Kipnis for blog post see here. With the original post by Dr. Jonathan Kinlay on beta convexity. At the center of the idea is that an asset relative to its benchmark exhibit different betas when the benchmark is down and up in price.

# Beta Convexity 
# Download SPY data
# Note you need tyo place your API key...your_key_here
SPY <- fread("https://www.alphavantage.co/query?function=TIME_SERIES_DAILY_ADJUSTED&symbol=SPY&outputsize=full&apikey=your_key_here&datatype=csv") #fread() data.table for downloading directly to a data frame
SPY$timestamp <- ymd(SPY$timestamp)   #Lubridate to change character date to date format
SPY <- arrange(SPY,timestamp)   #dplyr to sort data frame by date ascending order
colnames(SPY)[1] <- "Date"
SPY$Date <- ymd(SPY$Date)
SPY <- as.data.frame(SPY)
SPY <- subset(SPY, Date >= as.POSIXct("2009-01-30") ) # VXX start
SPY <- subset(SPY, Date >= as.POSIXct("2000-01-03")) # XIV start

VXX <- fread("https://www.alphavantage.co/query?function=TIME_SERIES_DAILY_ADJUSTED&symbol=VXX&outputsize=full&apikey=your_key_here&datatype=csv") #fread() data.table for downloading directly to a data frame
VXX$timestamp <- ymd(VXX$timestamp)   #Lubridate to change character date to date format
VXX <- arrange(VXX,timestamp)   #dplyr to sort data frame by date ascending order
colnames(VXX)[1] <- "Date"
VXX$Date <- ymd(VXX$Date)
VXX <- as.data.frame(VXX)

XIV <- fread("https://www.alphavantage.co/query?function=TIME_SERIES_DAILY_ADJUSTED&symbol=XIV&outputsize=full&apikey=your_key_here&datatype=csv") #fread() data.table for downloading directly to a data frame
XIV$timestamp <- ymd(XIV$timestamp)   #Lubridate to change character date to date format
XIV <- arrange(XIV,timestamp)   #dplyr to sort data frame by date ascending order
colnames(XIV)[1] <- "Date"
XIV$Date <- ymd(XIV$Date)
XIV <- as.data.frame(XIV)
XIV <- subset(XIV, Date >= as.POSIXct("2000-01-03 "))

# Calculate returns 
spy.xts = xts(SPY$adjusted_close, order.by=as.Date(SPY$Date, format="%Y/%m/%d"))
vxx.xts = xts(VXX$adjusted_close, order.by=as.Date(VXX$Date, format="%Y/%m/%d"))
xiv.xts = xts(XIV$adjusted_close, order.by=as.Date(XIV$Date, format="%Y/%m/%d"))
spy.ret <- Return.calculate(spy.xts)
vxx.ret <- Return.calculate(vxx.xts)
xiv.ret <- Return.calculate(xiv.xts)

rets <- na.omit(cbind(xiv.ret, spy.ret))
colnames(rets) <- c("XIV", "SPY")

betaConvexity <- function(Ra, Rb) {
  positiveBench <- Rb[Rb > 0]
  assetPositiveBench <- Ra[index(positiveBench)]
  positiveBeta <- CAPM.beta(Ra = assetPositiveBench, Rb = positiveBench)
  
  negativeBench <- Rb[Rb < 0]
  assetNegativeBench <- Ra[index(negativeBench)]
  negativeBeta <- CAPM.beta(Ra = assetNegativeBench, Rb = negativeBench)
  
  out <- (positiveBeta - negativeBeta) ^ 2
  return(out)
}

betaConvexity(rets$XIV, rets$SPY)

> betaConvexity(rets$XIV, rets$SPY)
[1] 2.066219

> betaConvexity(rets$VXX, rets$SPY)
[1] 1.931566

For the entire XIV and VXX sample beta is approx 2 for each product, meaning spy moves 1% XIV,VXX move approx 2%.

Traditionally we may run a regression fit between the daily returns of the SPY and XIV,VXX. This however doesn’t take into effect the beta of an asset when the benchmark is up or down. We see how the beta has changed over time with a rolling linear regression. The code to do this:

# Run rolling linear regression for beta
# Stock SPY as independant variable
rolling_lms <- lapply(seq(20,nrow(rets)), function(x) lm(XIV ~ SPY, data = rets[1:x , ]) )
length(rolling_lms)
nrow(rets)
all_slopes <-unlist(sapply(1:length(rolling_lms),function(j) rolling_lms[[j]]$coefficients[2]))
all_slopes<- unlist(all_slopes)
plot(all_slopes,type="l")
# Join regression output to original data frame 
row.diff <- rep(NA, nrow(rets) - length(rolling_lms)) # make leading NAs to position data correclty with original
beta <- c(row.diff,all_slopes)
rets <- data.frame(rets,beta)
rets <- setDT(rets, keep.rownames = TRUE)[] # Set row names
colnames(rets)[1] <- "Date"
rets$Date <- ymd(rets$Date)

# Plot beta 
ggplot() +
  geom_line(data=rets,aes(x=Date,y=beta), colour="red") +
  theme_classic()+
  scale_x_date(breaks = date_breaks("years"), labels = date_format("%Y"))+
  ggtitle("Rolling Linear Regression - Plot XIV Beta", subtitle = "XIV Inception") +
  labs(x="Date",y="VXX")+
  theme(plot.title = element_text(hjust=0.5),plot.subtitle =element_text(hjust=0.5))

Rplot122

And for a visual of the regression fit for the full XIV sample period. The code to plot this:

# Plot Linear Regression
ggplot(rets, aes(x=SPY, y=XIV))+ 
  geom_point(shape=1)+
  geom_smooth(method=lm)+
  ggtitle("Linear Regression y~x XIV~SPY", subtitle = "Start date = XIV inception") +
  labs(x="SPY",y="XIV")+
  theme(plot.title = element_text(hjust=0.5),plot.subtitle =element_text(hjust=0.5))

Rplot121

We repeat the process for VXX.

Rplot118

Rplot119

Both XIV and VXX beta have increased over time. Essentially a 1x move in the SPY leads to an approx -3x move in VXX and 3.66 move in XIV.

This will be a primer for upcoming posts where we will experiment with different volatility strategies.

All code can be found on my github.

Thanks again Ilya and QUANTBEAR for sharing your code on your blogs.

Author: Andrew Bannerman

Integrity Inspector. Quantitative Analysis is a favorite past time.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s