How To Ignore Errors In a Loop – Continue Processing Each Iteration

In this post we will look at a way to process many files or data frames in a loop without a single failure stopping the whole run. In this example we will just make data frames in memory, but in practice the data may live in a directory on our hard drive; see ?list.files.

First, let's make some dummy data that we can work with.

# Create dummy data frames
    file1 <- as.data.frame(runif(100, 0,100))
    file2 <- as.data.frame(runif(100, 0,100))
    file3 <- as.data.frame(runif(100, 0,100))
    file4 <- as.data.frame(runif(12, 0,100))
    file5 <- as.data.frame(runif(100, 0,100))
    file6 <- as.data.frame(runif(15, 0,100))
    file7 <- as.data.frame(runif(100, 0,100))
    file8 <- as.data.frame(runif(8, 0,100))  # This is the data frame that is intended to fail
    file9 <- as.data.frame(runif(100, 0,100))
    file10 <- as.data.frame(runif(100, 0,100))
    file11 <- as.data.frame(runif(100, 0,100))

    # Store all 11 data frames in a list
    file.list <- list(file1,file2,file3,file4,file5,file6,file7,file8,file9,file10,file11)

# Function to rename the single column of a data frame to "Close"
Names <- function(x) {
  names(x) <- c("Close")
  return(x)
}
# Apply name change to all 11 data frames
file.list <- lapply(file.list, Names)

In the above we made 11 random data frames and stored them in a list, file.list. We can print the first data frame with file.list[[1]] or the last with file.list[[11]]. Double brackets return the data frame itself, whereas single brackets return a list of length one.
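The difference between single and double brackets matters when pulling a data frame out of a list. Here is a minimal sketch using a small stand-in list (demo.list is a hypothetical name, not part of the code above):

```r
# demo.list is a hypothetical stand-in for file.list
demo.list <- list(
  data.frame(Close = c(1, 2, 3)),
  data.frame(Close = c(4, 5))
)

class(demo.list[1])     # "list": still a list, of length one
class(demo.list[[1]])   # "data.frame": the data frame itself
nrow(demo.list[[2]])    # 2
```

Functions that expect a data frame, such as nrow or anything indexing a Close column, should be given the double-bracket form.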

We create a function to rename the column of a data frame to Close. We use lapply to run this over every data frame in the list, from file1 to file11.

Now that we have our dummy data, we can create a function that holds the commands we wish to run on each and every data frame.

# Load the TTR package, which provides SMA()
    library(TTR)

# Create function for performing commands.
    genSMA <- function(x) {
      # Work on the data frame passed in as x, not on a global loop index
      new.df <- data.frame(x, stringsAsFactors = FALSE)
      # getSMA returns a function that computes an n day SMA of the Close column
      getSMA <- function(numdays) {
        function(new.df) {
          SMA(new.df[, "Close"], numdays)    # Calls TTR package to create SMA
        }
      }
      # Create an empty matrix to put the SMAs in
      sma.matrix <- matrix(nrow = nrow(new.df), ncol = 0)
      # Fill it one column at a time, for windows of 2 to 12
      for (n in 2:12) {
        sma.matrix <- cbind(sma.matrix, getSMA(n)(new.df))
      }

      # Rename columns
      colnames(sma.matrix) <- sapply(2:12, function(n) paste("close.sma.n", n, sep = ""))

      # Bind to existing data frame and return the result
      new.df <- cbind(new.df, sma.matrix)
      return(new.df)
    }

The above function computes simple moving averages of the Close column for every window length from 2 to 12. This is a very simplistic example; in reality there may be full scripts and a multitude of calculations or operations within this function.
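To make the calculation concrete, here is a rough base R sketch of a simple moving average. Note that simple_sma is a hypothetical helper written for illustration, not the TTR implementation; it assumes the first n - 1 values are left as NA and that a window longer than the series is an error.

```r
# simple_sma is a hypothetical base R helper, not the TTR implementation
simple_sma <- function(x, n) {
  if (n > length(x)) {
    stop("window n = ", n, " is longer than the series (", length(x), " values)")
  }
  out <- rep(NA_real_, length(x))      # the first n - 1 entries stay NA
  for (k in n:length(x)) {
    out[k] <- mean(x[(k - n + 1):k])   # mean of the n values ending at position k
  }
  out
}

simple_sma(c(1, 2, 3, 4, 5), 3)   # NA NA 2 3 4
```

Calling simple_sma on 8 values with n = 12 stops with an error, which is the kind of failure the rest of this post works around.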

So we have our function with the calculations we wish to perform over all 11 data frames stored in our list. The next thing to do is write a for loop that calls the function on every data frame in file.list.

# Loop for running function over all data frames
for (i in seq_along(file.list)){
  tryCatch({
    genSMA(file.list[[i]])   # Process data frame i, starting from data frame 1
  }, error = function(e) { print(paste("i =", i, "failed:")) })
}

After running the full code we should see the message:
[1] "i = 8 failed:"

This is because we purposely set up data frame file8 with only 8 data points, which is fewer than the 12 required for the longest window in our 2:12 range. A moving average cannot be computed over a window longer than the series (here, windows 9 through 12), so the code throws an error.
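As an optional sanity check, we could count rows before looping to see which data frames are shorter than the longest window. This is only a sketch and does not replace tryCatch; sizes, check.list and max.window are hypothetical names, with sizes mirroring the row counts of file1 to file11.

```r
# sizes mirrors the row counts used for file1 ... file11
sizes <- c(100, 100, 100, 12, 100, 15, 100, 8, 100, 100, 100)
check.list <- lapply(sizes, function(n) data.frame(Close = runif(n, 0, 100)))

max.window <- 12   # longest SMA window we intend to compute
too.short <- which(vapply(check.list, nrow, integer(1)) < max.window)
too.short   # 8
```

A check like this only catches the one failure we already know about, which is why the loop still wraps each call in tryCatch.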

Under normal conditions, without tryCatch, the loop would stop at the first error, and we would have to remove the offending file or data frame and run the loop again. With only a few files this is not an issue, but if you are processing thousands of files it's a real inconvenience!

The tryCatch handler also prints which iteration failed, so we can investigate that file or data frame further.
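If we also want to know why an iteration failed, the handler can report the error text with conditionMessage and record the failing index. The sketch below uses a hypothetical stand-in function, process_one, so that it runs without TTR; the names demo.list and failed are likewise made up for illustration.

```r
# process_one is a hypothetical stand-in for genSMA
process_one <- function(df) {
  if (nrow(df) < 12) stop("need at least 12 rows, got ", nrow(df))
  mean(df$Close)
}

demo.list <- list(
  data.frame(Close = 1:20),
  data.frame(Close = 1:8),    # too short, will fail
  data.frame(Close = 1:15)
)

failed <- integer(0)
for (i in seq_along(demo.list)) {
  tryCatch({
    process_one(demo.list[[i]])
  }, error = function(e) {
    # conditionMessage(e) extracts the text of the error that was thrown
    message("i = ", i, " failed: ", conditionMessage(e))
    failed <<- c(failed, i)   # record the index outside the handler
  })
}
failed   # 2
```

Keeping the failed indices in a vector makes it easy to go back afterwards and inspect only the problem data frames.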

Note we can also modify the loop and function to run over a directory on our hard drive, if that's where our data is stored.

We can do this with:

# Specify directory where files are stored 
 files = list.files(path = "C:/R Projects/Data/", pattern = ".", full.names = TRUE)

Then inside our function we can read the file whose path is passed in:

  genSMA <- function(x) {
        nextfile <- read.table(x, header=TRUE, sep=",", stringsAsFactors=FALSE)  # reads the file at path x
        new.df <- data.frame(nextfile)  # putting inside data.frame

        # ALL YOUR COMMANDS HERE

  }

And we call the loop in the same way. This time we want to run the loop over every single file in our directory, whose paths we stored in the files variable.

# Loop for running function over all files
for (i in seq_along(files)){
  tryCatch({
    genSMA(files[[i]])
  }, error = function(e) { print(paste("i =", i, "failed:")) })
}

Here we run the function over every file in the directory listed in the files variable. Note that I do not specify any output of the processing above. We may output a plot or store the final results in a data frame. I will revisit! However, the main topic is addressed: successfully ignore a failure in the loop and continue to process the remaining iterations.
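One possible way to keep the results is to collect each successful output in a list and bind them together at the end. The sketch below is an assumption-laden illustration rather than the final method: it writes three small CSV files to a temporary directory (demo.dir, the file names, and summarise_file are all hypothetical), one of which lacks a Close column so that it fails.

```r
# Hypothetical demo directory and files, created under tempdir()
demo.dir <- file.path(tempdir(), "sma_demo")
dir.create(demo.dir, showWarnings = FALSE)
write.csv(data.frame(Close = 1:20), file.path(demo.dir, "a.csv"), row.names = FALSE)
write.csv(data.frame(Price = 1:20), file.path(demo.dir, "b.csv"), row.names = FALSE)  # no Close column
write.csv(data.frame(Close = 1:15), file.path(demo.dir, "c.csv"), row.names = FALSE)

csv.files <- list.files(path = demo.dir, pattern = "\\.csv$", full.names = TRUE)

# summarise_file is a hypothetical stand-in for genSMA
summarise_file <- function(path) {
  df <- read.table(path, header = TRUE, sep = ",", stringsAsFactors = FALSE)
  if (!"Close" %in% names(df)) stop("no Close column in ", basename(path))
  data.frame(file = basename(path), mean.close = mean(df$Close),
             stringsAsFactors = FALSE)
}

results <- list()
for (i in seq_along(csv.files)) {
  out <- tryCatch(
    summarise_file(csv.files[[i]]),
    error = function(e) {
      print(paste("i =", i, "failed:", conditionMessage(e)))
      NULL   # signal failure to the code below
    }
  )
  # Only store successful results; failed iterations are skipped
  if (!is.null(out)) results[[basename(csv.files[[i]])]] <- out
}

combined <- do.call(rbind, results)   # one row per file that succeeded
```

Here combined holds one row each for a.csv and c.csv, while b.csv is reported as failed and left out.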

Author: Andrew Bannerman

Integrity Inspector. Quantitative Analysis is a favorite pastime.
