R profiling basics and tools

I needed to profile a few functions, so I had to delve into the basics of R’s profiling options.

Base R ships a sampling profiler, Rprof(), whose output is summarized by summaryRprof(). Heavier-weight options include the proftools and profr packages. Hadley Wickham also wrote the useful lineprof package, which can be installed from GitHub via

devtools::install_github("hadley/lineprof")

Another package worth installing is microbenchmark.
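As a rough illustration of the base R tools mentioned above, here is a minimal sketch using Rprof() and summaryRprof(). The function name slow_increment, the one-second busy loop, and the temporary file are assumptions made for this example only. Because the profiler works by sampling, a very short workload may collect no samples at all, hence the loop.

```r
# Hypothetical workload, named for this sketch only: an element-by-element loop.
slow_increment <- function(input) {
  output <- numeric(length(input))
  for (i in seq_along(input)) {
    output[i] <- input[i] + 1
  }
  output
}

data <- rbinom(10000, 1, 0.5)
prof_file <- tempfile(fileext = ".out")

Rprof(prof_file)   # start sampling the call stack into prof_file
start <- Sys.time()
# Keep the workload busy for about one second so the sampler has something to record.
while (as.numeric(difftime(Sys.time(), start, units = "secs")) < 1) {
  result <- slow_increment(data)
}
Rprof(NULL)        # stop profiling

prof <- summaryRprof(prof_file)
head(prof$by.self)   # time attributed to each function itself
prof$sampling.time   # total profiled time, in seconds
```

The by.self table is usually the first place to look, since it shows where time is spent directly rather than in callees.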

With these tools, we can do some simple profiling. Let’s benchmark three implementations of the same operation in R and see how much their performance differs. First, let’s create some sample data to work with.

library(microbenchmark)
 
data <- rbinom(10000, 1, 0.5)

Now, let’s define three functions that each add 1 to every element of the input: one fully vectorized, one hybrid that preallocates an output vector before doing the vectorized addition, and one that does the same work with an explicit for loop.

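# Fully vectorized: a single arithmetic call over the whole vector.
f <- function(input) {
    output <- input + 1
    return(output)
}

# Hybrid: preallocates a result vector, then overwrites it with the vectorized result.
g <- function(input) {
    output <- rep(0, length(input))
    output <- input + 1
    return(output)
}

# Explicit loop: preallocates, then fills in one element at a time.
h <- function(input) {
    output <- rep(0, length(input))
    for (i in seq_along(input)) {  # seq_along() also handles zero-length input
        output[i] <- input[i] + 1
    }
    return(output)
}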
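Before timing anything, it is worth confirming that the three implementations really do agree. The following is a small sketch of such a check. The functions are repeated in compact form so the snippet runs on its own, and the variable name results_match is an assumption of this example.

```r
data <- rbinom(10000, 1, 0.5)

# Same three functions as above, repeated compactly so this snippet is self-contained.
f <- function(input) input + 1
g <- function(input) {
  output <- rep(0, length(input))
  output <- input + 1
  output
}
h <- function(input) {
  output <- rep(0, length(input))
  for (i in seq_along(input)) output[i] <- input[i] + 1
  output
}

# identical() compares type, length, values, and attributes exactly.
results_match <- identical(f(data), g(data)) && identical(g(data), h(data))
print(results_match)
```

If a comparison like this ever involves floating-point arithmetic done in a different order, all.equal() is the more forgiving choice.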

All three functions return the same result for the same input. But what about their performance? On my machine, I got

> microbenchmark(f(data), g(data), h(data))
Unit: nanoseconds
    expr   min      lq     mean  median      uq    max neval
 f(data)   743   859.0  1327.09  1002.5  1339.5  12740   100
 g(data)  1264  1411.0  2063.42  1722.5  2097.0  12287   100
 h(data) 83717 85269.5 90690.37 87546.0 94976.5 136717   100

which shows how costly it is to skip vectorization, even in a simple R scenario: at the median, the explicit loop in h() is roughly 87 times slower than the vectorized f(), while the redundant preallocation in g() costs less than a factor of two. Vectorize! Always vectorize! `microbenchmark()` is simple to use and gives good enough results for a first-order analysis. There is much more you can do with `lineprof()` and the other tools, and I’ll return to those. In the meantime, here are a couple of links: Hadley’s page on profiling, and a nice, unrelated page on debugging from Duncan Murdoch.
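To read a benchmark as ratios rather than raw nanoseconds, the summary of a microbenchmark result can be rescaled. This is a minimal sketch, assuming the microbenchmark package is installed. The choice of times = 200 and the variable names timings and relative are assumptions of this example, and the numbers will differ from machine to machine.

```r
library(microbenchmark)

# Same data and functions as in the post, repeated so the snippet runs on its own.
data <- rbinom(10000, 1, 0.5)
f <- function(input) input + 1
h <- function(input) {
  output <- numeric(length(input))
  for (i in seq_along(input)) output[i] <- input[i] + 1
  output
}

# times = 200 is an arbitrary choice for this sketch; the default is 100.
timings <- microbenchmark(f(data), h(data), times = 200)

# unit = "relative" reports timings relative to the fastest expression.
relative <- summary(timings, unit = "relative")
print(relative[, c("expr", "median")])
```

The median column is generally a more stable point of comparison than the mean, which a few slow runs (for example, during garbage collection) can inflate.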
