Profiling & Benchmarking

  • Improved performance comes from iteration and from learning the most common pitfalls

  • Don’t sweat the small stuff: weigh coder time vs. run time vs. compute costs

  • Measure it, or it didn’t happen

  • “Premature optimization is the root of all evil (or at least most of it) in programming.” - Knuth

How do we measure?

The simplest tool is base R's system.time(), which can be used to wrap any other call (or block of calls).

system.time(rnorm(1e6))
##    user  system elapsed 
##   0.124   0.004   0.129
system.time(rnorm(1e4) %*% t(rnorm(1e4)))
##    user  system elapsed 
##   0.653   0.270   0.626
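
system.time() can also time several expressions at once if they are wrapped in a braced block; the rnorm()/sort() calls below are just placeholder work:

system.time({
  x = rnorm(1e6)   # generate some data
  sort(x)          # do something moderately expensive with it
})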

Better benchmarking (pt. 1)

We can get better precision, and repeated timings of each expression, using the microbenchmark package

install.packages("microbenchmark")
library(microbenchmark)

d = abs(rnorm(1000))
r = microbenchmark(
      exp(log(d)/2),
      d^0.5,
      sqrt(d),
      times = 1000
    )
print(r)
## Unit: microseconds
##           expr    min      lq      mean  median      uq     max neval
##  exp(log(d)/2) 22.553 26.3470 29.395837 26.7230 29.3295 228.481  1000
##          d^0.5 38.245 41.8770 46.302536 42.2365 46.5600 160.217  1000
##        sqrt(d)  4.826  8.1985  9.316313  8.5355  9.2270 103.737  1000

boxplot(r)

Better benchmarking (pt. 2)

We can also do this with the rbenchmark package, which reports relative timings across expressions

install.packages("rbenchmark")
library(rbenchmark)

d = abs(rnorm(1000))
benchmark(
  exp(log(d)/2),
  d^0.5,
  sqrt(d),
  replications = 1000,
  order = "relative"
)
##            test replications elapsed relative user.self sys.self user.child sys.child
## 3       sqrt(d)         1000   0.011    1.000     0.010    0.000          0         0
## 1 exp(log(d)/2)         1000   0.029    2.636     0.029    0.001          0         0
## 2         d^0.5         1000   0.064    5.818     0.059    0.004          0         0

Profiling



Live Demo
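
The demo itself isn't captured in these notes; as a minimal sketch, base R's sampling profiler can be driven with Rprof() (the profvis package gives an interactive view of the same information). The function f() and the file name "prof.out" below are made up for illustration.

f = function(n = 1e6)
{
    x = rnorm(n)          # random number generation
    x = sort(x)           # sorting dominates for large n
    sum(sqrt(abs(x)))     # a little arithmetic on the result
}

Rprof("prof.out")         # start the sampling profiler, writing samples to prof.out
for(i in 1:20) f()        # run the code of interest
Rprof(NULL)               # stop profiling

summaryRprof("prof.out")  # summarize time spent per function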

Exercise 1

Earlier we mentioned that growing a vector as you collect results is bad; just how bad is it? Benchmark the following three functions and compare their performance.

good = function()
{
    # preallocate the result vector, then fill it in place
    res = rep(NA, 1e4)
    for(i in seq_along(res))
    {
        res[i] = sqrt(i)
    }
    res
}
bad = function()
{
    # grow the result one element at a time with c()
    res = numeric()
    for(i in 1:1e4)
    {
        res = c(res, sqrt(i))
    }
    res
}
best = function()
{
    # fully vectorized - no explicit loop
    sqrt(1:1e4)
}
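
One way to run the comparison, as a sketch using the microbenchmark package from earlier (10 iterations keeps the slow, vector-growing version tolerable):

library(microbenchmark)

microbenchmark(
  good(),
  bad(),
  best(),
  times = 10
)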

Exercise 2

Let's compare looping vs. the apply function vs. dplyr.

  • First we will construct a large data frame

set.seed(523)
d = data.frame(matrix(rnorm(1e5 * 10), ncol = 10))

  • Implement functions that will find the largest value in each row using each of the following (the apply-based version is sketched after this list):

    • The apply function
    • A single for loop
    • dplyr

  • Benchmark all of your preceding functions using the data frame d. Which is the fastest? Why do you think this is the case? 10 replicates per function is sufficient.

  • Construct a smaller subset of d by taking only the first 100 rows and rerun your benchmarks on this smaller subset. Did anything change?
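
As a starting point, here is a sketch of the apply-based implementation together with a benchmark call using 10 replications; the function name row_max_apply is an illustrative choice, not part of the exercise.

library(rbenchmark)

# apply over rows (MARGIN = 1), taking the max of each row
row_max_apply = function(df)
{
    apply(df, 1, max)
}

benchmark(
  row_max_apply(d),
  replications = 10,
  order = "relative"
)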