Profiling & Benchmarking

  • Improved performance comes from iteration and from learning the most common pitfalls

  • Don’t sweat the small stuff: weigh coder time vs. run time vs. compute costs

  • Measure it, or it didn’t happen

  • “Premature optimization is the root of all evil (or at least most of it) in programming.” - Knuth

How do we measure?

The simplest tool is base R's system.time(), which can be used to wrap any other call (or block of calls).

system.time(rnorm(1e6))
##    user  system elapsed 
##   0.124   0.004   0.129
system.time(rnorm(1e4) %*% t(rnorm(1e4)))
##    user  system elapsed 
##   0.653   0.270   0.626
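
system.time() can also time several expressions at once if they are wrapped in a braced block; the rnorm()/sort() calls below are just placeholder work:

system.time({
  x = rnorm(1e6)   # generate some data
  sort(x)          # do something moderately expensive with it
})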

Better benchmarking (pt. 1)

We can get better precision, and repeated timings of each expression, using the microbenchmark package

install.packages("microbenchmark")
library(microbenchmark)

d = abs(rnorm(1000))
r = microbenchmark(
      exp(log(d)/2),
      d^0.5,
      sqrt(d),
      times = 1000
    )
print(r)
## Unit: microseconds
##           expr    min      lq      mean  median      uq     max neval
##  exp(log(d)/2) 22.553 26.3470 29.395837 26.7230 29.3295 228.481  1000
##          d^0.5 38.245 41.8770 46.302536 42.2365 46.5600 160.217  1000
##        sqrt(d)  4.826  8.1985  9.316313  8.5355  9.2270 103.737  1000

boxplot(r)

Better benchmarking (pt. 2)

We can also do this with the rbenchmark package, which reports relative timings across expressions

install.packages("rbenchmark")
library(rbenchmark)

d = abs(rnorm(1000))
benchmark(
  exp(log(d)/2),
  d^0.5,
  sqrt(d),
  replications = 1000,
  order = "relative"
)
##            test replications elapsed relative user.self sys.self user.child sys.child
## 3       sqrt(d)         1000   0.011    1.000     0.010    0.000          0         0
## 1 exp(log(d)/2)         1000   0.029    2.636     0.029    0.001          0         0
## 2         d^0.5         1000   0.064    5.818     0.059    0.004          0         0

Profiling



Live Demo
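
The demo itself isn't captured in these notes; as a minimal sketch, base R's sampling profiler can be driven with Rprof() (the profvis package gives an interactive view of the same information). The function f() and the file name "prof.out" below are made up for illustration.

f = function(n = 1e6)
{
    x = rnorm(n)          # random number generation
    x = sort(x)           # sorting dominates for large n
    sum(sqrt(abs(x)))     # a little arithmetic on the result
}

Rprof("prof.out")         # start the sampling profiler, writing samples to prof.out
for(i in 1:20) f()        # run the code of interest
Rprof(NULL)               # stop profiling

summaryRprof("prof.out")  # summarize time spent per function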

Exercise 1

Earlier we mentioned that growing a vector as you collect results is bad; just how bad is it? Benchmark the following three functions and compare their performance.

good = function()
{
    # preallocate the result vector, then fill it in place
    res = rep(NA, 1e4)
    for(i in seq_along(res))
    {
        res[i] = sqrt(i)
    }
    res
}
bad = function()
{
    # grow the result one element at a time with c()
    res = numeric()
    for(i in 1:1e4)
    {
        res = c(res, sqrt(i))
    }
    res
}
best = function()
{
    # fully vectorized - no explicit loop
    sqrt(1:1e4)
}
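
One way to run the comparison, as a sketch using the microbenchmark package from earlier (10 iterations keeps the slow, vector-growing version tolerable):

library(microbenchmark)

microbenchmark(
  good(),
  bad(),
  best(),
  times = 10
)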

Exercise 2

Let's compare looping vs. the apply function vs. dplyr.

  • First we will construct a large data frame

set.seed(523)
d = data.frame(matrix(rnorm(1e5 * 10), ncol = 10))

  • Implement functions that will find the largest value in each row using each of the following (the apply-based version is sketched after this list):

    • The apply function
    • A single for loop
    • dplyr

  • Benchmark all of your preceding functions using the data frame d. Which is the fastest? Why do you think this is the case? 10 replicates per function is sufficient.

  • Construct a smaller subset of d by taking only the first 100 rows and rerun your benchmarks on this smaller subset. Did anything change?
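
As a starting point, here is a sketch of the apply-based implementation together with a benchmark call using 10 replications; the function name row_max_apply is an illustrative choice, not part of the exercise.

library(rbenchmark)

# apply over rows (MARGIN = 1), taking the max of each row
row_max_apply = function(df)
{
    apply(df, 1, max)
}

benchmark(
  row_max_apply(d),
  replications = 10,
  order = "relative"
)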