Improved performance comes from iteration and from learning the most common pitfalls.
Don’t sweat the small stuff - coder time vs. run time vs. compute costs.
Measure it, or it didn’t happen.
“Premature optimization is the root of all evil (or at least most of it) in programming.” - Knuth
The simplest tool is R’s base system.time(), which can be used to wrap any other call or calls.
system.time(rnorm(1e6))
##    user  system elapsed
##   0.127   0.004   0.133
system.time(rnorm(1e4) %*% t(rnorm(1e4)))
##    user  system elapsed
##   0.686   0.302   0.684
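system.time() can also wrap a multi-line block of code; just enclose the expressions in braces. A minimal sketch (timings will vary by machine):
# several steps timed together as a single unit
system.time({
  x = rnorm(1e6)
  y = sort(x)
})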
We can do better (higher precision) using the microbenchmark package:
install.packages("microbenchmark")
library(microbenchmark)
d = abs(rnorm(1000))
r = microbenchmark(
exp(log(d)/2),
d^0.5,
sqrt(d),
times = 1000
)
print(r)
## Unit: microseconds
##          expr    min      lq     mean  median      uq     max neval
## exp(log(d)/2) 22.617 26.5485 31.86481 27.0680 32.2190 274.589  1000
##         d^0.5 38.218 42.0210 49.96323 42.7060 51.6990 284.438  1000
##       sqrt(d)  4.796  8.3195 10.31054  8.7515  9.7475 105.697  1000
boxplot(r)
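If the ggplot2 package is installed, microbenchmark also provides an autoplot() method as an alternative to the base boxplot - a minimal sketch:
library(ggplot2) # needed for the autoplot() method
autoplot(r)      # plots the timing distribution of each expression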
We can also do better using the rbenchmark package
install.packages("rbenchmark")
library(rbenchmark)
d = abs(rnorm(1000))
benchmark(
exp(log(d)/2),
d^0.5,
sqrt(d),
replications = 1000,
order = "relative"
)
##            test replications elapsed relative user.self sys.self user.child sys.child
## 3       sqrt(d)         1000   0.009    1.000     0.008    0.001          0         0
## 1 exp(log(d)/2)         1000   0.030    3.333     0.029    0.002          0         0
## 2         d^0.5         1000   0.047    5.222     0.045    0.001          0         0
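Note that the relative column reports each expression’s elapsed time as a multiple of the fastest (here sqrt(d)); order = "relative" sorts the results by that column.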
Live Demo
Earlier we mentioned that growing a vector as you collect results is bad; just how bad is it? Benchmark the following three functions and compare their performance.
good = function()
{
  # preallocate the result vector, then fill it in place
  res = rep(NA, 1e4)
  for(i in seq_along(res))
  {
    res[i] = sqrt(i)
  }
  res
}
bad = function()
{
  # grow the vector one element at a time, forcing repeated reallocation
  res = numeric()
  for(i in 1:1e4)
  {
    res = c(res, sqrt(i))
  }
  res
}
best = function()
{
  # fully vectorized - no explicit loop at all
  sqrt(1:1e4)
}
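One way to run this comparison, using microbenchmark from above - a sketch, with exact timings varying by machine:
library(microbenchmark)
# growing (bad) vs. preallocating (good) vs. vectorizing (best)
r = microbenchmark(good(), bad(), best(), times = 10)
print(r)
boxplot(r)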
Let’s compare looping vs. the apply function vs. dplyr.
set.seed(523)
d = data.frame(matrix(rnorm(1e5 * 10), ncol = 10))
Implement functions that will find the largest value in each row using
- the apply function
- a for loop
Benchmark all of your preceding functions using the data frame d. Which is the fastest, and why do you think this is the case? 10 replicates per function is sufficient.
Construct a smaller subset of d by taking only the first 100 rows, then rerun your benchmarks on this smaller subset. Did anything change?
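One possible set of implementations, sketched with microbenchmark (the names apply_max, loop_max, and dplyr_max are our own, and the dplyr version assumes dplyr >= 1.0 for rowwise() and c_across()):
library(dplyr)
library(microbenchmark)

# apply() across rows (MARGIN = 1)
apply_max = function(df) apply(df, 1, max)

# for loop with a preallocated result vector; extracting
# data frame rows one at a time is expensive
loop_max = function(df)
{
  res = rep(NA_real_, nrow(df))
  for(i in seq_len(nrow(df)))
  {
    res[i] = max(df[i, ])
  }
  res
}

# dplyr: rowwise() with c_across() to take each row's max
dplyr_max = function(df)
{
  df %>%
    rowwise() %>%
    mutate(row_max = max(c_across(everything()))) %>%
    pull(row_max)
}

microbenchmark(apply_max(d), loop_max(d), dplyr_max(d), times = 10)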