First-class functions
Pure functions
Anonymous functions
Vectorized functions
Closures
Recursion
The apply functions are a collection of tools for functional programming in R; they are all variations of the map function.
??apply
##
## Help files with alias or concept or title matching ‘apply’ using fuzzy
## matching:
##
## base::apply Apply Functions Over Array Margins
## base::.subset Internal Objects in Package 'base'
## base::by Apply a Function to a Data Frame Split by Factors
## base::eapply Apply a Function Over Values in an Environment
## base::lapply Apply a Function over a List or Vector
## base::mapply Apply a Function to Multiple List or Vector Arguments
## base::rapply Recursively Apply a Function to a List
## base::tapply Apply a Function Over a Ragged Array
Usage: lapply(X, FUN, ...)
lapply returns a list of the same length as X, each element of which is the result of applying FUN to the corresponding element of X.
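The description above can be sketched as an explicit loop. This is only an illustration of the behaviour, not how lapply is implemented internally; my_lapply is a hypothetical name introduced here.

```r
# my_lapply is a hypothetical name used only for this illustration
my_lapply = function(X, FUN, ...) {
  out = vector("list", length(X))        # one slot per element of X
  for (i in seq_along(X)) {
    # wrap in list() so that a NULL result is stored rather than dropped;
    # extra arguments in ... are forwarded to FUN
    out[i] = list(FUN(X[[i]], ...))
  }
  names(out) = names(X)                  # lapply keeps the names of X
  out
}

res = my_lapply(1:8, sqrt)
identical(res, lapply(1:8, sqrt))
```

Writing the loop out shows what the functional hides: allocation, indexing, and bookkeeping of names.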
lapply(1:8, sqrt) %>% str()
## List of 8
## $ : num 1
## $ : num 1.41
## $ : num 1.73
## $ : num 2
## $ : num 2.24
## $ : num 2.45
## $ : num 2.65
## $ : num 2.83
lapply(1:8, function(x) (x+1)^2) %>% str()
## List of 8
## $ : num 4
## $ : num 9
## $ : num 16
## $ : num 25
## $ : num 36
## $ : num 49
## $ : num 64
## $ : num 81
lapply(1:8, function(x, pow) x^pow, pow=3) %>% str()
## List of 8
## $ : num 1
## $ : num 8
## $ : num 27
## $ : num 64
## $ : num 125
## $ : num 216
## $ : num 343
## $ : num 512
lapply(1:8, function(x, pow) x^pow, x=2) %>% str()
## List of 8
## $ : num 2
## $ : num 4
## $ : num 8
## $ : num 16
## $ : num 32
## $ : num 64
## $ : num 128
## $ : num 256
d = list(n = rnorm(100), e = rexp(100), ln = rlnorm(100))
lapply(d, quantile) %>% str()
## List of 3
## $ n : Named num [1:5] -2.831 -0.76 -0.179 0.611 2.394
## ..- attr(*, "names")= chr [1:5] "0%" "25%" "50%" "75%" ...
## $ e : Named num [1:5] 0.0103 0.2121 0.5628 1.2516 4.6757
## ..- attr(*, "names")= chr [1:5] "0%" "25%" "50%" "75%" ...
## $ ln: Named num [1:5] 0.0745 0.4207 0.7146 1.3067 12.9503
## ..- attr(*, "names")= chr [1:5] "0%" "25%" "50%" "75%" ...
Usage: sapply(X, FUN, ..., simplify = TRUE, USE.NAMES = TRUE)
sapply is a user-friendly wrapper around lapply that by default simplifies the result to a vector or matrix (or, with simplify = "array", an array) where appropriate.
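A small sketch of how the two functions relate, using only base R. The variable name squares is chosen here for illustration.

```r
squares = lapply(1:4, function(x) x^2)      # always a list

# When every result has length 1, sapply's result matches unlisting lapply's
identical(sapply(1:4, function(x) x^2), unlist(squares))

# simplify = FALSE switches the simplification off and gives the list back
identical(sapply(1:4, function(x) x^2, simplify = FALSE), squares)

# USE.NAMES = TRUE names the result after X when X is a character vector
sapply(c("a", "bb", "ccc"), nchar)
```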
sapply(1:8, sqrt)
## [1] 1.000000 1.414214 1.732051 2.000000 2.236068 2.449490 2.645751 2.828427
sapply(1:8, function(x) (x+1)^2)
## [1] 4 9 16 25 36 49 64 81
sapply(1:8, function(x) c(x, x^2, x^3, x^4))
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
## [1,] 1 2 3 4 5 6 7 8
## [2,] 1 4 9 16 25 36 49 64
## [3,] 1 8 27 64 125 216 343 512
## [4,] 1 16 81 256 625 1296 2401 4096
sapply(1:8, function(x) list(x, x^2, x^3, x^4))
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
## [1,] 1 2 3 4 5 6 7 8
## [2,] 1 4 9 16 25 36 49 64
## [3,] 1 8 27 64 125 216 343 512
## [4,] 1 16 81 256 625 1296 2401 4096
d = list(norm = rnorm(100), exp = rexp(100), log_norm = rlnorm(100))
sapply(d, quantile)
## norm exp log_norm
## 0% -3.62023428 0.005119925 0.1554211
## 25% -0.56898037 0.295745680 0.5194694
## 50% 0.03760506 0.767683150 1.0565506
## 75% 0.58875274 1.141757924 2.1052268
## 100% 2.17344045 5.754529196 16.9475848
sapply(2:6, seq)
## [[1]]
## [1] 1 2
##
## [[2]]
## [1] 1 2 3
##
## [[3]]
## [1] 1 2 3 4
##
## [[4]]
## [1] 1 2 3 4 5
##
## [[5]]
## [1] 1 2 3 4 5 6
Usage: vapply(X, FUN, FUN.VALUE, ..., USE.NAMES = TRUE)
vapply is similar to sapply, but has a pre-specified type of return value, so it can be safer (and sometimes faster) to use.
d = list(1:3, 1:7, c(1,1,2,3,4))
sapply(d, function(x) x[x==2])
## [1] 2 2 2
sapply(d, function(x) x[x==1]) %>% str()
## List of 3
## $ : int 1
## $ : int 1
## $ : num [1:2] 1 1
vapply(d, function(x) x[x==2], 1)
## [1] 2 2 2
vapply(d, function(x) x[x==1], 1)
## Error in vapply(d, function(x) x[x == 1], 1): values must be length 1,
## but FUN(X[[3]]) result is length 2
vapply(1:3, function(x) c(x,letters[x]), c(1,1))
## Error in vapply(1:3, function(x) c(x, letters[x]), c(1, 1)): values must be type 'double',
## but FUN(X[[1]]) result is type 'character'
vapply(1:3, function(x) c(x,letters[x]), c("",""))
## [,1] [,2] [,3]
## [1,] "1" "2" "3"
## [2,] "a" "b" "c"
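Another case where the pre-specified return type helps is empty input. The sketch below uses only base R: sapply has nothing to simplify and falls back to an empty list, while vapply still returns the requested type.

```r
empty = list()

str(sapply(empty, length))               # an empty list, not an integer vector
str(vapply(empty, length, integer(1)))   # an empty integer vector, as requested
```

Code that goes on to do arithmetic with the result behaves the same for empty and non-empty input only in the vapply version.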
We can easily use these functions with data frames; the key is to remember that a data frame is just a fancy list whose columns are (typically atomic) vectors of the same length.
df = data.frame(a = 1:6, b = letters[1:6], c = c(TRUE,FALSE), stringsAsFactors = TRUE) # set explicitly: the default became FALSE in R 4.0.0
lapply(df, class) %>% str()
## List of 3
## $ a: chr "integer"
## $ b: chr "factor"
## $ c: chr "logical"
sapply(df, class)
## a b c
## "integer" "factor" "logical"
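Because a data frame is a list of columns, the result of sapply can be used directly to pick columns. A minimal sketch; the scores data frame and its column names are made up for this example.

```r
# scores is a made-up data frame used only for this illustration
scores = data.frame(id = 1:3, name = c("x", "y", "z"), value = c(0.5, 1.5, 2.5))

is.list(scores)        # a data frame is a list ...
length(scores)         # ... whose length is the number of columns

# sapply visits the columns, so a logical result can select columns
num_cols = sapply(scores, is.numeric)
colMeans(scores[num_cols])
```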
By default, the results of each function call within sapply are usually placed into the columns of the result matrix. If we would rather have the results form the rows, for example when constructing a data frame, a useful approach is to combine lapply and do.call.
l = lapply(1:8, function(x) list(LETTERS[x], x, x^2, x^3, x^4))
str(l)
## List of 8
## $ :List of 5
## ..$ : chr "A"
## ..$ : int 1
## ..$ : num 1
## ..$ : num 1
## ..$ : num 1
## $ :List of 5
## ..$ : chr "B"
## ..$ : int 2
## ..$ : num 4
## ..$ : num 8
## ..$ : num 16
## $ :List of 5
## ..$ : chr "C"
## ..$ : int 3
## ..$ : num 9
## ..$ : num 27
## ..$ : num 81
## $ :List of 5
## ..$ : chr "D"
## ..$ : int 4
## ..$ : num 16
## ..$ : num 64
## ..$ : num 256
## $ :List of 5
## ..$ : chr "E"
## ..$ : int 5
## ..$ : num 25
## ..$ : num 125
## ..$ : num 625
## $ :List of 5
## ..$ : chr "F"
## ..$ : int 6
## ..$ : num 36
## ..$ : num 216
## ..$ : num 1296
## $ :List of 5
## ..$ : chr "G"
## ..$ : int 7
## ..$ : num 49
## ..$ : num 343
## ..$ : num 2401
## $ :List of 5
## ..$ : chr "H"
## ..$ : int 8
## ..$ : num 64
## ..$ : num 512
## ..$ : num 4096
do.call(rbind, l)
## [,1] [,2] [,3] [,4] [,5]
## [1,] "A" 1 1 1 1
## [2,] "B" 2 4 8 16
## [3,] "C" 3 9 27 81
## [4,] "D" 4 16 64 256
## [5,] "E" 5 25 125 625
## [6,] "F" 6 36 216 1296
## [7,] "G" 7 49 343 2401
## [8,] "H" 8 64 512 4096
do.call(rbind, l) is the equivalent of passing all the elements of l as arguments to rbind, e.g.
rbind(l[[1]], l[[2]], l[[3]], l[[4]],
l[[5]], l[[6]], l[[7]], l[[8]])
## [,1] [,2] [,3] [,4] [,5]
## [1,] "A" 1 1 1 1
## [2,] "B" 2 4 8 16
## [3,] "C" 3 9 27 81
## [4,] "D" 4 16 64 256
## [5,] "E" 5 25 125 625
## [6,] "F" 6 36 216 1296
## [7,] "G" 7 49 343 2401
## [8,] "H" 8 64 512 4096
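The same mechanism works for any function, and names in the list become argument names in the call. A short sketch with base R functions; the variable name args is chosen here for illustration.

```r
args = list("a", "b", "c", sep = "-")

do.call(paste, args)                 # same call as paste("a", "b", "c", sep = "-")
do.call("sum", list(1, 2, 3))        # the function can also be given by name
```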
l2 = lapply(1:8, function(x) data.frame(x, x^2, x^3, x^4))
do.call(rbind, l2)
## x x.2 x.3 x.4
## 1 1 1 1 1
## 2 2 4 8 16
## 3 3 9 27 81
## 4 4 16 64 256
## 5 5 25 125 625
## 6 6 36 216 1296
## 7 7 49 343 2401
## 8 8 64 512 4096
Usage: apply(X, MARGIN, FUN, ...)
Apply a function to margins of an array, matrix, or data frame.
(m = matrix(1:12, nrow=4, ncol=3))
## [,1] [,2] [,3]
## [1,] 1 5 9
## [2,] 2 6 10
## [3,] 3 7 11
## [4,] 4 8 12
apply(m, 1, mean)
## [1] 5 6 7 8
apply(m, 2, mean)
## [1] 2.5 6.5 10.5
apply(m, 1:2, mean)
## [,1] [,2] [,3]
## [1,] 1 5 9
## [2,] 2 6 10
## [3,] 3 7 11
## [4,] 4 8 12
(df = data.frame(a=1:3, b=4:6, c=7:9))
## a b c
## 1 1 4 7
## 2 2 5 8
## 3 3 6 9
apply(df, 1, mean)
## [1] 4 5 6
apply(df, 1, mean) %>% str()
## num [1:3] 4 5 6
apply(df, 2, mean)
## a b c
## 2 5 8
apply(df, 2, mean) %>% str()
## Named num [1:3] 2 5 8
## - attr(*, "names")= chr [1:3] "a" "b" "c"
(a = array(1:27,c(3,3,3)))
## , , 1
##
## [,1] [,2] [,3]
## [1,] 1 4 7
## [2,] 2 5 8
## [3,] 3 6 9
##
## , , 2
##
## [,1] [,2] [,3]
## [1,] 10 13 16
## [2,] 11 14 17
## [3,] 12 15 18
##
## , , 3
##
## [,1] [,2] [,3]
## [1,] 19 22 25
## [2,] 20 23 26
## [3,] 21 24 27
apply(a, 1, sum)
## [1] 117 126 135
apply(a, 2, sum)
## [1] 99 126 153
apply(a, 3, sum)
## [1] 45 126 207
apply(a, 1:2, sum)
## [,1] [,2] [,3]
## [1,] 30 39 48
## [2,] 33 42 51
## [3,] 36 45 54
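The examples above all use functions that return a single value. When FUN returns a vector, apply places each result in a column, so applying over rows yields a transposed layout. A small sketch under that assumption:

```r
m = matrix(1:12, nrow = 4, ncol = 3)

doubled = apply(m, 1, function(r) r * 2)   # FUN returns a vector of length 3 for each row
dim(doubled)                                # 3 4: one column per row of m
all(doubled == t(m * 2))                    # transposing recovers the layout of m
```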
Usage: tapply(X, INDEX, FUN = NULL, ..., simplify = TRUE)
Apply a function to each (non-empty) group of values from X as specified by a unique combination of the levels of INDEX.
(df = data.frame(data = 3:11, cat1 = rep(1:3,3),
cat2=rep(1:2,c(4,5))))
## data cat1 cat2
## 1 3 1 1
## 2 4 2 1
## 3 5 3 1
## 4 6 1 1
## 5 7 2 2
## 6 8 3 2
## 7 9 1 2
## 8 10 2 2
## 9 11 3 2
tapply(df$data, df$cat1, sum)
## 1 2 3
## 18 21 24
tapply(df$data, df[,2:3], sum)
## cat2
## cat1 1 2
## 1 9 9
## 2 4 17
## 3 5 19
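For a single grouping variable, the grouping step can be made explicit with split followed by sapply. This is a sketch of the idea rather than a description of how tapply is implemented; x and g are made-up data.

```r
x = c(3, 4, 5, 6, 7, 8)
g = c("a", "b", "a", "b", "a", "b")

by_tapply = tapply(x, g, sum)          # 1-d array with the groups as dimnames
by_split  = sapply(split(x, g), sum)   # split into a list per group, then map

all(by_tapply == by_split)             # same sums for each group
names(by_split)
```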
purrr is a package by Hadley Wickham that improves functional programming in R, with a focus on pure and type-stable functions.
Basic functions for looping over an object and returning a value (of a specific type); a replacement for lapply/sapply/vapply.
map() - returns a list.
map_lgl() - returns a logical vector.
map_int() - returns an integer vector.
map_dbl() - returns a double vector.
map_chr() - returns a character vector.
map_df() - returns a data frame.
walk() - returns its input invisibly; calls the function exclusively for its side effects.
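A brief sketch of these variants on one input, assuming the purrr package is installed. The list x is made up for the example.

```r
library(purrr)   # assumes the purrr package is installed

x = list(1:3, 4:6, 7:9)

map(x, length)                       # list
map_int(x, length)                   # integer vector
map_dbl(x, mean)                     # double vector
map_lgl(x, function(v) any(v > 5))   # logical vector

# walk() is called for its side effect (printing here) and returns x invisibly
out = walk(x, function(v) cat(sum(v), "\n"))
identical(out, x)
```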
R is a weakly / dynamically typed language, which means a function definition has no built-in way to declare or enforce the types of its arguments or return value.
This flexibility can be useful at times, but often it makes it hard to reason about your code and requires more verbose code to handle edge cases.
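To make the "more verbose code" concrete, here is a sketch of the hand-written checks a base R function needs if it wants to constrain its input and output. safe_sqrt is a hypothetical helper introduced only for this example.

```r
# safe_sqrt is a hypothetical helper used only for this illustration
safe_sqrt = function(x) {
  # R has no type annotations, so the argument check is done by hand at run time
  stopifnot(is.numeric(x), all(x >= 0))
  res = sqrt(x)
  stopifnot(is.double(res))   # hand-written check on the return type as well
  res
}

safe_sqrt(c(1, 4, 9))
try(safe_sqrt("4"))   # fails the is.numeric() check instead of reaching sqrt()
```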
map_dbl(list(rnorm(1e3),rnorm(1e3),rnorm(1e3)), mean)
## [1] -0.02980877 -0.02168100 0.04525821
map_chr(list(rnorm(1e3),rnorm(1e3),rnorm(1e3)), mean)
## [1] "0.051568" "0.012061" "0.010361"
map_int(list(rnorm(1e3),rnorm(1e3),rnorm(1e3)), mean)
## Error: Can't coerce element 1 from a double to a integer
An anonymous function is one that is never given a name (i.e. never assigned to a variable).
sapply(1:10, function(x) x^(x+1))
## [1] 1 8 81 1024 15625 279936 5764801 134217728
## [9] 3486784401 100000000000
purrr lets us write anonymous functions using one-sided formulas, where the first argument is referred to as . (or .x).
map_dbl(1:10, ~ .^(.+1))
## [1] 1 8 81 1024 15625 279936 5764801 134217728
## [9] 3486784401 100000000000
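A short comparison of the two spellings, assuming purrr is installed. The second call uses map2_dbl, where a second argument is available as .y.

```r
library(purrr)   # assumes the purrr package is installed

f_long    = map_dbl(1:5, function(x) x^2 + 1)
f_formula = map_dbl(1:5, ~ .x^2 + 1)        # .x (or .) is the first argument
identical(f_long, f_formula)

# map2_dbl() iterates over two inputs in parallel: .x is the first, .y the second
map2_dbl(1:3, 4:6, ~ .x * .y)
```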
Very often we want to extract only certain (named) values from a list. purrr provides a shortcut for this operation: supply a character or numeric value in place of the function to apply.
x = list(list(a=1L,b=2L,c=list(d=3L,e=4L)),
list(a=5L,b=6L,c=list(d=7L,e=8L)))
map_int(x, "a")
## [1] 1 5
map_dbl(x, c("c","e"))
## [1] 4 8
map_df(x, 3)
## # A tibble: 2 × 2
## d e
## <int> <int>
## 1 3 4
## 2 7 8
map_chr(x, c(3,1))
## [1] "3" "7"
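The shortcut behaves like an explicit extraction function, and a .default value can be supplied for elements that lack the requested entry. A sketch assuming purrr is installed; the list y is made up for the example.

```r
library(purrr)   # assumes the purrr package is installed

y = list(list(a = 1L, b = 2L), list(b = 3L))   # the second element has no "a"

# The string shortcut behaves like an explicit extraction function
identical(map(y, "b"), map(y, function(el) el[["b"]]))

# .default supplies a value when the requested element is missing
map_int(y, "a", .default = NA_integer_)
```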
Above materials are derived in part from the following sources:
Hadley Wickham - Adv-R Functionals
Hadley Wickham - R for Data Science
Neil Saunders - A brief introduction to “apply” in R
Jenny Bryan - Purrr Tutorial