The fundamental building blocks of data in R are vectors (collections of related values, objects, other data structures, etc.).
R has two fundamental vector classes:
Vectors (atomic vectors)
Lists (generic vectors)
R has six atomic vector types:
typeof | mode | storage.mode |
---|---|---|
logical | logical | logical |
double | numeric | double |
integer | numeric | integer |
character | character | character |
complex | complex | complex |
raw | raw | raw |
For now we'll mainly worry about the first type; we'll discuss the following three next time (the final two rarely come up).
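The table above can be reproduced directly in R. The following is a minimal sketch; the object names `examples` and `types` are arbitrary choices for illustration.

```r
# One example value for each of the six atomic vector types.
examples = list(
  logical   = TRUE,
  double    = 1.5,
  integer   = 2L,          # the L suffix requests an integer
  character = "a",
  complex   = 1+2i,
  raw       = as.raw(255)
)

# Apply each of the three functions to every example value.
types = data.frame(
  typeof       = vapply(examples, typeof, character(1)),
  mode         = vapply(examples, mode, character(1)),
  storage.mode = vapply(examples, storage.mode, character(1)),
  stringsAsFactors = FALSE
)
types
```

Note that double and integer vectors both report a mode of numeric, which is why typeof is usually the more informative of the three.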
Operator | Operation | Vectorized? |
---|---|---|
x \| y | or | Yes |
x & y | and | Yes |
!x | not | Yes |
x \|\| y | or | No |
x && y | and | No |
xor(x,y) | exclusive or | Yes |
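x = c(TRUE,FALSE,TRUE)
y = c(FALSE,TRUE,TRUE)
x | y
## [1] TRUE TRUE TRUE
x || y
## [1] TRUE
x & y
## [1] FALSE FALSE TRUE
x && y
## [1] FALSE
Note: || and && are meant for length-one operands. The || and && results above come from older versions of R, which silently used only the first element of each vector; since R 4.3.0, passing a vector of length greater than one to || or && is an error.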
x = c(TRUE,FALSE,TRUE)
y = c(TRUE)
z = c(FALSE,TRUE)
x | y
## [1] TRUE TRUE TRUE
y | z
## [1] TRUE TRUE
x | z
## Warning in x | z: longer object length is not
## a multiple of shorter object length
## [1] TRUE TRUE TRUE
x & y
## [1] TRUE FALSE TRUE
y & z
## [1] FALSE TRUE
x & z
## Warning in x & z: longer object length is not
## a multiple of shorter object length
## [1] FALSE FALSE FALSE
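Besides not being vectorized, && and || also short-circuit: the right-hand side is evaluated only when it is needed to decide the result. The sketch below illustrates this with a hypothetical helper, `noisy_true`, that counts how often it is called; all names here are made up for illustration.

```r
calls = 0
noisy_true = function() {   # hypothetical helper: records each evaluation
  calls <<- calls + 1
  TRUE
}

r1 = FALSE && noisy_true()  # short-circuits: the right side is skipped
calls_after_double = calls  # still 0

r2 = FALSE & noisy_true()   # element-wise: the right side is always evaluated
calls_after_single = calls  # now 1

# Short-circuiting makes && convenient for guarding a later test.
v = NULL
ok = !is.null(v) && v > 0   # FALSE; v > 0 is never evaluated
```

This is why && and || are the usual choice inside if conditions, while & and | are used for element-wise work on vectors.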
Operator | Comparison | Vectorized? |
---|---|---|
x < y | less than | Yes |
x > y | greater than | Yes |
x <= y | less than or equal to | Yes |
x >= y | greater than or equal to | Yes |
x != y | not equal to | Yes |
x == y | equal to | Yes |
x %in% y | is in (matches any element of y) | Yes (for x) |
x = c("A","B","C")
z = c("A")
x == z
## [1] TRUE FALSE FALSE
x != z
## [1] FALSE TRUE TRUE
x > z
## [1] FALSE TRUE TRUE
x %in% z
## [1] TRUE FALSE FALSE
z %in% x
## [1] TRUE
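The difference between == and %in% matters most when the right-hand side has more than one element. A small sketch, using a second vector `y` that is not part of the examples above:

```r
x = c("A","B","C")
y = c("B","A")

# == compares position by position and recycles the shorter vector,
# so this tests "A"=="B", "B"=="A", "C"=="B" (with a recycling warning).
eq = suppressWarnings(x == y)

# %in% asks, for each element of x, whether it appears anywhere in y.
mem = x %in% y

eq
mem
```

Here `eq` is all FALSE even though "A" and "B" both occur in `y`, while `mem` is TRUE for the first two elements of `x`.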
Conditional execution of code blocks is achieved via if statements. Note that if statements are not vectorized.
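x = c(3,4)
if (3 %in% x) print("Here!")
## [1] "Here!"
if (x >= 2) print("Now Here!")
## Warning in if (x >= 2) print("Now Here!"): the condition has length > 1 and
## only the first element will be used
## [1] "Now Here!"
Note: the warning above comes from older versions of R; since R 4.2.0, an if condition of length greater than one is an error.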
There are a couple of helper functions for collapsing a logical vector down to a single value: any and all.
x = c(3,4)
any(x >= 2)
## [1] TRUE
all(x >= 2)
## [1] TRUE
!any(x >= 2)
## [1] FALSE
if (any(x >= 2)) print("Now There!")
## [1] "Now There!"
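When the goal is an element-wise choice rather than a single decision, base R's ifelse is the vectorized counterpart of if. The sketch below also shows how any and all behave on an empty vector; the values and names are chosen only for illustration.

```r
x = c(-2, 0, 3, 4)

# ifelse() tests every element and picks the matching yes/no value.
labels = ifelse(x >= 2, "big", "small")
labels

# On an empty logical vector, any() is FALSE and all() is TRUE.
empty = logical(0)
any_empty = any(empty)
all_empty = all(empty)
```

The empty-vector behavior is worth remembering when a condition such as all(x >= 2) guards code that assumes x has at least one element.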
if, else if, and else
x = 3
if (x < 0) {
  print("Negative")
} else if (x > 0) {
  print("Positive")
} else {
  print("Zero")
}
## [1] "Positive"
x = 0
if (x < 0) {
  print("Negative")
} else if (x > 0) {
  print("Positive")
} else {
  print("Zero")
}
## [1] "Zero"
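In R, an if / else if / else chain is an expression that returns the value of the branch that ran, so it can serve as the body of a function. The following sketch wraps the chain above in a hypothetical helper, `sign_label`, and applies it to several values one at a time.

```r
# Hypothetical helper: classify a single number.
sign_label = function(v) {
  if (v < 0) {
    "Negative"
  } else if (v > 0) {
    "Positive"
  } else {
    "Zero"
  }
}

# if is not vectorized, so call the helper once per element.
res = vapply(c(-1, 0, 2), sign_label, character(1))
res
```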
for loops
Simplest and most common type of loop in R: given a vector, iterate through its elements and evaluate the code block for each.
for(x in 1:10) { cat(x^2,"") }
## 1 4 9 16 25 36 49 64 81 100
for(y in list(1:3, LETTERS[1:7], c(TRUE,FALSE))) { cat(length(y),"") }
## 3 7 2
while
Repeat until the given condition is not met (i.e. results in FALSE)
i = 1
res = rep(NA,10)
while (i <= 10) {
  res[i] = i^2
  i = i+1
}
res
## [1] 1 4 9 16 25 36 49 64 81 100
repeat
Repeat until break
i = 1
res = rep(NA,10)
repeat {
  res[i] = i^2
  i = i+1
  if (i > 10) break
}
res
## [1] 1 4 9 16 25 36 49 64 81 100
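The while and repeat examples above run a fixed number of times, so a for loop would work equally well. while tends to be the more natural choice when the number of iterations is not known in advance. A minimal sketch, where the threshold of 1000 and the variable names are arbitrary:

```r
value = 1
steps = 0

# Keep doubling until the value reaches the threshold.
while (value < 1000) {
  value = value * 2
  steps = steps + 1
}

value
steps
```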
break and next
These are special actions that only work inside of a loop:
- break: ends the current (innermost) loop
- next: ends the current iteration
for(i in 1:10) {
  if (i %% 2 == 0) break
  cat(i,"")
}
## 1
for(i in 1:10) {
  if (i %% 2 == 0) next
  cat(i,"")
}
## 1 3 5 7 9
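To see what "innermost" means for break, consider two nested loops. In this sketch (counter names are made up for illustration) the break leaves only the loop over `j`; the loop over `i` keeps going.

```r
outer_runs = 0
inner_runs = 0

for (i in 1:3) {
  outer_runs = outer_runs + 1
  for (j in 1:3) {
    if (j == 2) break        # exits only the inner loop over j
    inner_runs = inner_runs + 1
  }
}

outer_runs
inner_runs
```

The outer loop completes all three iterations, while the inner loop body runs once per outer iteration before hitting break.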
It is almost always better to create an object to store your results first, rather than growing the object as you go.
# Good
res = rep(NA,10)
for(x in 1:10) {
  res[x] = x^2
}
res
## [1] 1 4 9 16 25 36 49 64 81 100
# Bad
res = c()
for (x in 1:10) {
  res = c(res,x^2)
}
res
## [1] 1 4 9 16 25 36 49 64 81 100
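Both versions give the same answer; the difference is in how much work R does. Each call to c() builds a new, longer vector and copies the old values into it, so the cost of growing adds up as the vector gets large. The sketch below repeats the comparison at a larger size (the size `n` and the names are arbitrary) and checks that the results agree.

```r
n = 1000

# Preallocate: one allocation up front, then fill in place.
pre = numeric(n)
for (i in seq_len(n)) {
  pre[i] = i^2
}

# Grow: c() creates a new vector on every iteration.
grown = c()
for (i in seq_len(n)) {
  grown = c(grown, i^2)
}

same = identical(pre, grown)
same
```

For a computation this simple, the vectorized expression seq_len(n)^2 gives the same values without any loop at all.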
for loops
Often we want to loop over the indexes of an object rather than the elements themselves. There are several useful functions to help you do this: :, seq, seq_along, seq_len, etc.
l = list(1:3, LETTERS[1:7], c(TRUE,FALSE))
res = rep(NA, length(l))
for(x in seq_along(l)) {
  res[x] = length(l[[x]])
}
res
## [1] 3 7 2
1:length(l)
## [1] 1 2 3
seq_along(l)
## [1] 1 2 3
seq_len(length(l))
## [1] 1 2 3
Best Practice:
good = function(x) { for(i in seq_along(x)) cat(1,"") }
Antipattern:
bad = function(x) { for(i in 1:length(x)) cat(1,"") }
good(c(1,2,3))
## 1 1 1
good(c())
bad(c(1,2,3))
## 1 1 1
bad(c())
## 1 1
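The two extra iterations in bad(c()) come from how : behaves when the upper bound is zero. A short sketch of the index vectors involved (variable names are illustrative):

```r
x = c()                          # NULL, which has length 0

bad_index  = 1:length(x)         # 1:0 counts down, giving 1 0
good_index = seq_along(x)        # integer(0), so a loop body never runs
also_good  = seq_len(length(x))  # integer(0) as well

bad_index
length(good_index)
```

Because 1:0 is a valid sequence of length two, the loop in bad runs twice on empty input, whereas seq_along and seq_len produce an empty sequence.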
[]) or *apply or purrr)
"Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%." (Donald Knuth)
Below is the list of primes between 2 and 100:
2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97
If you were given the vector x = c(3, 4, 12, 19, 23, 48, 50, 61, 63, 78), write out the R code necessary to print only the values of x that are not prime (without using subsetting or the %in% operator).
Your code should use nested loops to iterate through the vector of primes and x.
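One possible approach is sketched below; it is not the only solution. It wraps the nested loops in a hypothetical function, `print_non_primes`, and uses a flag plus break instead of subsetting or %in%.

```r
primes = c(2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41,
           43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97)
x = c(3, 4, 12, 19, 23, 48, 50, 61, 63, 78)

# Hypothetical helper: print the values that match no prime.
print_non_primes = function(values, primes) {
  for (value in values) {
    is_prime = FALSE
    for (p in primes) {
      if (value == p) {
        is_prime = TRUE
        break              # no need to check the remaining primes
      }
    }
    if (!is_prime) cat(value, "")
  }
}

print_non_primes(x, primes)
```

Iterating over the elements directly (for (value in values)) avoids any use of [ ], and break stops the inner loop as soon as a match is found.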
stop and stopifnot
Often we want to validate user input or function arguments; if these tests do not pass, we usually want to report the error and stop execution.
ok = FALSE
if (!ok) stop("Things are not ok.")
## Error in eval(expr, envir, enclos): Things are not ok.
stopifnot(ok)
## Error: ok is not TRUE
Note - an error (like the one generated by stop) will prevent an RMarkdown document from compiling unless error=TRUE is set for that code block.
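As a sketch of how these fit together in practice, the hypothetical function `safe_sqrt` below validates its argument with stopifnot and stop, and base R's tryCatch is used to capture the error message instead of halting. The function name and messages are made up for illustration.

```r
# Hypothetical function that validates its argument before using it.
safe_sqrt = function(x) {
  stopifnot(is.numeric(x), length(x) == 1)    # generic checks
  if (x < 0) stop("x must be non-negative.")  # custom message
  sqrt(x)
}

result = safe_sqrt(16)
result

# tryCatch() lets us inspect the error rather than stopping the script.
msg = tryCatch(safe_sqrt(-1), error = function(e) conditionMessage(e))
msg
```

stopifnot is compact but its messages are generic; stop takes a little more code and lets you say exactly what went wrong.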
do_stuff_v1 = function(x) {
  if (condition_one) {
    ##
    ## Do stuff
    ##
  } else if (condition_two) {
    ##
    ## Do other stuff
    ##
  } else if (condition_error) {
    stop("Condition error occurred")
  }
}
do_stuff_v2 = function(x) {
  if (condition_error) {
    stop("Condition error occurred")
  }
  if (condition_one) {
    ##
    ## Do stuff
    ##
  } else if (condition_two) {
    ##
    ## Do other stuff
    ##
  }
}
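The two templates above use placeholder conditions and will not run as written. The following is a concrete, runnable sketch of the second layout, where the error check comes first; the function `describe_length` and its messages are hypothetical.

```r
# Hypothetical example of the "check for errors first" layout.
describe_length = function(x) {
  # Fail early, before any real work is done.
  if (!is.character(x)) {
    stop("x must be a character vector")
  }

  if (length(x) == 0) {
    "empty"
  } else if (length(x) == 1) {
    "single string"
  } else {
    "several strings"
  }
}

describe_length(c("a", "b"))
```

Putting the error check at the top keeps the validation separate from the main logic, so the branches that follow can assume valid input.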
Above materials are derived in part from the following sources: