The fundamental building blocks of data in R are vectors (collections of related values, objects, other data structures, etc.).
R has two fundamental vector classes:
- Vectors (atomic vectors)
- Lists (generic vectors)
R has six atomic vector types:
typeof | mode | storage.mode |
---|---|---|
logical | logical | logical |
double | numeric | double |
integer | numeric | integer |
character | character | character |
complex | complex | complex |
raw | raw | raw |
For now we’ll mainly worry about the first type (logical); we’ll discuss the following three next time (the final two rarely come up).
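The table above can be checked directly. The following is a minimal sketch, using one arbitrarily chosen example value per type, that collects what `typeof`, `mode`, and `storage.mode` report for each. The names `vals`, `types`, `modes`, and `storage` are illustrative.

```r
# One example value per atomic type (the values themselves are arbitrary)
vals = list(TRUE, 1.5, 1L, "a", 1+2i, as.raw(1))

# Ask each of the three functions about every value
types   = sapply(vals, typeof)
modes   = sapply(vals, mode)
storage = sapply(vals, storage.mode)

# Side-by-side view, mirroring the table
data.frame(typeof = types, mode = modes, storage.mode = storage)
```

Note that `double` and `integer` differ in `typeof` and `storage.mode` but share the mode `numeric`.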
Operator | Operation | Vectorized? |
---|---|---|
x \| y | or | Yes |
x & y | and | Yes |
!x | not | Yes |
x \|\| y | or | No |
x && y | and | No |
xor(x,y) | exclusive or | Yes |
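x = c(TRUE,FALSE,TRUE)
y = c(FALSE,TRUE,TRUE)
x | y
## [1] TRUE TRUE TRUE
# || and && work on single values. The results shown for them come from older
# R releases, which used only the first element of each vector; R >= 4.3.0
# raises an error when either operand has length > 1.
x || y
## [1] TRUE
x & y
## [1] FALSE FALSE TRUE
x && y
## [1] FALSE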
x = c(TRUE,FALSE,TRUE)
y = c(TRUE)
z = c(FALSE,TRUE)
x | y
## [1] TRUE TRUE TRUE
y | z
## [1] TRUE TRUE
x | z
## Warning in x | z: longer object length is not
## a multiple of shorter object length
## [1] TRUE TRUE TRUE
x & y
## [1] TRUE FALSE TRUE
y & z
## [1] FALSE TRUE
x & z
## Warning in x & z: longer object length is not
## a multiple of shorter object length
## [1] FALSE FALSE FALSE
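The examples above rely on R's recycling rule: when the operands differ in length, the shorter one is repeated until it matches the longer one, with a warning if the lengths do not divide evenly. As a rough sketch, the recycling can be made explicit with `rep_len`; the names `z_recycled` and `manual` are only illustrative.

```r
x = c(TRUE, FALSE, TRUE)
z = c(FALSE, TRUE)

# Explicitly stretch z to the length of x: FALSE TRUE FALSE
z_recycled = rep_len(z, length(x))

# Same values as x | z, but without the recycling warning
manual = x | z_recycled
manual
## [1] TRUE TRUE TRUE
```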
Operator | Comparison | Vectorized? |
---|---|---|
x < y | less than | Yes |
x > y | greater than | Yes |
x <= y | less than or equal to | Yes |
x >= y | greater than or equal to | Yes |
x != y | not equal to | Yes |
x == y | equal to | Yes |
x %in% y | is contained in | Yes (for x) |
x = c("A","B","C")
z = c("A")
x == z
## [1] TRUE FALSE FALSE
x != z
## [1] FALSE TRUE TRUE
x > z
## [1] FALSE TRUE TRUE
x %in% z
## [1] TRUE FALSE FALSE
z %in% x
## [1] TRUE
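To make the difference between `==` and `%in%` concrete: `==` compares elements position by position (recycling the shorter vector), while `%in%` asks whether each element of the left-hand side appears anywhere in the right-hand side. The sketch below uses a hypothetical lookup vector `keep` and shows an equivalent formulation with `match`.

```r
x = c("A", "B", "C")
keep = c("C", "A")   # illustrative lookup values

# Membership test: one result per element of x, order of keep is irrelevant
in_keep = x %in% keep
in_keep
## [1]  TRUE FALSE  TRUE

# Equivalent formulation via match(): position in keep, or 0 if absent
via_match = match(x, keep, nomatch = 0) > 0
```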
Conditional execution of code blocks is achieved via `if` statements. Note that `if` statements are not vectorized.
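x = c(3,4)
if (3 %in% x)
  print("Here!")
## [1] "Here!"
# The condition below has length 2. Older R versions warn and use only the
# first element, as shown; R >= 4.2.0 raises an error instead.
if (x >= 2)
  print("Now Here!")
## Warning in if (x >= 2) print("Now Here!"): the condition has length > 1 and
## only the first element will be used
## [1] "Now Here!"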
There are a couple of helper functions for collapsing a logical vector down to a single value: `any` and `all`.
x = c(3,4)
any(x >= 2)
## [1] TRUE
all(x >= 2)
## [1] TRUE
!any(x >= 2)
## [1] FALSE
if (any(x >= 2))
print("Now There!")
## [1] "Now There!"
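One edge case worth knowing, sketched below: when the logical vector is empty, `any` returns `FALSE` and `all` returns `TRUE`, so a check like `all(x >= 2)` passes trivially for empty input. The variable names `empty`, `any_empty`, and `all_empty` are illustrative.

```r
empty = numeric(0)

any_empty = any(empty >= 2)   # FALSE: no element satisfies the condition
all_empty = all(empty >= 2)   # TRUE: no element violates the condition

# Guard against empty input explicitly when it matters
if (length(empty) > 0 && all(empty >= 2)) {
  print("All values are at least 2")
} else {
  print("Empty input or some value below 2")
}
## [1] "Empty input or some value below 2"
```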
`if`, `else if`, and `else`
x = 3
if (x < 0) {
  print("Negative")
} else if (x > 0) {
  print("Positive")
} else {
  print("Zero")
}
## [1] "Positive"
x = 0
if (x < 0) {
  print("Negative")
} else if (x > 0) {
  print("Positive")
} else {
  print("Zero")
}
## [1] "Zero"
`for` loops
The simplest and most common type of loop in R: given a vector, iterate through its elements and evaluate the code block for each.
for(x in 1:10)
{
  cat(x^2,"")
}
## 1 4 9 16 25 36 49 64 81 100
for(y in list(1:3, LETTERS[1:7], c(TRUE,FALSE)))
{
  cat(length(y),"")
}
## 3 7 2
`while`
Repeat while the given condition holds, stopping once it is no longer met (i.e. evaluates to `FALSE`).
i = 1
res = rep(NA,10)
while (i <= 10)
{
  res[i] = i^2
  i = i+1
}
res
## [1] 1 4 9 16 25 36 49 64 81 100
`repeat`
Repeat until `break` is called.
i = 1
res = rep(NA,10)
repeat
{
  res[i] = i^2
  i = i+1
  if (i > 10)
    break
}
res
## [1] 1 4 9 16 25 36 49 64 81 100
`break` and `next`
These are special actions that only work inside of a loop:
- `break`: ends the current (inner-most) loop
- `next`: ends the current iteration
for(i in 1:10)
{
  if (i %% 2 == 0)
    break
  cat(i,"")
}
## 1
for(i in 1:10)
{
  if (i %% 2 == 0)
    next
  cat(i,"")
}
## 1 3 5 7 9
It is almost always better to create an object to store your results first, rather than growing the object as you go.
# Good
res = rep(NA,10)
for(x in 1:10)
{
  res[x] = x^2
}
res
## [1] 1 4 9 16 25 36 49 64 81 100
# Bad
res = c()
for (x in 1:10)
{
  res = c(res,x^2)
}
res
## [1] 1 4 9 16 25 36 49 64 81 100
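As a small variation on the "Good" pattern, the result can be preallocated with the type it will eventually hold, for example `numeric(n)` for doubles or `vector("list", n)` for a list. This is only a sketch; `n`, `squares`, and `pieces` are illustrative names.

```r
n = 10

# numeric(n) creates a double vector of n zeros
squares = numeric(n)
for (i in seq_len(n)) {
  squares[i] = i^2
}

# vector("list", n) creates a list of n NULL elements
pieces = vector("list", n)
for (i in seq_len(n)) {
  pieces[[i]] = seq_len(i)
}
```

Preallocating with the final type avoids the implicit coercion that happens when numbers are assigned into a vector of `NA` values, which starts out as logical.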
`for` loops
Often we want to loop across the indexes of an object rather than the elements themselves. Several functions help with this: `:`, `seq`, `seq_along`, `seq_len`, etc.
l = list(1:3, LETTERS[1:7], c(TRUE,FALSE))
res = rep(NA, length(l))
for(x in seq_along(l))
{
  res[x] = length(l[[x]])
}
res
## [1] 3 7 2
1:length(l)
## [1] 1 2 3
seq_along(l)
## [1] 1 2 3
seq_len(length(l))
## [1] 1 2 3
Best Practice:
good = function(x)
{
  for(i in seq_along(x))
    cat(1,"")
}
Antipattern:
bad = function(x)
{
  for(i in 1:length(x))
    cat(1,"")
}
good(c(1,2,3))
## 1 1 1
good(c())
bad(c(1,2,3))
## 1 1 1
bad(c())
## 1 1
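The reason `bad(c())` runs its loop body twice is that `c()` has length zero, so `1:length(x)` becomes `1:0`, which counts down from 1 to 0 instead of being empty. A short sketch of the difference (the names `down` and `along` are illustrative):

```r
x = c()

length(x)
## [1] 0

down = 1:length(x)    # 1:0 counts downwards
down
## [1] 1 0

along = seq_along(x)  # empty integer vector, so a loop body never runs
along
## integer(0)
```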
Loops can often be replaced by vectorized code (e.g. subsetting with `[]`) or by functional tools (`*apply` or `purrr`).
Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%. (Donald Knuth)
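As a rough illustration of replacing loops with vectorized code or the `*apply` family, the sketch below redoes two of the earlier loops (squaring 1 to 10 and measuring the length of each list element) using base R only. `purrr` is left out so the example has no package dependency; the names `squares`, `lens`, and `lens_checked` are illustrative.

```r
# Vectorized arithmetic instead of a for loop
squares = (1:10)^2

# sapply applies a function to each list element and simplifies the result
l = list(1:3, LETTERS[1:7], c(TRUE, FALSE))
lens = sapply(l, length)
lens
## [1] 3 7 2

# vapply does the same but checks that each result is a single integer
lens_checked = vapply(l, length, integer(1))
```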
Below is the list of primes between 2 and 100:
2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41,
43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97
If you were given the vector `x = c(3, 4, 12, 19, 23, 48, 50, 61, 63, 78)`, write out the R code necessary to print only the values of `x` that are not prime (without using subsetting or the `%in%` operator).
Your code should use nested loops to iterate through the vector of primes and `x`.
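One possible approach, offered only as a sketch rather than the intended solution: loop over `x`, and for each value loop over the primes, recording in a flag whether a match was found. The names `is_prime` and `non_primes` are illustrative; `non_primes` is collected purely so the result can be checked afterwards.

```r
primes = c(2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41,
           43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97)
x = c(3, 4, 12, 19, 23, 48, 50, 61, 63, 78)

non_primes = c()  # grown with c() only because subsetting is off limits here
for (v in x) {
  is_prime = FALSE
  for (p in primes) {
    if (v == p) {
      is_prime = TRUE
      break        # no need to check the remaining primes
    }
  }
  if (!is_prime) {
    print(v)
    non_primes = c(non_primes, v)
  }
}
```

Iterating directly over the elements (`for (v in x)`) rather than over indexes is what keeps the loop free of subsetting.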
`stop` and `stopifnot`
Often we want to validate user input or function arguments; if these tests do not pass, we usually want to report the error and stop execution.
ok = FALSE
if (!ok)
  stop("Things are not ok.")
## Error in eval(expr, envir, enclos): Things are not ok.
stopifnot(ok)
## Error: ok is not TRUE
Note: an error (like the one generated by `stop`) will prevent an RMarkdown document from compiling unless `error=TRUE` is set for that code block.
do_stuff_v1 = function(x) {
  if (condition_one) {
    ##
    ## Do stuff
    ##
  } else if (condition_two) {
    ##
    ## Do other stuff
    ##
  } else if (condition_error) {
    # The error check comes last, after the other branches
    stop("Condition error occurred")
  }
}
do_stuff_v2 = function(x) {
  # Check for the error condition first so the function fails early
  if (condition_error) {
    stop("Condition error occurred")
  }
  if (condition_one) {
    ##
    ## Do stuff
    ##
  } else if (condition_two) {
    ##
    ## Do other stuff
    ##
  }
}
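To make the second pattern concrete, here is a minimal runnable sketch of validating arguments before doing any work. The function `safe_sqrt` and its checks are hypothetical and not part of the original material.

```r
safe_sqrt = function(x) {
  # Validate first and fail early
  if (!is.numeric(x)) {
    stop("x must be numeric")
  }
  stopifnot(all(x >= 0))

  # Main work only runs on valid input
  sqrt(x)
}

ok_result = safe_sqrt(c(4, 9))

# tryCatch captures the error so the script keeps running
msg = tryCatch(safe_sqrt("a"), error = function(e) conditionMessage(e))
msg
## [1] "x must be numeric"
```

Putting the checks at the top keeps the main logic free of error handling and makes the function's requirements visible at a glance.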
Above materials are derived in part from the following sources: