---
title: "Logic in R"
author: "Colin Rundel"
date: "2017-01-09"
output:
xaringan::moon_reader:
css: "slides.css"
lib_dir: libs
nature:
highlightStyle: github
highlightLines: true
countIncrementalSlides: false
---
exclude: true
```{r, message=FALSE, warning=FALSE, include=FALSE}
options(
htmltools.dir.version = FALSE, # for blogdown
width=80
)
library(emo)
htmltools::tagList(rmarkdown::html_dependency_font_awesome())
```
---
class: middle
count: false
# (Almost) Everything is a Vector
---
## Types of vectors
The fundamental building block of data in R are vectors (collections of related values, objects, other data structures, etc).
R has two fundamental vector classes:
* Vectors (atomic vectors)
- collections of values that are all of the *same* type (e.g. all logical values, all numbers, or all character strings).
* Lists (generic vectors)
- collections of *any* type of R object, even other lists (meaning they can have a hierarchical/tree-like structure).
---
## Atomic Vectors
R has six atomic vector types:
| typeof | mode | storage.mode
|:-----------|:----------|:-------------
| logical | logical | logical
| double | numeric | double
| integer | numeric | integer
| character | character | character
| complex | complex | complex
| raw | raw | raw
For now we'll mainly worry about the first type, we'll discuss the following three next time (last two almost never come up).
---
count: false
# Conditionals
---
## Logical (boolean) operations
| Operator | Operation | Vectorized?
|:-----------|:--------------|:-------------
| x | y
| or | Yes
| `x & y` | and | Yes
| `!x` | not | Yes
| x || y
| or | No
| `x && y` | and | No
|`xor(x,y)` | exclusive or | Yes
---
class: split-50
## Vectorized?
```{r}
x = c(TRUE,FALSE,TRUE)
y = c(FALSE,TRUE,TRUE)
```
.column[
```{r}
x | y
x || y
```
]
.column[
```{r}
x & y
x && y
```
]
---
class: split-50
## Length coercion
```{r}
x = c(TRUE,FALSE,TRUE)
y = c(TRUE)
z = c(FALSE,TRUE)
```
.column[
```{r}
x | y
y | z
```
]
.column[
```{r}
x & y
y & z
```
]
```{r}
x | z
```
---
## Comparisons
Operator | Comparison | Vectorized?
:-----------|:---------------------------|:-----------------
`x < y` | less than | Yes
`x > y` | greater than | Yes
`x <= y` | less than or equal to | Yes
`x >= y` | greater than or equal to | Yes
`x != y` | not equal to | Yes
`x == y` | equal to | Yes
`x %in% y` | contains | Yes (for `x`)
---
class: split-50
## Comparisons
```{r}
x = c("A","B","C")
z = c("A")
```
.column[
```{r}
x == z
x != z
x > z
```
]
.column[
```{r}
x %in% z
z %in% x
```
]
---
## Conditional Control Flow
Conditional execution of code blocks is achieved via `if` statements.
*Note that `if` statements are **not** vectorized.*
```{r}
x = c(3,1)
if (3 %in% x)
"Here!"
if (x >= 2)
"Now Here!"
```
---
class: split-50
## Collapsing logical vectors
There are a couple of helper functions for collapsing a logical vector down to a single value: `any`, `all`
```{r}
x = c(3,4)
```
.column[
```{r}
any(x >= 2)
all(x >= 2)
```
]
.column[
```{r}
!any(x >= 2)
if (any(x >= 2))
print("Now There!")
```
]
---
## Nesting Conditionals
```{r}
x = 3
if (x < 0) {
"Negative"
} else if (x > 0) {
"Positive"
} else {
"Zero"
}
```
```{r}
x = 0
if (x < 0) {
"Negative"
} else if (x > 0) {
"Positive"
} else {
"Zero"
}
```
---
class: middle
count: false
# Error Checking
---
## `stop` and `stopifnot`
Often we want to validate user input or function arguments - if our assumptions are not met then we often want to report the error and stop execution.
```{r error=TRUE}
ok = FALSE
if (!ok)
stop("Things are not ok.")
stopifnot(ok)
```
*Note - an error (like the one generated by `stop`) will prevent an RMarkdown document from compiling unless `error=TRUE` is set for that code block.*
---
## Style choices
```{r eval=FALSE}
# Do stuff
if (condition_one) {
##
## Do stuff
##
} else if (condition_two) {
##
## Do other stuff
##
} else if (condition_error) {
stop("Condition error occured")
}
```
```{r eval=FALSE}
# Do stuff better
if (condition_error) {
stop("Condition error occured")
}
if (condition_one) {
##
## Do stuff
##
} else if (condition_two) {
##
## Do other stuff
##
}
```
---
## Exercise 1
Write a set of conditional(s) that satisfies the following requirements,
* If `x` is greater than 3 and `y` is less than or equal to 3 then print "Hello world!"
* Otherwise if `x` is greater than 3 print "!dlrow olleH"
* If `x` is less than or equal to 3 then print "Something else ..."
* Stop execution if x is odd and y is even and report an error, don't print any of the text strings above.
Test out your code by trying various values of `x` and `y`.
---
class: middle
count: false
# Loops
---
## `for` loops
Simplest, and most common type of loop in R - given a vector iterate through the elements and evaluate the code block for each.
```{r}
for(x in 1:10)
{
cat(x^2,"")
}
```
```{r}
for(y in list(1:3, LETTERS[1:7], c(TRUE,FALSE)))
{
cat(length(y),"")
}
```
---
## `while` loops
Repeat until the given condition is **not** met (i.e. evaluates to `FALSE`)
```{r}
i = 1
res = rep(NA,10)
while (i <= 10) {
res[i] = i^2
i = i+1
}
res
```
---
## `repeat` loops
Repeat until `break`
```{r}
i = 1
res = rep(NA,10)
repeat {
res[i] = i^2
i = i+1
if (i > 10)
break
}
res
```
---
class: split-50
## Special keywords - `break` and `next`
These are special actions that only work *inside* of a loop
* `break` - ends the current *loop* (inner-most)
* `next` - ends the current *iteration*
.column[
```{r}
for(i in 1:10) {
if (i %% 2 == 0)
break
cat(i,"")
}
```
]
.column[
```{r}
for(i in 1:10) {
if (i %% 2 == 0)
next
cat(i,"")
}
```
]
---
class: split-50
## Some helper functions
Often we want to use a loop across the indexes of an object and not the elements themselves. There are several useful functions to help you do this: `:`, `length`, `seq`, `seq_along`, `seq_len`, etc.
```{r}
4:7
length(4:7)
seq(4,7,by=1)
seq_along(4:7)
seq_len(length(4:7))
```
---
## Exercise 2
Below is the list of primes between 2 and 100:
```
2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41,
43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97
```
If you were given the vector `x = c(3, 4, 12, 19, 23, 48, 50, 61, 63, 78)`, write out the R code necessary to print only the values of `x` that are *not* prime (without using subsetting or the `%in%` operator).
Your code should use *nested* loops to iterate through the vector of primes and `x`.
---
class: middle
count: false
# Functions
---
## When to use functions
The goal of a function should be to encapsulate a *small* *reusable* piece of code.
* Name should make it clear what the function does (think in terms of simple verbs).
* Functionality should be simple enough to be quickly understood.
* The smaller and more modular the code the easier it will be to reuse elsewhere.
* Better to change code in one location than code everywhere.
---
## Function Parts
The two parts of a function are the arguments (`formals`) and the code (`body`).
```{r}
gcd = function(long1, lat1, long2, lat2) {
R = 6371 # Earth mean radius in km
# distance in km
acos(sin(lat1)*sin(lat2) + cos(lat1)*cos(lat2) * cos(long2-long1)) * R
}
formals(gcd)
body(gcd)
```
---
## Return values
There are two approaches to returning values: explicit and implicit return values.
*Explicit* - includes one or more `return` statements
```{r}
f = function(x) {
return(x*x)
}
```
*Implicit* - value of the last statement is returned.
```{r}
f = function(x) {
x*x
}
```
---
## Returning multiple values
If we want a function to return more than one value we can group things using either vectors or lists.
```{r}
f = function(x) {
c(x, x^2, x^3)
}
f(2)
f(2:3)
```
---
class: split-50
## Argument names
When defining a function we are also implicitly defining names for the arguments, when calling the function we can use these names to pass arguments in a different order.
```{r}
f = function(x,y,z) {
paste0("x=",x," y=",y," z=",z)
}
```
.column[
```{r,error=TRUE}
f(1,2,3)
f(z=1,x=2,y=3)
```
]
.column[
```{r,error=TRUE}
f(y=2,1,3)
f(y=2,1,x=3)
```
]
```{r,error=TRUE}
f(1,2,3,m=1)
```
---
## Argument defaults
It is also possible to give function arguments default values so that they don't need to be provided every time the function is called.
```{r error=TRUE}
f = function(x,y=1,z=1) {
paste0("x=",x," y=",y," z=",z)
}
```
```{r error=TRUE}
f()
f(x=3)
f(y=2,2)
```
---
## Scope
R has generous scoping rules, if it can't find a variable in the functions body, it will look for it in the next higher scope, and so on.
```{r}
y = 1
f = function(x) {
x+y
}
f(3)
```
```{r}
g = function(x) {
y=2
x+y
}
g(3)
```
---
##
Additionally, variables defined within a scope only persist for the duration of that scope, and do not overwrite variables at a higher scopes (unless you use the global assignment operator `<<-`, *which you shouldn't*)
```{r}
x = 1
y = 1
z = 1
f = function()
{
y = 2
g = function()
{
z = 3
return(x + y + z)
}
return(g())
}
f()
c(x,y,z)
```
---
## Exercise 3
What is the output of the following code? Explain why.
```{r eval=FALSE}
z = 1
f = function(x,y,z)
{
z = x+y
g = function(m=x,n=y)
{
m/z + n/z
}
z * g()
}
f(1,2,3)
```
---
## Lazy evaluation
Arguments to R functions are lazily evaluated - meaning they are not evaluated until they are used
```{r, error=TRUE}
f = function(x)
{
cat("Hello world!\n")
x
}
f(stop())
```
---
## Everything is a function
```{r}
`+`
typeof(`+`)
x = 4:1
`+`(x,2)
```
---
## Getting Help
Prefixing any function name with a `?` will open the related help file for that function.
```{r, eval=FALSE}
?`+`
?sum
```
For functions not in the base package, you can generally see their implementation by entering the function name without parentheses (or using the `body` function).
```{r}
lm
```
---
## Less Helpful Examples
```{r}
list
`[`
sum
`+`
```
---
count: false
# Acknowledgments
Above materials are derived in part from the following sources:
* Hadley Wickham - [Advanced R](http://adv-r.had.co.nz/)
* [R Language Definition](http://stat.ethz.ch/R-manual/R-devel/doc/manual/R-lang.html)