Logic in R

class: center, middle, inverse, title-slide

# Logic in R
### Colin Rundel
### 2017-01-09

---

exclude: true

---
class: middle
count: false

# (Almost) Everything is a Vector

---

## Types of vectors

The fundamental building block of data in R are vectors (collections of related values, objects, other data structures, etc).

<br/>

R has two fundamental vector classes:

* Vectors (atomic vectors)

- collections of values that are all of the *same* type (e.g. all logical values, all numbers, or all character strings).

* Lists (generic vectors)
  
    - collections of *any* type of R object, even other lists (meaning they can have a hierarchical/tree-like structure).

---

## Atomic Vectors

R has six atomic vector types:

|  typeof    |   mode    | storage.mode 
|:-----------|:----------|:-------------
| logical    | logical   | logical      
| double     | numeric   | double       
| integer    | numeric   | integer      
| character  | character | character    
| complex    | complex   | complex      
| raw        | raw       | raw

<br/>

For now we'll mainly worry about the first type, we'll discuss the following three next time (last two almost never come up).

---
count: false

# Conditionals

---

## Logical (boolean) operations

|  Operator  |  Operation    |  Vectorized? 
|:-----------|:--------------|:-------------
| <code>x &#124; y</code>    |  or           |   Yes        
| `x & y`    |  and          |   Yes        
| `!x`       |  not          |   Yes        
| <code>x &#124;&#124; y</code> |  or           |   No         
| `x && y`   |  and          |   No         
|`xor(x,y)`  |  exclusive or |   Yes

---
class: split-50

## Vectorized?

```r
x = c(TRUE,FALSE,TRUE)
y = c(FALSE,TRUE,TRUE)
```

.column[

```r
x | y
```

```
## [1] TRUE TRUE TRUE
```

```r
x || y
```

```
## [1] TRUE
```
]

.column[

```r
x & y
```

```
## [1] FALSE FALSE  TRUE
```

```r
x && y
```

```
## [1] FALSE
```
]

---
class: split-50

## Length coercion

```r
x = c(TRUE,FALSE,TRUE)
y = c(TRUE)
z = c(FALSE,TRUE)
```

.column[

```r
x | y
```

```
## [1] TRUE TRUE TRUE
```

```r
y | z
```

```
## [1] TRUE TRUE
```
]

.column[

```r
x & y
```

```
## [1]  TRUE FALSE  TRUE
```

```r
y & z
```

```
## [1] FALSE  TRUE
```
]

```r
x | z
```

```
## Warning in x | z: longer object length is not a multiple of shorter object
## length
```

```
## [1] TRUE TRUE TRUE
```

---

## Comparisons

Operator  |  Comparison                |  Vectorized?
:-----------|:---------------------------|:-----------------
 `x < y`    |  less than                 |  Yes
 `x > y`    |  greater than              |  Yes
 `x <= y`   |  less than or equal to     |  Yes
 `x >= y`   |  greater than or equal to  |  Yes
 `x != y`   |  not equal to              |  Yes
 `x == y`   |  equal to                  |  Yes
 `x %in% y` |  contains                  |  Yes (for `x`)

---
class: split-50

## Comparisons

```r
x = c("A","B","C")
z = c("A")
```

.column[

```r
x == z
```

```
## [1]  TRUE FALSE FALSE
```

```r
x != z
```

```
## [1] FALSE  TRUE  TRUE
```

```r
x > z
```

```
## [1] FALSE  TRUE  TRUE
```
]

.column[

```r
x %in% z
```

```
## [1]  TRUE FALSE FALSE
```

```r
z %in% x
```

```
## [1] TRUE
```
]

---

## Conditional Control Flow

Conditional execution of code blocks is achieved via `if` statements.

*Note that `if` statements are **not** vectorized.*

```r
x = c(3,1)

if (3 %in% x)
  "Here!"
```

```
## [1] "Here!"
```

```r
if (x >= 2)
  "Now Here!"
```

```
## Warning in if (x >= 2) "Now Here!": the condition has length > 1 and only the
## first element will be used
```

```
## [1] "Now Here!"
```

---
class: split-50

## Collapsing logical vectors

There are a couple of helper functions for collapsing a logical vector down to a single value: `any`, `all`

```r
x = c(3,4)
```

.column[

```r
any(x >= 2)
```

```
## [1] TRUE
```

```r
all(x >= 2)
```

```
## [1] TRUE
```
]

.column[

```r
!any(x >= 2)
```

```
## [1] FALSE
```

```r
if (any(x >= 2))
    print("Now There!")
```

```
## [1] "Now There!"
```
]

---

## Nesting Conditionals

```r
x = 3
if (x < 0) {
  "Negative"
} else if (x > 0) {
  "Positive"
} else {
  "Zero"
}
```

```
## [1] "Positive"
```

```r
x = 0
if (x < 0) {
  "Negative"
} else if (x > 0) {
  "Positive"
} else {
  "Zero"
}
```

```
## [1] "Zero"
```

---
class: middle
count: false

# Error Checking

---

## `stop` and `stopifnot`

Often we want to validate user input or function arguments - if our assumptions are not met then we often want to report the error and stop execution.

```r
ok = FALSE
if (!ok)
  stop("Things are not ok.")
```

```
## Error in eval(expr, envir, enclos): Things are not ok.
```

```r
stopifnot(ok)
```

```
## Error: ok is not TRUE
```

*Note - an error (like the one generated by `stop`) will prevent an RMarkdown document from compiling unless `error=TRUE` is set for that code block.*

---

## Style choices

```r
# Do stuff
if (condition_one) {
  ##
  ## Do stuff
  ##
} else if (condition_two) {
  ##
  ## Do other stuff
  ##
} else if (condition_error) {
  stop("Condition error occured")
}
```

```r
# Do stuff better
if (condition_error) {
  stop("Condition error occured")
}

if (condition_one) {
  ##
  ## Do stuff
  ##
} else if (condition_two) {
  ##
  ## Do other stuff
  ##
}
```

---

## Exercise 1

Write a set of conditional(s) that satisfies the following requirements,

* If `x` is greater than 3 and `y` is less than or equal to 3 then print "Hello world!"

* Otherwise if `x` is greater than 3 print "!dlrow olleH"

* If `x` is less than or equal to 3 then print "Something else ..."

* Stop execution if x is odd and y is even and report an error, don't print any of the text strings above.

Test out your code by trying various values of `x` and `y`.

---
class: middle
count: false

# Loops

---

## `for` loops

Simplest, and most common type of loop in R - given a vector iterate through the elements and evaluate the code block for each.

```r
for(x in 1:10)
{
  cat(x^2,"")
}
```

```
## 1 4 9 16 25 36 49 64 81 100
```

```r
for(y in list(1:3, LETTERS[1:7], c(TRUE,FALSE)))
{
  cat(length(y),"")
}
```

```
## 3 7 2
```

---

## `while` loops

Repeat until the given condition is **not** met (i.e. evaluates to `FALSE`)

```r
i = 1
res = rep(NA,10)

while (i <= 10) {
  res[i] = i^2
  i = i+1
}

res
```

```
##  [1]   1   4   9  16  25  36  49  64  81 100
```

---

## `repeat` loops

Repeat until `break`

```r
i = 1
res = rep(NA,10)

repeat {
  res[i] = i^2
  i = i+1
  if (i > 10)
    break
}

res
```

```
##  [1]   1   4   9  16  25  36  49  64  81 100
```

---
class: split-50

## Special keywords - `break` and `next`

These are special actions that only work *inside* of a loop

* `break` - ends the current *loop* (inner-most)
* `next` - ends the current *iteration*

.column[

```r
for(i in 1:10) {
    if (i %% 2 == 0)
        break
    cat(i,"")
}
```

```
## 1
```
]

.column[

```r
for(i in 1:10) {
    if (i %% 2 == 0)
        next
    cat(i,"")
}
```

```
## 1 3 5 7 9
```
]

---
class: split-50

## Some helper functions

Often we want to use a loop across the indexes of an object and not the elements themselves. There are several useful functions to help you do this: `:`, `length`, `seq`, `seq_along`, `seq_len`, etc.

```r
4:7
```

```
## [1] 4 5 6 7
```

```r
length(4:7)
```

```
## [1] 4
```

```r
seq(4,7,by=1)
```

```
## [1] 4 5 6 7
```

```r
seq_along(4:7)
```

```
## [1] 1 2 3 4
```

```r
seq_len(length(4:7))
```

```
## [1] 1 2 3 4
```

---

## Exercise 2

Below is the list of primes between 2 and 100:
```
2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 
43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97
```

If you were given the vector `x = c(3, 4, 12, 19, 23, 48, 50, 61, 63, 78)`, write out the R code necessary to print only the values of `x` that are *not* prime (without using subsetting or the `%in%` operator).

Your code should use *nested* loops to iterate through the vector of primes and `x`.

---
class: middle
count: false

# Functions

---

## When to use functions

The goal of a function should be to encapsulate a *small* *reusable* piece of code.

* Name should make it clear what the function does (think in terms of simple verbs).

* Functionality should be simple enough to be quickly understood.

* The smaller and more modular the code the easier it will be to reuse elsewhere.

* Better to change code in one location than code everywhere.

---

## Function Parts

The two parts of a function are the arguments (`formals`) and the code (`body`).

```r
gcd = function(long1, lat1, long2, lat2) {
  R = 6371 # Earth mean radius in km
  # distance in km
  acos(sin(lat1)*sin(lat2) + cos(lat1)*cos(lat2) * cos(long2-long1)) * R
}

formals(gcd)
```

```
## $long1
## 
## 
## $lat1
## 
## 
## $long2
## 
## 
## $lat2
```

```r
body(gcd)
```

```
## {
##     R = 6371
##     acos(sin(lat1) * sin(lat2) + cos(lat1) * cos(lat2) * cos(long2 - 
##         long1)) * R
## }
```

---

## Return values

There are two approaches to returning values: explicit and implicit return values.

*Explicit* - includes one or more `return` statements

```r
f = function(x) {
  return(x*x)
}
```

<br/>

*Implicit* - value of the last statement is returned.

```r
f = function(x) {
  x*x
}
```

---

## Returning multiple values

If we want a function to return more than one value we can group things using either vectors or lists.

```r
f = function(x) {
  c(x, x^2, x^3)
}

f(2)
```

```
## [1] 2 4 8
```

```r
f(2:3)
```

```
## [1]  2  3  4  9  8 27
```

---
class: split-50

## Argument names

When defining a function we are also implicitly defining names for the arguments, when calling the function we can use these names to pass arguments in a different order.

```r
f = function(x,y,z) {
  paste0("x=",x," y=",y," z=",z)
}
```

.column[

```r
f(1,2,3)
```

```
## [1] "x=1 y=2 z=3"
```

```r
f(z=1,x=2,y=3)
```

```
## [1] "x=2 y=3 z=1"
```
]

.column[

```r
f(y=2,1,3)
```

```
## [1] "x=1 y=2 z=3"
```

```r
f(y=2,1,x=3)
```

```
## [1] "x=3 y=2 z=1"
```
]

```r
f(1,2,3,m=1)
```

```
## Error in f(1, 2, 3, m = 1): unused argument (m = 1)
```

---

## Argument defaults

It is also possible to give function arguments default values so that they don't need to be provided every time the function is called.

```r
f = function(x,y=1,z=1) {
  paste0("x=",x," y=",y," z=",z)
}
```

```r
f()
```

```
## Error in paste0("x=", x, " y=", y, " z=", z): argument "x" is missing, with no default
```

```r
f(x=3)
```

```
## [1] "x=3 y=1 z=1"
```

```r
f(y=2,2)
```

```
## [1] "x=2 y=2 z=1"
```

---

## Scope

R has generous scoping rules, if it can't find a variable in the functions body, it will look for it in the next higher scope, and so on.

```r
y = 1
f = function(x) {
  x+y
}
f(3)
```

```
## [1] 4
```

```r
g = function(x) {
  y=2
  x+y
}
g(3)
```

```
## [1] 5
```

---

Additionally, variables defined within a scope only persist for the duration of that scope, and do not overwrite variables at a higher scopes (unless you use the global assignment operator `<<-`, *which you shouldn't*)

```r
x = 1
y = 1
z = 1
f = function()
{
    y = 2
    g = function()
    {
      z = 3
      return(x + y + z)
    }
    return(g())
}
f()
```

```
## [1] 6
```

```r
c(x,y,z)
```

```
## [1] 1 1 1
```

---

## Exercise 3

What is the output of the following code? Explain why.

```r
z = 1

f = function(x,y,z)
{
  z = x+y

g = function(m=x,n=y)
  {
    m/z + n/z
  }

z * g()
}

f(1,2,3)
```

---

## Lazy evaluation

Arguments to R functions are lazily evaluated - meaning they are not evaluated until they are used

```r
f = function(x)
{
  cat("Hello world!\n")
  x
}

f(stop())
```

```
## Hello world!
```

```
## Error in f(stop()):
```

---

## Everything is a function

```r
`+`
```

```
## function (e1, e2)  .Primitive("+")
```

```r
typeof(`+`)
```

```
## [1] "builtin"
```

```r
x = 4:1
`+`(x,2)
```

```
## [1] 6 5 4 3
```

---

## Getting Help

Prefixing any function name with a `?` will open the related help file for that function.

```r
?`+`
?sum
```

For functions not in the base package, you can generally see their implementation by entering the function name without parentheses (or using the `body` function).

```r
lm
```

```
## function (formula, data, subset, weights, na.action, method = "qr", 
##     model = TRUE, x = FALSE, y = FALSE, qr = TRUE, singular.ok = TRUE, 
##     contrasts = NULL, offset, ...) 
## {
##     ret.x <- x
##     ret.y <- y
##     cl <- match.call()
##     mf <- match.call(expand.dots = FALSE)
##     m <- match(c("formula", "data", "subset", "weights", "na.action", 
##         "offset"), names(mf), 0L)
##     mf <- mf[c(1L, m)]
##     mf$drop.unused.levels <- TRUE
##     mf[[1L]] <- quote(stats::model.frame)
##     mf <- eval(mf, parent.frame())
##     if (method == "model.frame") 
##         return(mf)
##     else if (method != "qr") 
##         warning(gettextf("method = '%s' is not supported. Using 'qr'", 
##             method), domain = NA)
##     mt <- attr(mf, "terms")
##     y <- model.response(mf, "numeric")
##     w <- as.vector(model.weights(mf))
##     if (!is.null(w) && !is.numeric(w)) 
##         stop("'weights' must be a numeric vector")
##     offset <- as.vector(model.offset(mf))
##     if (!is.null(offset)) {
##         if (length(offset) != NROW(y)) 
##             stop(gettextf("number of offsets is %d, should equal %d (number of observations)", 
##                 length(offset), NROW(y)), domain = NA)
##     }
##     if (is.empty.model(mt)) {
##         x <- NULL
##         z <- list(coefficients = if (is.matrix(y)) matrix(, 0, 
##             3) else numeric(), residuals = y, fitted.values = 0 * 
##             y, weights = w, rank = 0L, df.residual = if (!is.null(w)) sum(w != 
##             0) else if (is.matrix(y)) nrow(y) else length(y))
##         if (!is.null(offset)) {
##             z$fitted.values <- offset
##             z$residuals <- y - offset
##         }
##     }
##     else {
##         x <- model.matrix(mt, mf, contrasts)
##         z <- if (is.null(w)) 
##             lm.fit(x, y, offset = offset, singular.ok = singular.ok, 
##                 ...)
##         else lm.wfit(x, y, w, offset = offset, singular.ok = singular.ok, 
##             ...)
##     }
##     class(z) <- c(if (is.matrix(y)) "mlm", "lm")
##     z$na.action <- attr(mf, "na.action")
##     z$offset <- offset
##     z$contrasts <- attr(x, "contrasts")
##     z$xlevels <- .getXlevels(mt, mf)
##     z$call <- cl
##     z$terms <- mt
##     if (model) 
##         z$model <- mf
##     if (ret.x) 
##         z$x <- x
##     if (ret.y) 
##         z$y <- y
##     if (!qr) 
##         z$qr <- NULL
##     z
## }
## <bytecode: 0x7fe35c8fa578>
## <environment: namespace:stats>
```

---

## Less Helpful Examples

```r
list
```

```
## function (...)  .Primitive("list")
```

```r
`[`
```

```
## .Primitive("[")
```

```r
sum
```

```
## function (..., na.rm = FALSE)  .Primitive("sum")
```

```r
`+`
```

```
## function (e1, e2)  .Primitive("+")
```

---
count: false

# Acknowledgments

Above materials are derived in part from the following sources:

* Hadley Wickham - [Advanced R](http://adv-r.had.co.nz/)
* [R Language Definition](http://stat.ethz.ch/R-manual/R-devel/doc/manual/R-lang.html)