---
title: "Logic in R"
author: "Colin Rundel"
date: "2017-01-09"
output:
  xaringan::moon_reader:
    css: "slides.css"
    lib_dir: libs
    nature:
      highlightStyle: github
      highlightLines: true
      countIncrementalSlides: false
---
exclude: true

```{r, message=FALSE, warning=FALSE, include=FALSE}
options(
  htmltools.dir.version = FALSE, # for blogdown
  width=80
)

library(emo)
htmltools::tagList(rmarkdown::html_dependency_font_awesome())
```

---
class: middle
count: false

# (Almost) Everything is a Vector

---

## Types of vectors

The fundamental building blocks of data in R are vectors (collections of related values, objects, other data structures, etc).

R has two fundamental vector classes:

* Vectors (atomic vectors) - collections of values that are all of the *same* type (e.g. all logical values, all numbers, or all character strings).

* Lists (generic vectors) - collections of *any* type of R object, even other lists (meaning they can have a hierarchical/tree-like structure).

---

## Atomic Vectors

R has six atomic vector types:

| typeof     | mode      | storage.mode
|:-----------|:----------|:-------------
| logical    | logical   | logical
| double     | numeric   | double
| integer    | numeric   | integer
| character  | character | character
| complex    | complex   | complex
| raw        | raw       | raw

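
As a quick check of two rows of the table, the sketch below applies `typeof`, `mode`, and `storage.mode` to an integer and a double. The variable names `int_val` and `dbl_val` are illustrative and not part of the original material.

```{r}
# Illustrative values only: the L suffix creates an integer,
# a plain number literal is stored as a double
int_val = 2L
dbl_val = 2

typeof(int_val)
mode(int_val)
storage.mode(int_val)

typeof(dbl_val)
mode(dbl_val)
storage.mode(dbl_val)
```

Both values report the same `mode` but differ in `typeof` and `storage.mode`, which is the distinction the integer and double rows of the table draw.
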

For now we'll mainly worry about the first type; we'll discuss the next three next time (the last two almost never come up).

---
count: false

# Conditionals

---

## Logical (boolean) operations

|  Operator  |  Operation    |  Vectorized?
|:-----------|:--------------|:-------------
| <code>x &#124; y</code>        |  or           |   Yes
| `x & y`    |  and          |   Yes
| `!x`       |  not          |   Yes
| <code>x &#124;&#124; y</code>  |  or           |   No
| `x && y`   |  and          |   No
|`xor(x,y)`  |  exclusive or |   Yes

---
class: split-50

## Vectorized?

```{r}
x = c(TRUE,FALSE,TRUE)
y = c(FALSE,TRUE,TRUE)
```

.column[
```{r error=TRUE}
x | y
x || y # an error in R >= 4.3.0
```
]

.column[
```{r error=TRUE}
x & y
x && y # an error in R >= 4.3.0
```
]

---
class: split-50

## Length coercion

```{r}
x = c(TRUE,FALSE,TRUE)
y = c(TRUE)
z = c(FALSE,TRUE)
```

.column[
```{r}
x | y
y | z
```
]

.column[
```{r}
x & y
y & z
```
]

```{r}
x | z
```

---

## Comparisons

  Operator  |  Comparison                |  Vectorized?
:-----------|:---------------------------|:-----------------
 `x < y`    |  less than                 |  Yes
 `x > y`    |  greater than              |  Yes
 `x <= y`   |  less than or equal to     |  Yes
 `x >= y`   |  greater than or equal to  |  Yes
 `x != y`   |  not equal to              |  Yes
 `x == y`   |  equal to                  |  Yes
 `x %in% y` |  is an element of          |  Yes (for `x`)

---
class: split-50

## Comparisons

```{r}
x = c("A","B","C")
z = c("A")
```

.column[
```{r}
x == z
x != z
x > z
```
]

.column[
```{r}
x %in% z
z %in% x
```
]

---

## Conditional Control Flow

Conditional execution of code blocks is achieved via `if` statements. *Note that `if` statements are **not** vectorized.*

```{r error=TRUE}
x = c(3,1)

if (3 %in% x)
  "Here!"

# length-2 condition: an error in R >= 4.2.0
if (x >= 2)
  "Now Here!"
```

---
class: split-50

## Collapsing logical vectors

There are a couple of helper functions for collapsing a logical vector down to a single value: `any`, `all`

```{r}
x = c(3,4)
```

.column[
```{r}
any(x >= 2)
all(x >= 2)
```
]

.column[
```{r}
!any(x >= 2)
if (any(x >= 2))
  print("Now There!")
```
]

---

## Nesting Conditionals

```{r}
x = 3
if (x < 0) {
  "Negative"
} else if (x > 0) {
  "Positive"
} else {
  "Zero"
}
```

```{r}
x = 0
if (x < 0) {
  "Negative"
} else if (x > 0) {
  "Positive"
} else {
  "Zero"
}
```

---
class: middle
count: false

# Error Checking

---

## `stop` and `stopifnot`

Often we want to validate user input or function arguments - if our assumptions are not met we usually want to report the error and stop execution.

```{r error=TRUE}
ok = FALSE
if (!ok)
  stop("Things are not ok.")

stopifnot(ok)
```

*Note - an error (like the one generated by `stop`) will prevent an RMarkdown document from compiling unless `error=TRUE` is set for that code block.*

---

## Style choices

```{r eval=FALSE}
# Do stuff
if (condition_one) {
  ##
  ## Do stuff
  ##
} else if (condition_two) {
  ##
  ## Do other stuff
  ##
} else if (condition_error) {
  stop("Condition error occurred")
}
```

```{r eval=FALSE}
# Do stuff better
if (condition_error) {
  stop("Condition error occurred")
}

if (condition_one) {
  ##
  ## Do stuff
  ##
} else if (condition_two) {
  ##
  ## Do other stuff
  ##
}
```

---

## Exercise 1

Write a set of conditional(s) that satisfies the following requirements:

* If `x` is greater than 3 and `y` is less than or equal to 3 then print "Hello world!"

* Otherwise if `x` is greater than 3 print "!dlrow olleH"

* If `x` is less than or equal to 3 then print "Something else ..."

* Stop execution and report an error if `x` is odd and `y` is even; in that case don't print any of the text strings above.

Test out your code by trying various values of `x` and `y`.

---
class: middle
count: false

# Loops

---

## `for` loops

Simplest, and most common type of loop in R - given a vector iterate through the elements and evaluate the code block for each.

```{r}
for(x in 1:10) {
  cat(x^2,"")
}
```

```{r}
for(y in list(1:3, LETTERS[1:7], c(TRUE,FALSE))) {
  cat(length(y),"")
}
```

---

## `while` loops

Repeat until the given condition is **not** met (i.e. evaluates to `FALSE`)

```{r}
i = 1
res = rep(NA,10)

while (i <= 10) {
  res[i] = i^2
  i = i+1
}

res
```

---

## `repeat` loops

Repeat until `break`

```{r}
i = 1
res = rep(NA,10)

repeat {
  res[i] = i^2
  i = i+1
  if (i > 10)
    break
}

res
```

---
class: split-50

## Special keywords - `break` and `next`

These are special actions that only work *inside* of a loop

* `break` - ends the current *loop* (inner-most)
* `next` - ends the current *iteration*

.column[
```{r}
for(i in 1:10) {
  if (i %% 2 == 0)
    break
  cat(i,"")
}
```
]

.column[
```{r}
for(i in 1:10) {
  if (i %% 2 == 0)
    next
  cat(i,"")
}
```
]

---
class: split-50

## Some helper functions

Often we want to use a loop across the indexes of an object and not the elements themselves. There are several useful functions to help you do this: `:`, `length`, `seq`, `seq_along`, `seq_len`, etc.

```{r}
4:7
length(4:7)
seq(4,7,by=1)
seq_along(4:7)
seq_len(length(4:7))
```

---

## Exercise 2

Below is the list of primes between 2 and 100:

```
2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97
```

If you were given the vector `x = c(3, 4, 12, 19, 23, 48, 50, 61, 63, 78)`, write out the R code necessary to print only the values of `x` that are *not* prime (without using subsetting or the `%in%` operator).

Your code should use *nested* loops to iterate through the vector of primes and `x`.

---
class: middle
count: false

# Functions

---

## When to use functions

The goal of a function should be to encapsulate a *small* *reusable* piece of code.

* Name should make it clear what the function does (think in terms of simple verbs).

* Functionality should be simple enough to be quickly understood.

* The smaller and more modular the code the easier it will be to reuse elsewhere.

* Better to change code in one location than code everywhere.

---

## Function Parts

The two parts of a function are the arguments (`formals`) and the code (`body`).

```{r}
# Great-circle distance; sin() and cos() expect coordinates in radians
gcd = function(long1, lat1, long2, lat2) {
  R = 6371 # Earth mean radius in km

  # distance in km
  acos(sin(lat1)*sin(lat2) + cos(lat1)*cos(lat2) * cos(long2-long1)) * R
}

formals(gcd)
body(gcd)
```

---

## Return values

There are two approaches to returning values: explicit and implicit return values.

*Explicit* - includes one or more `return` statements

```{r}
f = function(x) {
  return(x*x)
}
```

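
To illustrate the "one or more" part, here is a minimal sketch of a function with several `return` statements, each exiting the function as soon as its branch is reached. The function `sign_label` is a hypothetical example and not part of the original material.

```{r}
# Hypothetical example: every branch exits through its own return()
sign_label = function(x) {
  if (x < 0) {
    return("Negative")
  }
  if (x > 0) {
    return("Positive")
  }
  return("Zero")
}

sign_label(-2)
sign_label(0)
```
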

*Implicit* - value of the last statement is returned.

```{r}
f = function(x) {
  x*x
}
```

---

## Returning multiple values

If we want a function to return more than one value we can group things using either vectors or lists.

```{r}
f = function(x) {
  c(x, x^2, x^3)
}

f(2)
f(2:3)
```

---
class: split-50

## Argument names

When defining a function we are also implicitly defining names for the arguments; when calling the function we can use these names to pass arguments in a different order.

```{r}
f = function(x,y,z) {
  paste0("x=",x," y=",y," z=",z)
}
```

.column[
```{r,error=TRUE}
f(1,2,3)
f(z=1,x=2,y=3)
```
]

.column[
```{r,error=TRUE}
f(y=2,1,3)
f(y=2,1,x=3)
```
]

```{r,error=TRUE}
f(1,2,3,m=1)
```

---

## Argument defaults

It is also possible to give function arguments default values so that they don't need to be provided every time the function is called.

```{r error=TRUE}
f = function(x,y=1,z=1) {
  paste0("x=",x," y=",y," z=",z)
}
```

```{r error=TRUE}
f()
f(x=3)
f(y=2,2)
```

---

## Scope

R has generous scoping rules: if it can't find a variable in the function's body, it will look for it in the next higher scope, and so on.

```{r}
y = 1

f = function(x) {
  x+y
}

f(3)
```

```{r}
g = function(x) {
  y=2
  x+y
}

g(3)
```

---

## Scope (continued)

Additionally, variables defined within a scope only persist for the duration of that scope, and do not overwrite variables at higher scopes (unless you use the global assignment operator `<<-`, *which you shouldn't*)

```{r}
x = 1
y = 1
z = 1

f = function() {
  y = 2
  g = function() {
    z = 3
    return(x + y + z)
  }
  return(g())
}

f()
c(x,y,z)
```

---

## Exercise 3

What is the output of the following code? Explain why.

```{r eval=FALSE}
z = 1

f = function(x,y,z) {
  z = x+y

  g = function(m=x,n=y) {
    m/z + n/z
  }

  z * g()
}

f(1,2,3)
```

---

## Lazy evaluation

Arguments to R functions are lazily evaluated - meaning they are not evaluated until they are used

```{r, error=TRUE}
f = function(x) {
  cat("Hello world!\n")
  x
}

f(stop())
```

---

## Everything is a function

```{r}
`+`
typeof(`+`)
x = 4:1
`+`(x,2)
```

---

## Getting Help

Prefixing any function name with a `?` will open the related help file for that function.

```{r, eval=FALSE}
?`+`
?sum
```

For functions not in the base package, you can generally see their implementation by entering the function name without parentheses (or using the `body` function).

```{r}
lm
```

---

## Less Helpful Examples

```{r}
list
`[`
sum
`+`
```

---
count: false

# Acknowledgments

Above materials are derived in part from the following sources:

* Hadley Wickham - [Advanced R](http://adv-r.had.co.nz/)
* [R Language Definition](http://stat.ethz.ch/R-manual/R-devel/doc/manual/R-lang.html)