---
title: "More Subsetting + S3 Objects"
author: "Colin Rundel"
date: "2018-09-10"
output:
  xaringan::moon_reader:
    css: "slides.css"
    lib_dir: libs
    nature:
      highlightStyle: github
      highlightLines: true
      countIncrementalSlides: false
---
exclude: true

```{r, message=FALSE, warning=FALSE, include=FALSE}
options(
  htmltools.dir.version = FALSE, # for blogdown
  width=80
)

htmltools::tagList(rmarkdown::html_dependency_font_awesome())
```

---
class: middle
count: false

# Subsetting Matrices, Data Frames, and Arrays

---

## Subsetting Matrices

```{r}
(x = matrix(1:6, nrow=2, ncol=3))
```

.pull-left[
```{r}
x[1,3]
x[1:2, 1:2]
```
]

.pull-right[
```{r}
x[, 1:2]
x[-1,-3]
```
]

---

## Preserving Subsetting

Most of the time, R's `[` subset operator is a *preserving* operator: the returned object has the same type as the parent. Confusingly, when used with a matrix or array, `[` becomes a *simplifying* operator (it does not preserve type) by default. This behavior is controlled by the `drop` argument, which defaults to `TRUE`.

.pull-left[
```{r}
x[1, ]
x[1, , drop=TRUE]
x[1, , drop=FALSE]
```
]

.pull-right[
```{r}
str(x[1, ])
str(x[1, , drop=TRUE])
str(x[1, , drop=FALSE])
```
]

---

## Preserving vs Simplifying Subsets

Type            | Simplifying              | Preserving
:---------------|:-------------------------|:-----------------------------------------------
Atomic Vector   | `x[[1]]`                 | `x[1]`
List            | `x[[1]]`                 | `x[1]`
Matrix / Array  | `x[1, ]`<br/>`x[, 1]`    | `x[1, , drop=FALSE]`<br/>`x[, 1, drop=FALSE]`
Factor          | `x[1:4, drop=TRUE]`      | `x[1:4]`
Data frame      | `x[, 1]`<br/>`x[[1]]`    | `x[, 1, drop=FALSE]`<br/>`x[1]`

---

## Factor Subsetting

```{r}
(x = factor(c("BS", "MS", "PhD", "MS")))
x[1:2]
x[1:2, drop=TRUE]
```

---

## Data Frame Subsetting

When given a single index, a data frame is subset like a list and returns one or more columns. When given two indices (rows and columns), it is subset like a matrix.

```{r}
df = data.frame(a = 1:2, b = 3:4)
df[1]
df[[1]]
df[, "a"]
```

---

##

```{r}
df["a"]
df[, "a", drop = FALSE]
df[1,]
df[c("a","b","a")]
```

---
class: middle
count: false

# Subsetting and assignment

---

## Subsetting and assignment

Subsets can also be used with assignment to update specific values within an object.

```{r}
x = c(1, 4, 7)
```

```{r}
x[2] = 2
x
x[x %% 2 != 0] = x[x %% 2 != 0] + 1
x
x[c(1,1)] = c(2,3)
x
```

---

.pull-left[
```{r}
x = 1:6
x[c(2,NA)] = 1
x
```

```{r}
x = 1:6
x[c(TRUE,NA)] = 1
x
```
]

.pull-right[
```{r}
x = 1:6
x[c(-1,-3)] = 3
x
```

```{r}
x = 1:6
x[] = 6:1
x
```
]

---

## Deleting list (df) elements

```{r}
df = data.frame(a = 1:2, b = TRUE, c = c("A", "B"))
```

```{r}
df[["b"]] = NULL
str(df)
```

```{r}
df[,"c"] = NULL
str(df)
```

---

## Subsets of Subsets

```{r}
df = data.frame(a = c(5,1,NA,3))
```

```{r}
df$a[df$a == 5] = 0
df
```

```{r}
df[1][df[1] == 3] = 0
df
```

---
class: middle
count: false

# S3 Objects

---

## What is S3?
> S3 is R’s first and simplest OO system. It is the only OO system used in the base and stats packages, and it’s the most commonly used system in CRAN packages. S3 is informal and ad hoc, but it has a certain elegance in its minimalism: you can’t take away any part of it and still have a useful OO system.

--Hadley Wickham, Advanced R

.footnote[
* S3 should not be confused with R's other object oriented systems: S4, Reference classes, and R6*.
]

---

## An example

.pull-left[
```{r}
print( c("A","B","A","C") )
print( factor(c("A","B","A","C")) )
```
]

.pull-right[
```{r}
print( data.frame(a=1:3, b=4:6) )
```
]

--
```{r}
print
```

---

## Other examples

.pull-left[
```{r}
mean
t.test
```
]

.pull-right[
```{r}
summary
plot
```
]

```{r}
sum
```

---

## What's going on?

S3 objects and their related functions rely on a very simple dispatch mechanism: a generic function's sole job is to call `UseMethod`, which looks at the class of the object and calls the class-specific function named using the convention `generic.class`. If no such function exists, `generic.default` is called instead.

We can see all of the specialized versions of a generic using the `methods` function.

```{r}
methods("plot")
```

---

.small[
```{r}
methods("print")
```
]

---

```{r}
print.data.frame
```

---

```{r error=TRUE}
print.matrix
```

--

```{r}
print.default
```

---

## The other way

If instead we have a class and want to know which specialized functions exist for it, we can again use the `methods` function, this time with the `class` argument.

```{r}
methods(class="data.frame")
```

---
class: small

```{r}
`[.data.frame`
```

---

## Adding methods

.pull-left[
```{r}
x = structure(c(1,2,3), class="x")
x
```
]

.pull-right[
```{r}
y = structure(c(1,2,3), class="y")
y
```
]

--
.pull-left[
```{r}
# Methods should accept the generic's arguments: print(x, ...)
print.x = function(x, ...) print("Class x!")
x
```
]

.pull-right[
```{r}
print.y = function(x, ...) print("Class y!")
y
```
]
--
.pull-left[
```{r}
class(x) = "y"
x
```
]

.pull-right[
```{r}
class(y) = "x"
y
```
]
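---

## Adding methods (sketch)

The dispatch rule described earlier (the generic calls `UseMethod`, which picks `generic.class` and otherwise falls back to `generic.default`) can be checked with a small sketch. The class name `z` and the method body are hypothetical and exist only for illustration.

```{r}
# Hypothetical class "z", used only for illustration
print.z = function(x, ...) {
  cat("Class z with", length(x), "values\n")
  invisible(x)  # print methods conventionally return their input invisibly
}

z = structure(c(1,2,3), class = "z")

z           # autoprinting calls print(), which dispatches to print.z
unclass(z)  # class attribute removed, so print.default is used instead
```

Including `...` in the method signature keeps it compatible with the `print` generic, so extra arguments passed to `print()` do not cause an "unused argument" error.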
---

## Defining a new S3 Generic

```{r}
shuffle = function(x, ...) {
  UseMethod("shuffle")
}

# Methods keep the generic's signature (x, ...)
shuffle.default = function(x, ...) {
  n = length(x)
  x[sample(seq_len(n),n)]
}

shuffle.data.frame = function(x, ...) {
  n = length(x)  # number of columns
  # drop = FALSE keeps a one-column data frame a data frame
  x[, sample(seq_len(n),n), drop = FALSE]
}
```

--

.pull-left[
```{r}
shuffle( 1:10 )
shuffle( letters[1:5] )
```
]

.pull-right[
```{r}
shuffle( data.frame(a=1:2, b=3:4, c=5:6) )
```
]

---
class: middle
count: false

# Tibbles

---

## Modern data frames

The tibble package (by Hadley Wickham and collaborators) provides a modern take on data frames. Its documentation describes tibbles as data frames that are *lazy* and *surly*: they do less (no changing of variable names or types, no partial matching) and complain more (for example, when a variable does not exist).

```{r}
library(tibble)
class(iris)
tbl_iris = as_tibble(iris)
class(tbl_iris)
```

---

## Fancy Printing

```{r}
tbl_iris
```

---

## Fancier printing

```{r}
# tibble() replaces the deprecated data_frame()
tibble(x = rnorm(10,sd=5), y = rnorm(10))
```

---

## Tibbles are lazy

```{r}
tbl_iris[1,]
```

.pull-left[
```{r}
tbl_iris[,"Species"]
```
]

--

.pull-right[
```{r}
tibble(
  x = 1:3,
  y = c("A","B","C")
)
```
]

---

## Reverting a tbl

```{r}
d = tibble(
  x = 1:3,
  y = c("A","B","C")
)
d
```

.pull-left[
```{r}
data.frame(d)
```
]

.pull-right[
```{r}
class(d) = "data.frame"
d
```
]

---

## Multiple classes

```{r}
d = tibble(
  x = 1:3,
  y = c("A","B","C")
)
class(d)
```

--
```{r}
class(d) = rev(class(d))
class(d)
d
```

---
class: middle
count: false

# HW2 & GitHub Pull Requests

---

## Acknowledgments

Above materials are derived in part from the following sources:

* Hadley Wickham - [Advanced R](http://adv-r.had.co.nz/)
* [R Language Definition](http://stat.ethz.ch/R-manual/R-devel/doc/manual/R-lang.html)