# Data structures and subsetting
## Statistical Computing & Programming
### Shawn Santo

---

## Supplementary materials

Full video lecture available in Zoom Cloud Recordings

Additional resources

- [Sections 3.3 - 3.4](https://adv-r.hadley.nz/vectors-chap.html#attributes) Advanced R
- [Chapter 4](https://adv-r.hadley.nz/subsetting.html) Advanced R

---

## To-do list

Before the next lab

- complete the Introductory Survey,

- join our GitHub organization: https://github.com/sta323-523-sp21,

- watch the Warpwire video on subsetting data structures.

---

# Recall

---

## Atomic vector creation

We can use functions such as `c()`, `vector()`, and  `:` to create atomic
vectors.

```r
c(5, 10, pi, 0, -sqrt(3))
```

```
#> [1]  5.000000 10.000000  3.141593  0.000000 -1.732051
```

```r
vector(mode = "character", length = 4)
```

```
#> [1] "" "" "" ""
```

```r
vector(mode = "integer", length = 3)
```

```
#> [1] 0 0 0
```

```r
-10:-3
```

```
#> [1] -10  -9  -8  -7  -6  -5  -4  -3
```

---

## Generic vector creation

Function `list()` allows us to create a generic vector.

```r
x <- list(
 a = -100:100, 
 b = list(lower = letters, upper = LETTERS),
 cars_data = cars
 )

str(x)
```

```
#> List of 3
#>  $ a        : int [1:201] -100 -99 -98 -97 -96 -95 -94 -93 -92 -91 ...
#>  $ b        :List of 2
#>   ..$ lower: chr [1:26] "a" "b" "c" "d" ...
#>   ..$ upper: chr [1:26] "A" "B" "C" "D" ...
#>  $ cars_data:'data.frame':	50 obs. of  2 variables:
#>   ..$ speed: num [1:50] 4 4 7 7 8 9 10 10 10 11 ...
#>   ..$ dist : num [1:50] 2 10 4 22 16 10 18 26 34 17 ...
```

---

# Attributes

---

## Data structures

You may have heard of factors, matrices, arrays, and date-times. These are
just atomic vectors with special attributes.

- Attributes attach metadata to an object.

- Function `attr()` can retrieve and modify a single attribute.
 
 ```r
 attr(x, which) # get attribute
 attr(x, which) <- value # set / modify attribute
 ```

- Function `attributes()` can retrieve and set attributes en masse.
 
 ```r
 attributes(x) # get attributes
 attributes(x) <- value # set / modify attributes
 ```
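
As a minimal sketch of both functions in action, the example below attaches metadata to a small numeric vector. The vector `temps` and the attribute name `units` are made up for illustration.

```r
# hypothetical example vector; "units" is an arbitrary attribute name
temps <- c(20.5, 22.1, 19.8)

attr(temps, "units") <- "celsius"   # set a single attribute
attr(temps, "units")                # get it back

# set several attributes at once (this replaces all existing attributes)
attributes(temps) <- list(names = c("mon", "tue", "wed"), units = "celsius")
attributes(temps)

# subsetting keeps names but drops the custom "units" attribute
attributes(temps[1:2])
```

Most attributes do not survive operations such as subsetting, so they are best treated as fragile metadata.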
 
---

## Attribute: `names`

Get or set the names of an object.

**One option:**

```r
x <- 1:4
attributes(x)
```

```
#> NULL
```

```r
attr(x = x, which = "names") <- c("a", "b", "c", "d")
attributes(x)
```

```
#> $names
#> [1] "a" "b" "c" "d"
```

```r
x
```

```
#> a b c d 
#> 1 2 3 4
```

---

**Another option:**

```r
a <- 1:4
names(a) <- c("a", "b", "c", "d")
attributes(a)
```

```
#> $names
#> [1] "a" "b" "c" "d"
```

```r
a
```

```
#> a b c d 
#> 1 2 3 4
```

Either method is okay to use, but since the replacement function option
exists, it is best to stick with that.

---

## Attribute: `dim`

Get or set the dimension of an object.

```r
z <- 1:9
z
```

```
#> [1] 1 2 3 4 5 6 7 8 9
```

```r
attr(x = z, which = "dim") <- c(3, 3)
attributes(z)
```

```
#> $dim
#> [1] 3 3
```

```r
z
```

```
#>      [,1] [,2] [,3]
#> [1,]    1    4    7
#> [2,]    2    5    8
#> [3,]    3    6    9
```

We have a 3 x 3 matrix.

---

```r
y <- matrix(z, nrow = 3, ncol = 3)
attributes(y)
```

```
#> $dim
#> [1] 3 3
```

```r
y
```

```
#>      [,1] [,2] [,3]
#> [1,]    1    4    7
#> [2,]    2    5    8
#> [3,]    3    6    9
```

---

## Exercise

Create a 3 x 3 x 2 array using the `dim` attribute with the vector below.

```r
x <- c(5, 1, 5, 5, 1, 1, 5, 3, 2, 3, 2, 6, 4, 4, 1, 2, 1, 3)
```

Try to create the same array using function `array()`. What do you notice about
how the array object is populated?

???

## Solution

```r
x <- c(5, 1, 5, 5, 1, 1, 5, 3, 2, 
 3, 2, 6, 4, 4, 1, 2, 1, 3)
attr(x = x, which = "dim") <- c(3, 3, 2)
x
```

```
#> , , 1
#> 
#>      [,1] [,2] [,3]
#> [1,]    5    5    5
#> [2,]    1    1    3
#> [3,]    5    1    2
#> 
#> , , 2
#> 
#>      [,1] [,2] [,3]
#> [1,]    3    4    2
#> [2,]    2    4    1
#> [3,]    6    1    3
```

```r
attributes(x)
```

```
#> $dim
#> [1] 3 3 2
```

```r
array(x, dim = c(3, 3, 2))
```


---

## Factors

Factors are built on top of integer vectors with two attributes: `class` and
`levels`. Factors are how R stores and represents categorical data.

A quick way to create a categorical variable as a factor is with function
`factor()`.

```r
x <- factor(c("walk", "single", "double", "triple", "home run"))
x
```

```
#> [1] walk     single   double   triple   home run
#> Levels: double home run single triple walk
```

```r
typeof(x)
```

```
#> [1] "integer"
```

```r
attributes(x)
```

```
#> $levels
#> [1] "double"   "home run" "single"   "triple"   "walk"    
#> 
#> $class
#> [1] "factor"
```

---

## Ordered factors

To induce an ordering we can use function `ordered()` as opposed to `factor()`.

```r
y <- ordered(c("walk", "single", "double", "triple", "home run"), 
 levels = c("walk", "single", "double", "triple", "home run"))
y
```

```
#> [1] walk     single   double   triple   home run
#> Levels: walk < single < double < triple < home run
```

```r
attributes(y)
```

```
#> $levels
#> [1] "walk"     "single"   "double"   "triple"   "home run"
#> 
#> $class
#> [1] "ordered" "factor"
```

```r
str(y)
```

```
#> Ord.factor w/ 5 levels "walk"<"single"<..: 1 2 3 4 5
```

---

## Exercise

Create a factor vector based on the vector of airport codes below. Try to do
it without using function `factor()`.

```r
airports <- c("RDU", "ABE", "DTW", "GRR", "RDU", "GRR", "GNV",
 "JFK", "JFK", "SFO", "DTW")
```

Assume all the possible levels are

```r
c("RDU", "ABE", "DTW", "GRR", "GNV", "JFK", "SFO")
```

*Hint*: Think about what type of object factors are built on.

What if the possible levels are

```r
c("RDU", "ABE", "DTW", "GRR", "GNV", "JFK", "SFO", "GSO", "ORD", "PHL")
```

???

## Solution
.tiny[

```r
z <- as.integer(c(1,2,3,4,1,4,5,6,6,7,3))
attr(x = z, which = "levels") <- c("RDU", "ABE", "DTW", 
 "GRR", "GNV", "JFK", "SFO")
attr(x = z, which = "class") <- "factor"
z
```

```
#>  [1] RDU ABE DTW GRR RDU GRR GNV JFK JFK SFO DTW
#> Levels: RDU ABE DTW GRR GNV JFK SFO
```

```r
attributes(z)
```

```
#> $levels
#> [1] "RDU" "ABE" "DTW" "GRR" "GNV" "JFK" "SFO"
#> 
#> $class
#> [1] "factor"
```
]

---

## Matrices and arrays

- Homogeneous in their type.

- Matrices are populated based on column major ordering (use `byrow` argument
  to change this).
  
- Arrays can have one, two, or more dimensions.
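
A small sketch of the fill order, using two hypothetical matrices `by_col` and `by_row` built from the same values:

```r
by_col <- matrix(1:6, nrow = 2)                # default: filled column by column
by_row <- matrix(1:6, nrow = 2, byrow = TRUE)  # filled row by row

by_col[1, ]  # 1 3 5
by_row[1, ]  # 1 2 3
```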

---

## Data frames

Data frames are built on top of lists with attributes: `names`, `row.names`,
and `class`. Here the class is `data.frame`.

```r
typeof(longley)
```

```
#> [1] "list"
```

```r
attributes(longley)
```

```
#> $names
#> [1] "GNP.deflator" "GNP"          "Unemployed"   "Armed.Forces" "Population"  
#> [6] "Year"         "Employed"    
#> 
#> $class
#> [1] "data.frame"
#> 
#> $row.names
#>  [1] 1947 1948 1949 1950 1951 1952 1953 1954 1955 1956 1957 1958 1959 1960 1961
#> [16] 1962
```

Here `names` refers to variable names.

---

## Data frame characteristics

- Data frames can be heterogeneous across columns.

- Data frames are rectangular in structure (not always tidy).

- They have column names and row names.

- Data frames can be subset by name or position.

---

## Data frame creation by setting attributes

Start with a list

```r
x <- list(c("48501", "48507", "48505"),
 c(3, 4, 21),
 c(2, 1, 2))
str(x)
```

```
#> List of 3
#>  $ : chr [1:3] "48501" "48507" "48505"
#>  $ : num [1:3] 3 4 21
#>  $ : num [1:3] 2 1 2
```

Add attributes

```r
attributes(x) <- list(class = "data.frame",
 names = c("zip", "lead_value", "time"),
 row.names = 1:3)
```

---

Then we have a data frame

```r
x
```

```
#>     zip lead_value time
#> 1 48501          3    2
#> 2 48507          4    1
#> 3 48505         21    2
```

```r
str(x)
```

```
#> 'data.frame':	3 obs. of  3 variables:
#>  $ zip       : chr  "48501" "48507" "48505"
#>  $ lead_value: num  3 4 21
#>  $ time      : num  2 1 2
```

Of course, we could have used function `data.frame()` to create our data
frame object. There is also function `tibble::tibble()`, which creates a
tibble object. A tibble is similar to a data frame but has two additional class components.

```r
tibble::tibble(x)
```

```
#> # A tibble: 3 x 3
#>   zip   lead_value  time
#>   <chr>      <dbl> <dbl>
#> 1 48501          3     2
#> 2 48507          4     1
#> 3 48505         21     2
```
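
For comparison, a sketch of the `data.frame()` route with the same values. The object name `df2` is arbitrary; the result carries the same three attributes that were set by hand above.

```r
df2 <- data.frame(zip        = c("48501", "48507", "48505"),
                  lead_value = c(3, 4, 21),
                  time       = c(2, 1, 2))

class(df2)              # "data.frame"
names(attributes(df2))  # names, class, and row.names are all present
```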

---

## Length coercion

Length coercion (recycling) is stricter for data frames than for atomic vectors.

```r
data.frame(x = 1:3, y = c("a"))
```

```
#>   x y
#> 1 1 a
#> 2 2 a
#> 3 3 a
```


```r
data.frame(x = 1:3, 
           y = c("a","b"))
```

```
#> Error in data.frame(x = 1:3, y = c("a", "b")): arguments imply differing number of rows: 3, 2
```

If the length of the longest vector is not a whole multiple of a shorter
vector's length, an error will occur.
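
A minimal sketch of the rule with two hypothetical calls: recycling succeeds when the longer length is a multiple of the shorter one and fails otherwise. `tryCatch()` is used here only to capture the error message instead of stopping.

```r
# 4 is a multiple of 2, so y is recycled twice
ok <- data.frame(x = 1:4, y = c("a", "b"))
ok

# 3 is not a multiple of 2, so data.frame() signals an error
msg <- tryCatch(
  data.frame(x = 1:3, y = c("a", "b")),
  error = function(e) conditionMessage(e)
)
msg
```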

What do you think will happen here?

```r
data.frame(num       = 1:6,
           treatment = c(0, 10, 20),
           type      = c("a", "b"))
```

---

## Summary

| Data Structure | Built On              | Attribute(s)                  | Quick creation                 |
|----------------|-----------------------|-------------------------------|--------------------------------|
| Matrix, Array  | Atomic vector         | `dim`                         | `matrix()`, `array()`          |
| Factor         | Atomic integer vector | `class`, `levels`             | `factor()`, `ordered()`        |
| Date           | Atomic double vector  | `class`                       | `as.Date()`                    |
| Date-times     | Atomic double vector  | `class`                       | `as.POSIXct()`, `as.POSIXlt()` |
| Data frame     | List                  | `class`, `names`, `row.names` | `data.frame()`                 |
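
The `Date` row can be checked directly. In this sketch the date is arbitrary; removing the class with `unclass()` reveals the underlying double, which counts days since 1970-01-01.

```r
d <- as.Date("1970-01-10")

typeof(d)      # "double"
attributes(d)  # only a class attribute, "Date"
unclass(d)     # 9, the number of days since 1970-01-01
```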


---

# Subsetting

---

## Subsetting techniques

R has three operators (functions) for subsetting the vectors we've discussed:
1. `[`
2. `[[`
3. `$`

Which one you use will depend on the object you are working with, its
attributes, and what you want as a result.

We can subset with

- numeric values
- logicals
- `NULL`, `NA`
- character values

---

## Numeric (positive) subsetting

**Indexing begins at 1, not 0.** 
.tiny[

```r
x <- c("NC", "SC", "VA", "TN")
y <- list(states = x, rank = 1:4, message = "")
```
]

.tiny.pull-left[
**Atomic vector**

```r
x[1]
```

```
#> [1] "NC"
```

```r
x[c(1, 3)]
```

```
#> [1] "NC" "VA"
```

```r
x[c(1:5)]
```

```
#> [1] "NC" "SC" "VA" "TN" NA
```

```r
x[c(2.2, 3.9)]
```

```
#> [1] "SC" "VA"
```

]

.tiny.pull-right[
**List**

```r
str(y[1])
```

```
#> List of 1
#>  $ states: chr [1:4] "NC" "SC" "VA" "TN"
```

```r
str(y[c(1, 3)])
```

```
#> List of 2
#>  $ states : chr [1:4] "NC" "SC" "VA" "TN"
#>  $ message: chr ""
```

```r
str(y[c(1:4)])
```

```
#> List of 4
#>  $ states : chr [1:4] "NC" "SC" "VA" "TN"
#>  $ rank   : int [1:4] 1 2 3 4
#>  $ message: chr ""
#>  $ NA     : NULL
```
]

---

## Numeric (negative) subsetting

```r
x <- c("NC", "SC", "VA", "TN")
y <- list(states = x, rank = 1:4, message = "")
```

.tiny.pull-left[
**Atomic vector**

```r
x[-1]
```

```
#> [1] "SC" "VA" "TN"
```

```r
x[-c(1, 3)]
```

```
#> [1] "SC" "TN"
```

```r
x[c(-1, 3)]
```

```
#> Error in x[c(-1, 3)]: only 0's may be mixed with negative subscripts
```

```r
x[-c(2.2, 3.9)]
```

```
#> [1] "NC" "TN"
```

]

.tiny.pull-right[
**List**

```r
str(y[-1])
```

```
#> List of 2
#>  $ rank   : int [1:4] 1 2 3 4
#>  $ message: chr ""
```

```r
str(y[-c(1, 3)])
```

```
#> List of 1
#>  $ rank: int [1:4] 1 2 3 4
```

```r
str(y[c(-1, 3)])
```

```
#> Error in y[c(-1, 3)]: only 0's may be mixed with negative subscripts
```

```r
str(y[-c(2.2, 3.9)])
```

```
#> List of 2
#>  $ states : chr [1:4] "NC" "SC" "VA" "TN"
#>  $ message: chr ""
```
]

---

## Logical subsetting

Logical subsetting returns the elements that correspond to `TRUE` in the logical
vector. The logical vector is expected to have the same length as the vector
being subset; a shorter logical vector is recycled.

.tiny.pull-left[
**Atomic vector**

```r
x <- c(1, 4, 7, 12)
x[c(TRUE, TRUE, FALSE, TRUE)]
```

```
#> [1]  1  4 12
```

```r
x[c(TRUE, FALSE)]
```

```
#> [1] 1 7
```

```r
x[x %% 2 == 0]
```

```
#> [1]  4 12
```
]

.tiny.pull-right[
**List**

```r
y <- list(1, 4, 7, 12)
str(y[c(TRUE, TRUE, FALSE, TRUE)])
```

```
#> List of 3
#>  $ : num 1
#>  $ : num 4
#>  $ : num 12
```

```r
str(y[c(TRUE, FALSE)])
```

```
#> List of 2
#>  $ : num 1
#>  $ : num 7
```

```r
str(y[y %% 2 == 0])
```
```
#> Error in y%%2: non-numeric argument to binary operator
```
]
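
The last list example fails because `%%` is not defined for lists. One possible workaround, sketched here with base R's `vapply()`, is to build the logical vector element by element and then subset with it. The name `is_even` is made up for illustration.

```r
y <- list(1, 4, 7, 12)

# apply the test to each element; logical(1) declares one TRUE/FALSE per element
is_even <- vapply(y, function(v) v %% 2 == 0, logical(1))
is_even           # FALSE TRUE FALSE TRUE

str(y[is_even])   # a list holding 4 and 12
```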

---

## Empty subsetting

Returns the original vector.

```r
x <- c(1,4,7)
x[]
```

```
#> [1] 1 4 7
```

```r
y <- list(1,4,7)
str(y[])
```

```
#> List of 3
#>  $ : num 1
#>  $ : num 4
#>  $ : num 7
```

---

## Zero subsetting

Returns an empty vector of the same type as the vector being subset.

```r
x <- c(1,4,7)
y <- list(1,4,7)
```

```r
x[0]
```

```
#> numeric(0)
```

```r
str(y[0])
```

```
#>  list()
```

```r
x[c(0,1)]
```

```
#> [1] 1
```

```r
y[c(0,1)]
```

```
#> [[1]]
#> [1] 1
```

---

## Character subsetting

If a vector has names, you can select elements whose names correspond to the 
character vector.

```r
x <- c(a = 1, b = 4, c = 7)
x["a"]
```

```
#> a 
#> 1
```

```r
x[c("a", "a")]
```

```
#> a a 
#> 1 1
```

```r
x[c("c", "b")]
```

```
#> c b 
#> 7 4
```

```r
y <- list(a = 1, b = 4, c = 7)
str(y["a"])
```

```
#> List of 1
#>  $ a: num 1
```

```r
str(y[c("a", "a")])
```

```
#> List of 2
#>  $ a: num 1
#>  $ a: num 1
```

```r
str(y[c("c", "b")])
```

```
#> List of 2
#>  $ c: num 7
#>  $ b: num 4
```

---

## Missing and NULL subsetting

```r
x <- c(1, 4, 7)
x[NA]
```

```
#> [1] NA NA NA
```

```r
x[NULL]
```

```
#> numeric(0)
```

```r
x[c(1, NA)]
```

```
#> [1]  1 NA
```

```r
y <- list(1, 4, 7)
str(y[NA])
```

```
#> List of 3
#>  $ : NULL
#>  $ : NULL
#>  $ : NULL
```

```r
str(y[NULL])
```

```
#>  list()
```

```r
str(y[c(1, NA)])
```

```
#> List of 2
#>  $ : num 1
#>  $ : NULL
```
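
Why does `NA` alone return three values while `c(1, NA)` returns two? A short sketch of the reason: a bare `NA` is a logical value, so it is recycled like any logical index, whereas a numeric `NA` is a single missing position.

```r
x <- c(1, 4, 7)

typeof(NA)       # "logical"
x[NA]            # logical NA is recycled to length 3
x[NA_integer_]   # one missing integer index gives one NA
```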

---

## Exercise

Consider the vectors `x` and `y` below.

```r
x <- letters[1:5]
y <- list(i = 1:5, j = -3:3, k = rep(0, 4))
```

What is the difference between subsetting with `[` and `[[` using integers? Try
various positive numeric indices.

---

## Understanding `[` vs. `[[` with lists

How do you get a shopping cart with only the cheese and bananas?

How do you get the bananas out of the cart?
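
The two questions can be sketched with a hypothetical `cart` list (the item names and values are made up): `[` returns a smaller cart, while `[[` takes one item out of the cart.

```r
# hypothetical shopping cart stored as a named list
cart <- list(cheese = "cheddar", bananas = c("b1", "b2", "b3"), milk = "2%")

smaller_cart <- cart[c("cheese", "bananas")]  # still a list (a cart)
str(smaller_cart)

bananas <- cart[["bananas"]]                  # the element itself
bananas
```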

---

## Using `$` for subsetting lists

The `$` operator only works with named lists and works similarly to `[[`. Unlike `[[`, it partially matches names.
.tiny.pull-left[

```r
x <- list(a = 1:3, 
 ab = 4:6, 
 abc = 7:9)
x
```

```
#> $a
#> [1] 1 2 3
#> 
#> $ab
#> [1] 4 5 6
#> 
#> $abc
#> [1] 7 8 9
```

```r
x$a
```

```
#> [1] 1 2 3
```

```r
x$ab
```

```
#> [1] 4 5 6
```
]

.tiny.pull-right[

```r
y <- list(a = 1:3, 
 abc = 4:6, 
 abde = 7:9)
y
```

```
#> $a
#> [1] 1 2 3
#> 
#> $abc
#> [1] 4 5 6
#> 
#> $abde
#> [1] 7 8 9
```

```r
y$a
```

```
#> [1] 1 2 3
```

```r
y$abd
```

```
#> [1] 7 8 9
```
]
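
The last result comes from partial matching: `$` accepts any unambiguous prefix of a name. A brief sketch contrasting it with `[[`, which matches exactly unless `exact = FALSE` is supplied:

```r
y <- list(a = 1:3, abc = 4:6, abde = 7:9)

y$abd                      # 7 8 9: "abd" is a unique prefix of "abde"
y$ab                       # NULL: "ab" is a prefix of both "abc" and "abde"
y[["abd"]]                 # NULL: `[[` requires an exact match by default
y[["abd", exact = FALSE]]  # 7 8 9: partial matching on request
```

Because a partial match can silently pick up the wrong element, `[[` with a full name is the safer choice in scripts.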

---

# Subsetting matrices, arrays, and data frames

---

## Subsetting matrices and arrays

```r
(x <- matrix(1:6, nrow = 2, ncol = 3))
```

```
#>      [,1] [,2] [,3]
#> [1,]    1    3    5
#> [2,]    2    4    6
```

```r
x[1, 3]
```

```
#> [1] 5
```

```r
x[1:2, 1:2]
```

```
#>      [,1] [,2]
#> [1,]    1    3
#> [2,]    2    4
```

```r
x[, 1:2]
```

```
#>      [,1] [,2]
#> [1,]    1    3
#> [2,]    2    4
```

```r
x[-1, -3]
```

```
#> [1] 2 4
```

---

## Do I always get a matrix (array) in return?

```r
x[1, ]
```

```
#> [1] 1 3 5
```

```r
attributes(x[1, ])
```

```
#> NULL
```

```r
x[, 2]
```

```
#> [1] 3 4
```

```r
attributes(x[, 2])
```

```
#> NULL
```

For matrices and arrays, `[` has an argument `drop` that defaults to `TRUE` and
coerces the result to the lowest possible dimension.

```r
x[1, , drop = FALSE]
```

```
#>      [,1] [,2] [,3]
#> [1,]    1    3    5
```

```r
attributes(x[1, , drop = FALSE])
```

```
#> $dim
#> [1] 1 3
```

---

## Preserving vs simplifying subsetting

| Type           | Simplifying           | Preserving                                     |
|:---------------|:----------------------|:-----------------------------------------------|
| Atomic Vector  | `x[[1]]`              | `x[1]`                                         |
| List           | `x[[1]]`              | `x[1]`                                         |
| Matrix / Array | `x[1, ]`, `x[, 1]`    | `x[1, , drop = FALSE]`, `x[, 1, drop = FALSE]` |
| Factor         | `x[1:4, drop = TRUE]` | `x[1:4]`                                       |
| Data frame     | `x[, 1]`, `x[[1]]`    | `x[, 1, drop = FALSE]`, `x[1]`                 |

By preserving we mean retaining the attributes. It is good practice to use
`drop = FALSE` when subsetting an n-dimensional object, where $n > 1$.

For factors, the `drop` argument controls whether unused levels are dropped.
It defaults to `drop = FALSE`, so all levels are kept.
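
A short sketch of the factor case, using a made-up factor `f`:

```r
f <- factor(c("a", "b", "c", "a"))

levels(f[1:2])               # "a" "b" "c": the unused level "c" is kept
levels(f[1:2, drop = TRUE])  # "a" "b": unused levels are dropped
```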

---

## Subsetting data frames

Recall that data frames are lists with attributes `class`, `names`, `row.names`.
Thus, they can be subset using `[`, `[[`, and `$`. They also support
matrix-style subsetting (specify rows and columns to subset).

```r
df <- data.frame(coin = c("BTC", "ETH", "XRP"),
 price = c(10417.04, 172.52, .26),
 vol = c(21.29, 8.07, 1.23)
 )
```

What will the following return?

```r
df[1]

df[c(1, 3)]

df[1:2, 3]

df[, "price"]
```

```r
df[[1]]

df[["vol"]]

df[[c(1, 3)]]

df[[1, 3]]
```

???

What will the following return?

```r
df[1]
```

```
#>   coin
#> 1  BTC
#> 2  ETH
#> 3  XRP
```

```r
df[c(1, 3)]
```

```
#>   coin   vol
#> 1  BTC 21.29
#> 2  ETH  8.07
#> 3  XRP  1.23
```

```r
df[1:2, 3]
```

```
#> [1] 21.29  8.07
```

```r
df[, "price"]
```

```
#> [1] 10417.04   172.52     0.26
```

```r
df[[1]]
```

```
#> [1] "BTC" "ETH" "XRP"
```

```r
df[["vol"]]
```

```
#> [1] 21.29  8.07  1.23
```

```r
df[[c(1, 3)]]
```

```
#> [1] "XRP"
```

```r
df[[1, 3]]
```

```
#> [1] 21.29
```
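
Matrix-style subsetting also accepts a logical vector in the row position, which is a common way to filter rows. The sketch below reuses the same `df`; the threshold of 100 is arbitrary.

```r
df <- data.frame(coin = c("BTC", "ETH", "XRP"),
                 price = c(10417.04, 172.52, .26),
                 vol = c(21.29, 8.07, 1.23))

expensive <- df[df$price > 100, ]                      # rows 1 and 2, all columns
expensive

coins_vec <- df[df$price > 100, "coin"]                # simplifies to a vector
coins_df  <- df[df$price > 100, "coin", drop = FALSE]  # stays a data frame
```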

---

# Subsetting extras

---

## Subassignment

Indexing can occur on the right-hand-side of an expression for extraction or
on the left-hand-side for replacement.

```r
x <- c(1, 4, 7)
```

```r
x[2] <- 2
x
```

```
#> [1] 1 2 7
```

```r
x[x %% 2 != 0] <- x[x %% 2 != 0] + 1
x
```

```
#> [1] 2 2 8
```

```r
x[c(1, 1, 1, 1)] <- c(0, 7, 2, 3)
```

What is `x` now?

```r
x
```

```
#> [1] 3 2 8
```

???

Subassignment is done sequentially, so if an index is specified more than
once, the last value assigned to that index is the one that remains.
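
A further sketch of the same behavior with mixed indices (the values are arbitrary): index 1 is assigned twice, and the later value wins.

```r
x <- c(1, 4, 7)
x[c(1, 2, 1)] <- c(10, 20, 30)  # index 1 receives 10 first, then 30
x                               # 30 20 7
```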

---

```r
x <- 1:6
x[c(2, NA)] <- 1
x
```

```
#> [1] 1 1 3 4 5 6
```

```r
x <- 1:6
x[c(TRUE, NA)] <- 1
x
```

```
#> [1] 1 2 1 4 1 6
```

```r
x <- 1:6
x[c(-1, -3)] <- 3
x
```

```
#> [1] 1 3 3 3 3 3
```

```r
x <- 1:6
x[] <- 6:1
x
```

```
#> [1] 6 5 4 3 2 1
```
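
Empty subassignment differs from plain assignment: `x[] <- value` replaces the contents but keeps the object's structure. A small sketch with two hypothetical matrices, `m1` and `m2`:

```r
m1 <- matrix(1:6, nrow = 2)
m1[] <- 0   # every element replaced; the dim attribute is kept
dim(m1)     # 2 3

m2 <- matrix(1:6, nrow = 2)
m2 <- 0     # the name now refers to a new length-one vector
dim(m2)     # NULL
```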

---

## Adding list and data frame elements

```r
df <- data.frame(
 x = rnorm(4),
 y = rt(4, df = 1)
)
```

```r
df$z <- rchisq(4, df = 1)
df
```

```
#>            x          y         z
#> 1 -0.5518461  5.7648271 0.7712077
#> 2 -0.9270803 -0.4806014 1.4487278
#> 3 -1.0078601  3.3526089 1.4287586
#> 4  1.4708991  3.5458261 2.3065770
```

```r
df["a"] <- rexp(4)
df
```

```
#>            x          y         z         a
#> 1 -0.5518461  5.7648271 0.7712077 0.3581307
#> 2 -0.9270803 -0.4806014 1.4487278 0.8275527
#> 3 -1.0078601  3.3526089 1.4287586 1.7513987
#> 4  1.4708991  3.5458261 2.3065770 0.7897827
```
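
The same operators add elements to a plain list. A minimal sketch with a made-up list `l`:

```r
l <- list(a = 1:3)

l$b <- "text"               # add with $
l[["c"]] <- c(TRUE, FALSE)  # add with [[
str(l)
```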

---

## Removing list and data frame elements

```r
df <- data.frame(coin = c("BTC", "ETH", "XRP"),
 price = c(10417.04, 172.52, .26),
 vol = c(21.29, 8.07, 1.23)
 )
```

```r
df["coin"] <- NULL
str(df)
```

```
#> 'data.frame':	3 obs. of  2 variables:
#>  $ price: num  10417.04 172.52 0.26
#>  $ vol  : num  21.29 8.07 1.23
```

```r
df[[1]] <- NULL
str(df)
```

```
#> 'data.frame':	3 obs. of  1 variable:
#>  $ vol: num  21.29 8.07 1.23
```

```r
df$vol <- NULL
str(df)
```

```
#> 'data.frame':	3 obs. of  0 variables
```
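
For a plain list, assigning `NULL` removes the element in the same way. To keep an element whose value is `NULL`, assign `list(NULL)` with `[` instead. A sketch with a made-up list `l`:

```r
l <- list(a = 1, b = 2, c = 3)

l$a <- NULL           # removes element a
l["b"] <- list(NULL)  # keeps element b, now holding NULL
str(l)
```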

---

## Exercises

Use the built-in data frame `longley` to answer the following questions.

1. Which year was the percentage of people employed relative to the population
   highest? Return the result as a data frame.
   
2. The Korean War took place from 1950 to 1953. Filter the data frame so it only
   contains data from those years.
   
3. Which years did the number of people in the armed forces exceed the number
   of people unemployed? Give the result as an atomic vector.
   
???

## Solutions

1.
.tiny[

```r
longley[which.max(longley$Employed / longley$Population), 
        "Year", drop=FALSE]
```

```
#>      Year
#> 1956 1956
```
]
2.
.tiny[

```r
longley[longley$Year %in% 1950:1953, ]
```

```
#>      GNP.deflator     GNP Unemployed Armed.Forces Population Year Employed
#> 1950         89.5 284.599      335.1        165.0    110.929 1950   61.187
#> 1951         96.2 328.975      209.9        309.9    112.075 1951   63.221
#> 1952         98.1 346.999      193.2        359.4    113.270 1952   63.639
#> 1953         99.0 365.385      187.0        354.7    115.094 1953   64.989
```
]
3.
.tiny[

```r
longley$Year[longley$Armed.Forces > longley$Unemployed]
```

```
#> [1] 1951 1952 1953 1955 1956
```
]

---

## References

1. Wickham, H. (2021). Advanced R. https://adv-r.hadley.nz/