# Packages

``````library(DBI)
library(dbplyr)
library(dplyr)
library(nycflights13)
library(Lahman)``````

# Data

Create an in-memory database and add `airports` from `nycflights13` to the database.

``````con <- DBI::dbConnect(RSQLite::SQLite(), dbname = ":memory:")
copy_to(con, df = nycflights13::airports, "airports")``````
``airports_db <- tbl(con, "airports")``

Also, add some baseball data from the `Lahman` package to our in-memory database.

``````dbWriteTable(con, "batting", Batting)
dbWriteTable(con, "pitching", Pitching)
dbWriteTable(con, "teams", Teams)``````

# Exercise 1

## Problem

What are the corresponding SQL verbs based on the `dplyr` structure below?

``````airport_car <- airports_db %>%
filter(lat >= 33.7666, lat <= 36.588, lon >= -84.3201, lon <= -75.4129) %>%
arrange(desc(alt)) %>%
select(name, alt)``````

## Solution

``````airport_car %>%
show_query()``````
``````#> <SQL>
#> SELECT `name`, `alt`
#> FROM (SELECT *
#> FROM (SELECT *
#> FROM `airports`
#> WHERE ((`lat` >= 33.7666) AND (`lat` <= 36.588) AND (`lon` >= -84.3201) AND (`lon` <= -75.4129)))
#> ORDER BY `alt` DESC)``````

# Exercise 2

## Problem

Add `Salaries` from package `Lahman` as a table to your in-memory database.

## Solution

``dbWriteTable(con, "salaries", Salaries)``

# Exercise 3

## Problem

Compute the team salaries for each team in 2016 and display the 5 teams with the highest payroll. Which team had the lowest payroll in that year?

## Solution

``````dbGetQuery(con, paste("SELECT teamID, SUM(salary) as payroll",
"FROM salaries",
"WHERE yearID = 2016",
"GROUP BY teamID",
"ORDER BY payroll DESC",
"LIMIT 5"))``````

# Exercise 4

## Problem

Which 10 teams had the highest winning percentage since 1990? Hint: https://www.w3schools.com/sql/func_sqlserver_cast.asp

## Solution

``````dbGetQuery(con, paste("SELECT yearID, teamID, W, L, CAST(W as FLOAT) / G as wpct",
"FROM teams",
"WHERE yearID >= 1990",
"GROUP BY yearID, teamID",
"ORDER BY wpct DESC",
"LIMIT 10"))``````

# Exercise 5

## Problem

How would you combine the batting and salaries tables to match up the players and years? Take a look at `?dplyr::join`. Try to combine the R data frame objects `Batting` and `Salaries`.

## Solution

``left_join(Batting, Salaries, by = c("playerID", "yearID"))``