--- title: "Lab 5" subtitle: "Statistical Computing & Programming" author: "" institute: "" date: "06-16-20" output: xaringan::moon_reader: css: "slides.css" lib_dir: libs nature: highlightStyle: github highlightLines: true countIncrementalSlides: false editor_options: chunk_output_type: console --- ```{r include=FALSE} knitr::opts_chunk$set(echo = TRUE, message = FALSE, warning = TRUE, comment = "#>", highlight = TRUE, fig.align = "center") ``` ## Getting started - Navigate to your team repo, `lab5-[github_teamname]`
- Open an RStudio (Pawn or Rook) session; then go to - `File` > `New Project` - select `Version Control` - select `Git` - paste the repository URL - available at your GitHub repo `lab5-[github_teamname]` when you click `Clone or download` and then `Clone with HTTPS` - Click `Create Project`
- This is a team lab.

You may do this on your local machine if you have git configured with R/RStudio. --- ## Introduction A database is a structured set of data. The terminology is slightly different when working with a database management system compared to working with data in R. - field: variable or quantity - record: collection of fields - table: collection of records with all the same fields - database: collection of tables The relationship between R terminology and database terminology is explained below. | **R terminology** | **Database terminology** | |---------------------------|--------------------------| | column | field | | row | record | | data frame | table | | types of columns | table schema | | collection of data frames | database | SQL (structured query language) allows you to directly interact with a database and perform tasks such as pulling data and making updates. --- ## Verb connections | SQL | `dplyr` | |---------:|:----------------------------------------| | SELECT | `select()` | | table | data frame | | WHERE | `filter()` pre-aggregation/calculation | | GROUP_BY | `group_by()` | | HAVING | `filter()` post-aggregation/calculation | | ORDER BY | `arrange()` with possibly a `desc()` | | LIMIT | `slice()` | --- ## SQL arithmetic and comparison operators SQL supports the standard `+`, `-`, `*`, `/`, and `%` (modulo) arithmetic operators and the following comparison operators.
| Operator | Description | |:--------:|:-------------------------| | `=` | Equal to | | `>` | Greater than | | `<` | Less than | | `>=` | Greater than or equal to | | `<=` | Less than or equal to | | `<>` | Not equal to | --- ## SQL logical operators | Operator | Description | |----------:|:-------------------------------------------------------------| | `ALL` | TRUE if all of the subquery values meet the condition | | `AND` | TRUE if all the conditions separated by AND is TRUE | | `ANY` | TRUE if any of the subquery values meet the condition | | `BETWEEN` | TRUE if the operand is within the range of comparisons | | `EXISTS` | TRUE if the subquery returns one or more records | | `IN` | TRUE if the operand is equal to one of a list of expressions | | `LIKE` | TRUE if the operand matches a pattern | | `NOT` | Displays a record if the condition(s) is NOT TRUE | | `OR` | TRUE if any of the conditions separated by OR is TRUE | | `SOME` | TRUE if any of the subquery values meet the condition | --- ## Today's objectives - Follow along as the TA gets you started. - Complete Lab 5 (24 hours to submit this team lab) - Work with those in your group in a breakout room - Grade is for effort and completion - This lab will be helpful for the final homework assignment - Homework 4 - If you finish early, ask questions about your project