class: center, middle, inverse, title-slide

# Lab 3
## Statistical Computing & Programming
### 06-02-20

---

## Getting started

- Navigate to your private team repo, `lab3-[github_team]`

<br/>

- Open an RStudio (Pawn or Rook) session; then go to
  - `File` > `New Project`
  - select `Version Control`
  - select `Git`
  - paste the repository URL
    - available at your GitHub repo `lab3-[github_team]` when you click `Clone or download` and then `Clone with HTTPS`
  - click `Create Project`

<br/>

- Since this is a team-based lab, consider using branches.

<br/><br/>

<i>
You may do this on your local machine if you have git configured with R/RStudio.
</i>

---

## Package `rvest`

`rvest` is a package authored by Hadley Wickham that makes basic processing and
manipulation of HTML data easy.

```r
library(rvest)
```

Core functions:

| Function            | Description                                                       |
|---------------------|-------------------------------------------------------------------|
| `xml2::read_html()` | read HTML from a character string or connection                   |
| `html_nodes()`      | select specified nodes from the HTML document using CSS selectors |
| `html_table()`      | parse an HTML table into a data frame                             |
| `html_text()`       | extract the text content between a tag pair                       |
| `html_name()`       | extract tags' names                                               |
| `html_attrs()`      | extract all of each tag's attributes                              |
| `html_attr()`       | extract a tag's attribute value by name                           |

---

## Today's objectives

- Follow along as the TA gets you started with web scraping and gives a brief overview of [SelectorGadget](https://selectorgadget.com/).

- Complete Lab 3 (you have 24 hours to submit this team-based lab)
  - Work with those in your group in a breakout room
  - Grade is for effort and completion
  - This lab will be helpful for Homework 3

- Challenge yourself to create a spatial visualization for the data you scrape
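
---

## Sketch: core `rvest` functions

The core functions in the `rvest` table can be tried on a small inline HTML string before pointing them at a real page. The snippet below is only a minimal sketch: the HTML, the URLs, and the variable names (`page`, `links`, `scores`) are made up for illustration and are not part of the lab. In practice, the CSS selectors passed to `html_nodes()` would typically come from SelectorGadget.

```r
library(rvest)

# Hypothetical HTML (not from the lab), parsed from a string so no
# network access is needed
html <- '<html><body>
  <h1 class="title">Team roster</h1>
  <a href="https://example.com/a" id="first">Link A</a>
  <a href="https://example.com/b">Link B</a>
  <table>
    <tr><th>name</th><th>score</th></tr>
    <tr><td>Ann</td><td>90</td></tr>
    <tr><td>Bob</td><td>85</td></tr>
  </table>
</body></html>'

page <- read_html(html)               # parse the character string

links <- html_nodes(page, "a")        # select all <a> nodes by CSS selector

link_text  <- html_text(links)        # text between each tag pair
link_urls  <- html_attr(links, "href")  # one attribute, by name
link_names <- html_name(links)        # tag names
link_attrs <- html_attrs(links)       # all attributes, one element per node

# html_table() on a node set returns a list of data frames; take the first
scores <- html_table(html_nodes(page, "table"))[[1]]

print(link_text)
print(link_urls)
print(scores)
```

For a real page, replacing the string with a URL in `read_html()` should be the only change needed to the first step; the selectors will differ by site.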