class: center, middle, inverse, title-slide

# Lab 6
## Programming for Statistical Science

---

## Package `rvest`

`rvest` is a package authored by Hadley Wickham that makes basic processing and
manipulation of HTML data easy.

```r
library(rvest)
```

Core functions:

| Function            | Description                                                       |
|---------------------|-------------------------------------------------------------------|
| `xml2::read_html()` | read HTML from a character string or connection                   |
| `html_nodes()`      | select specified nodes from the HTML document using CSS selectors |
| `html_table()`      | parse an HTML table into a data frame                             |
| `html_text()`       | extract the content between a tag pair                            |
| `html_name()`       | extract tags' names                                               |
| `html_attrs()`      | extract all of each tag's attributes                              |
| `html_attr()`       | extract a tag's attribute value by name                           |

---

## Today's objectives

- Follow along as the TAs get you started with web scraping and give a brief
  overview of [SelectorGadget](https://selectorgadget.com/).

- Complete Lab 6. Fork the template repo at
  [sta523-fa20/lab6](https://github.com/sta523-fa20/lab6).

- Work with those in your breakout room.

- This is not graded.

- Ask questions about recent course material.
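
---

## Example: core functions on a small page

The core `rvest` functions listed earlier can be sketched on a tiny page, as a minimal illustration only. The HTML string, its `title` class, and the table contents are made up for this example and are not part of Lab 6; a real page would be read from a URL instead of a string.

```r
library(rvest)

# Hypothetical HTML, invented for illustration (not taken from the lab)
html <- '<html><body>
  <h1 class="title">Lab 6</h1>
  <a href="https://selectorgadget.com/">SelectorGadget</a>
  <table>
    <tr><th>name</th><th>score</th></tr>
    <tr><td>a</td><td>1</td></tr>
    <tr><td>b</td><td>2</td></tr>
  </table>
</body></html>'

# Parse the string into an HTML document
page <- read_html(html)

# Select nodes with CSS selectors, then extract their text
title_nodes <- html_nodes(page, "h1.title")
title <- html_text(title_nodes)

# Tag name and attributes of the link
link_nodes <- html_nodes(page, "a")
link_tag   <- html_name(link_nodes)          # tag name of each node
link_url   <- html_attr(link_nodes, "href")  # one attribute, by name
link_attrs <- html_attrs(link_nodes)         # all attributes, one list entry per node

# html_table() on a set of nodes returns a list; keep the first table
scores <- html_table(html_nodes(page, "table"))[[1]]
```

In practice the CSS selector passed to `html_nodes()` is usually found with SelectorGadget rather than written by hand.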