class: center, middle, inverse, title-slide

# Functions and automation
🤖

---

layout: true

<div class="my-footer">
<span>
Dr. Mine Çetinkaya-Rundel -
<a href="http://www2.stat.duke.edu/courses/Fall18/sta112.01/schedule" target="_blank">stat.duke.edu/courses/Fall18/sta112.01
</a>
</span>
</div>

---

class: center, middle

# Functions

---

## Setup

```r
library(tidyverse)
library(rvest)

st <- read_html("http://www.imdb.com/title/tt4574334/")
twd <- read_html("http://www.imdb.com/title/tt1520211/")
got <- read_html("http://www.imdb.com/title/tt0944947/")
```

---

## Why functions?

- Automate common tasks in a more powerful and general way than copy-and-pasting:
    - You can give a function an evocative name that makes your code easier to understand.
    - As requirements change, you only need to update code in one place, instead of many.
    - You eliminate the chance of making incidental mistakes when you copy and paste (e.g. updating a variable name in one place, but not in another).

--

- Down the line: Improve your reach as a data scientist by writing functions (and packages!) that others use

---

## When should you write a function?

Whenever you’ve copied and pasted a block of code more than twice.

<br>

--

.question[
Do you see any problems in the code below?
]

.midi[
```r
st_episodes <- st %>%
  html_nodes(".np_right_arrow .bp_sub_heading") %>%
  html_text() %>%
  str_replace(" episodes", "") %>%
  as.numeric()

got_episodes <- got %>%
  html_nodes(".np_right_arrow .bp_sub_heading") %>%
  html_text() %>%
  str_replace(" episodes", "") %>%
  as.numeric()

twd_episodes <- got %>%
  html_nodes(".np_right_arrow .bp_sub_heading") %>%
  html_text() %>%
  str_replace(" episodes", "") %>%
  as.numeric()
```
]

---

## Inputs

.question[
How many inputs does the following code have?
]

```r
st_episodes <- st %>%
  html_nodes(".np_right_arrow .bp_sub_heading") %>%
  html_text() %>%
  str_replace(" episodes", "") %>%
  as.numeric()
```

---

## Turn your code into a function

- Pick a short but informative **name**, preferably a verb.

<br>
<br>
<br>
<br>
<br>

```r
scrape_episodes <- 
```

---

## Turn your code into a function

- Pick a short but informative **name**, preferably a verb.
- List inputs, or **arguments**, to the function inside `function`. If we had more, the call would look like `function(x, y, z)`.

<br>
<br>

```r
scrape_episodes <- function(x){
  
}
```

---

## Turn your code into a function

- Pick a short but informative **name**, preferably a verb.
- List inputs, or **arguments**, to the function inside `function`. If we had more, the call would look like `function(x, y, z)`.
- Place the **code** you have developed in the body of the function, a `{` block that immediately follows `function(...)`.

```r
scrape_episodes <- function(x){
  x %>%
    html_nodes(".np_right_arrow .bp_sub_heading") %>%
    html_text() %>%
    str_replace(" episodes", "") %>%
    as.numeric()
}
```

--

```r
scrape_episodes(st)
```

```
## [1] 25
```

---

## Check your function

<img src="img/episode_twd.png" width="541" height="120" />

```r
scrape_episodes(twd)
```

```
## [1] 132
```

<img src="img/episode_got.png" width="543" height="120" />

```r
scrape_episodes(got)
```

```
## [1] 73
```

---

## Naming functions

> "There are only two hard things in Computer Science: cache invalidation and naming things." - Phil Karlton

--

- Names should be short but clearly evoke what the function does

--

- Names should be verbs, not nouns

--

- Multi-word names should be separated by underscores (`snake_case` as opposed to `camelCase`)

--

- A family of functions should be named similarly (`scrape_title`, `scrape_episode`, `scrape_genre`, etc.)
--

- Avoid overwriting existing (especially widely used) functions

---

.small[
```r
scrape_show_info <- function(x){

  title <- x %>%
    html_node("#title-overview-widget h1") %>%
    html_text() %>%
    str_trim()

  genres <- x %>%
    html_nodes(".see-more.canwrap~ .canwrap a") %>%
    html_text() %>%
    str_trim() %>%
    paste(collapse = ", ")

  runtime <- x %>%
    html_node("time") %>%
    html_text() %>%
    str_replace("\\n", "") %>%
    str_trim()

  episodes <- x %>%
    html_nodes(".np_right_arrow .bp_sub_heading") %>%
    html_text() %>%
    str_replace(" episodes", "") %>%
    as.numeric()

  keywords <- x %>%
    html_nodes(".itemprop") %>%
    html_text() %>%
    str_trim() %>%
    paste(collapse = ", ")

  tibble(title = title, runtime = runtime, genres = genres,
         episodes = episodes, keywords = keywords)

}
```
]

---

.midi[
```r
scrape_show_info(st) %>% glimpse()
```

```
## Observations: 1
## Variables: 5
## $ title    <chr> "Stranger Things"
## $ runtime  <chr> "51min"
## $ genres   <chr> "Drama, Fantasy, Horror, Mystery, Sci-Fi, Thriller"
## $ episodes <dbl> 25
## $ keywords <chr> "government conspiracy, mk ultra, cover up, monster, ...
```

```r
scrape_show_info(twd) %>% glimpse()
```

```
## Observations: 1
## Variables: 5
## $ title    <chr> "The Walking Dead"
## $ runtime  <chr> "44min"
## $ genres   <chr> "Drama, Horror, Sci-Fi, Thriller"
## $ episodes <dbl> 132
## $ keywords <chr> "zombie, survival, post apocalypse, based on comic, z...
```

```r
scrape_show_info(got) %>% glimpse()
```

```
## Observations: 1
## Variables: 5
## $ title    <chr> "Game of Thrones"
## $ runtime  <chr> "57min"
## $ genres   <chr> "Action, Adventure, Drama, Fantasy, Romance"
## $ episodes <dbl> 73
## $ keywords <chr> "dragon, based on novel, bloody violence, twins inces...
```
]

---

.question[
How would you update the following function to use the URL of the page as an argument?
]

.xsmall[
```r
scrape_show_info <- function(x){

  title <- x %>%
    html_node("#title-overview-widget h1") %>%
    html_text() %>%
    str_trim()

  genres <- x %>%
    html_nodes(".see-more.canwrap~ .canwrap a") %>%
    html_text() %>%
    str_trim() %>%
    paste(collapse = ", ")

  runtime <- x %>%
    html_node("time") %>%
    html_text() %>%
    str_replace("\\n", "") %>%
    str_trim()

  episodes <- x %>%
    html_nodes(".np_right_arrow .bp_sub_heading") %>%
    html_text() %>%
    str_replace(" episodes", "") %>%
    as.numeric()

  keywords <- x %>%
    html_nodes(".itemprop") %>%
    html_text() %>%
    str_trim() %>%
    paste(collapse = ", ")

  tibble(title = title, runtime = runtime, genres = genres,
         episodes = episodes, keywords = keywords)

}
```
]

---

.small[
```r
scrape_show_info <- function(x){

* y <- read_html(x)

* title <- y %>%
    html_node("#title-overview-widget h1") %>%
    html_text() %>%
    str_trim()

* genres <- y %>%
    html_nodes(".see-more.canwrap~ .canwrap a") %>%
    html_text() %>%
    str_trim() %>%
    paste(collapse = ", ")

* runtime <- y %>%
    html_node("time") %>%
    html_text() %>%
    str_replace("\\n", "") %>%
    str_trim()

* episodes <- y %>%
    html_nodes(".np_right_arrow .bp_sub_heading") %>%
    html_text() %>%
    str_replace(" episodes", "") %>%
    as.numeric()

* keywords <- y %>%
    html_nodes(".itemprop") %>%
    html_text() %>%
    str_trim() %>%
    paste(collapse = ", ")

  tibble(title = title, runtime = runtime, genres = genres,
         episodes = episodes, keywords = keywords)

}
```
]

---

## Let's check

.small[
```r
st_url <- "http://www.imdb.com/title/tt4574334/"
twd_url <- "http://www.imdb.com/title/tt1520211/"
got_url <- "http://www.imdb.com/title/tt0944947/"
```
]

--

.small[
```r
scrape_show_info(st_url)
```

```
## # A tibble: 1 x 5
##   title     runtime genres               episodes keywords
##   <chr>     <chr>   <chr>                   <dbl> <chr>
## 1 Stranger… 51min   Drama, Fantasy, Hor…       25 government conspiracy, …
```

```r
scrape_show_info(twd_url)
```

```
## # A tibble: 1 x 5
##   title     runtime genres          episodes keywords
##   <chr>     <chr>   <chr>              <dbl> <chr>
## 1 The Walk… 44min   Drama, Horror,…      132 zombie, survival, post apoca…
```

```r
scrape_show_info(got_url)
```

```
## # A tibble: 1 x 5
##   title     runtime genres             episodes keywords
##   <chr>     <chr>   <chr>                 <dbl> <chr>
## 1 Game of … 57min   Action, Adventure…       73 dragon, based on novel, b…
```
]

---

class: center, middle

# Automation

---

.question[
You now have a function that will scrape the relevant info on a show given its URL. Where can we get a list of URLs of the top 100 most popular TV shows on IMDB? Write the code for doing this in your teams.
]

---

```r
urls <- read_html("http://www.imdb.com/chart/tvmeter") %>%
  html_nodes(".titleColumn a") %>%
  html_attr("href") %>%
  paste("http://www.imdb.com", ., sep = "")
```

```
## [1] "http://www.imdb.com/title/tt6763664/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_1"
## [2] "http://www.imdb.com/title/tt1844624/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_2"
## [3] "http://www.imdb.com/title/tt1520211/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_3"
## [4] "http://www.imdb.com/title/tt1043813/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_4"
## [5] "http://www.imdb.com/title/tt3322312/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_5"
## [6] "http://www.imdb.com/title/tt6524350/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_6"
## [7]
"http://www.imdb.com/title/tt1586680/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_7" ## [8] "http://www.imdb.com/title/tt0944947/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_8" ## [9] "http://www.imdb.com/title/tt5420376/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_9" ## [10] "http://www.imdb.com/title/tt3107288/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_10" ## [11] "http://www.imdb.com/title/tt0413573/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_11" ## [12] "http://www.imdb.com/title/tt7134908/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_12" ## [13] "http://www.imdb.com/title/tt0436992/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_13" ## [14] "http://www.imdb.com/title/tt0898266/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_14" ## [15] "http://www.imdb.com/title/tt6599482/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_15" ## [16] 
"http://www.imdb.com/title/tt2193021/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_16" ## [17] "http://www.imdb.com/title/tt6394324/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_17" ## [18] "http://www.imdb.com/title/tt5555260/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_18" ## [19] "http://www.imdb.com/title/tt5715524/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_19" ## [20] "http://www.imdb.com/title/tt5580146/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_20" ## [21] "http://www.imdb.com/title/tt0460681/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_21" ## [22] "http://www.imdb.com/title/tt8421350/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_22" ## [23] "http://www.imdb.com/title/tt6470478/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_23" ## [24] "http://www.imdb.com/title/tt1740299/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_24" ## [25] 
"http://www.imdb.com/title/tt3032476/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_25" ## [26] "http://www.imdb.com/title/tt5071412/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_26" ## [27] "http://www.imdb.com/title/tt7587890/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_27" ## [28] "http://www.imdb.com/title/tt4955642/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_28" ## [29] "http://www.imdb.com/title/tt4016454/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_29" ## [30] "http://www.imdb.com/title/tt7235466/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_30" ## [31] "http://www.imdb.com/title/tt2177461/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_31" ## [32] "http://www.imdb.com/title/tt4574334/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_32" ## [33] "http://www.imdb.com/title/tt8595140/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_33" ## [34] 
"http://www.imdb.com/title/tt2085059/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_34" ## [35] "http://www.imdb.com/title/tt0452046/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_35" ## [36] "http://www.imdb.com/title/tt2306299/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_36" ## [37] "http://www.imdb.com/title/tt0386676/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_37" ## [38] "http://www.imdb.com/title/tt5164196/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_38" ## [39] "http://www.imdb.com/title/tt2891574/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_39" ## [40] "http://www.imdb.com/title/tt0364845/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_40" ## [41] "http://www.imdb.com/title/tt0203259/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_41" ## [42] "http://www.imdb.com/title/tt2442560/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_42" ## [43] 
"http://www.imdb.com/title/tt6483832/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_43" ## [44] "http://www.imdb.com/title/tt4396630/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_44" ## [45] "http://www.imdb.com/title/tt0108778/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_45" ## [46] "http://www.imdb.com/title/tt3749900/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_46" ## [47] "http://www.imdb.com/title/tt1632701/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_47" ## [48] "http://www.imdb.com/title/tt5057054/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_48" ## [49] "http://www.imdb.com/title/tt3205802/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_49" ## [50] "http://www.imdb.com/title/tt7491982/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_50" ## [51] "http://www.imdb.com/title/tt0121955/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_51" ## [52] 
"http://www.imdb.com/title/tt0472954/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_52" ## [53] "http://www.imdb.com/title/tt1442437/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_53" ## [54] "http://www.imdb.com/title/tt2467372/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_54" ## [55] "http://www.imdb.com/title/tt7016936/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_55" ## [56] "http://www.imdb.com/title/tt4474344/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_56" ## [57] "http://www.imdb.com/title/tt0903747/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_57" ## [58] "http://www.imdb.com/title/tt6468322/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_58" ## [59] "http://www.imdb.com/title/tt6473344/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_59" ## [60] "http://www.imdb.com/title/tt7942796/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_60" ## [61] 
"http://www.imdb.com/title/tt7817340/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_61" ## [62] "http://www.imdb.com/title/tt2741602/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_62" ## [63] "http://www.imdb.com/title/tt7440732/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_63" ## [64] "http://www.imdb.com/title/tt7608248/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_64" ## [65] "http://www.imdb.com/title/tt3006802/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_65" ## [66] "http://www.imdb.com/title/tt5834204/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_66" ## [67] "http://www.imdb.com/title/tt5296406/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_67" ## [68] "http://www.imdb.com/title/tt4998350/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_68" ## [69] "http://www.imdb.com/title/tt2805096/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_69" ## [70] 
"http://www.imdb.com/title/tt2372162/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_70" ## [71] "http://www.imdb.com/title/tt1190634/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_71" ## [72] "http://www.imdb.com/title/tt2661044/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_72" ## [73] "http://www.imdb.com/title/tt7569592/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_73" ## [74] "http://www.imdb.com/title/tt1828327/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_74" ## [75] "http://www.imdb.com/title/tt4052886/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_75" ## [76] "http://www.imdb.com/title/tt8773080/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_76" ## [77] "http://www.imdb.com/title/tt6110648/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_77" ## [78] "http://www.imdb.com/title/tt2261391/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_78" ## [79] 
"http://www.imdb.com/title/tt3322310/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_79" ## [80] "http://www.imdb.com/title/tt2402207/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_80" ## [81] "http://www.imdb.com/title/tt8002604/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_81" ## [82] "http://www.imdb.com/title/tt0475784/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_82" ## [83] "http://www.imdb.com/title/tt1378167/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_83" ## [84] "http://www.imdb.com/title/tt6226232/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_84" ## [85] "http://www.imdb.com/title/tt1600194/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_85" ## [86] "http://www.imdb.com/title/tt1843230/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_86" ## [87] "http://www.imdb.com/title/tt4532368/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_87" ## [88] 
"http://www.imdb.com/title/tt3281796/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_88" ## [89] "http://www.imdb.com/title/tt6045840/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_89" ## [90] "http://www.imdb.com/title/tt7493974/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_90" ## [91] "http://www.imdb.com/title/tt2802850/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_91" ## [92] "http://www.imdb.com/title/tt5827228/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_92" ## [93] "http://www.imdb.com/title/tt6048596/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_93" ## [94] "http://www.imdb.com/title/tt5905354/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_94" ## [95] "http://www.imdb.com/title/tt1124373/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_95" ## [96] "http://www.imdb.com/title/tt1595859/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_96" ## [97] 
"http://www.imdb.com/title/tt8619822/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_97"
## [98] "http://www.imdb.com/title/tt2712740/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_98"
## [99] "http://www.imdb.com/title/tt3743822/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_99"
## [100] "http://www.imdb.com/title/tt5189670/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=332cb927-0342-42b3-815c-f9124e84021d&pf_rd_r=MXGQ8JYXCKZETVQF9RGD&pf_rd_s=center-1&pf_rd_t=15506&pf_rd_i=tvmeter&ref_=chttvm_tt_100"
```

---

## Go to each page, scrape show info

Now we need a way to programmatically direct R to each page on the `urls` list and run the `scrape_show_info` function on that page.
.midi[
```r
scrape_show_info(urls[1])
```

```
## # A tibble: 1 x 5
##   title        runtime genres         episodes keywords
##   <chr>        <chr>   <chr>             <dbl> <chr>
## 1 The Hauntin… 50min   Drama, Horror…       10 haunted house, supernatura…
```

```r
scrape_show_info(urls[2])
```

```
## # A tibble: 1 x 5
##   title        runtime genres         episodes keywords
##   <chr>        <chr>   <chr>             <dbl> <chr>
## 1 American Ho… 1h      Drama, Horror…       99 anthology, serial killer, …
```

```r
scrape_show_info(urls[3])
```

```
## # A tibble: 1 x 5
##   title     runtime genres          episodes keywords
##   <chr>     <chr>   <chr>              <dbl> <chr>
## 1 The Walk… 44min   Drama, Horror,…      132 zombie, survival, post apoca…
```
]

---

class: center, middle

.large[
*uh oh, we're repeating our~~selves~~ code again*
]

---

## Automation

- We need a way to programmatically repeat the code

- We have two options for doing this:
    - using a `for` loop
    - `map`ping with functional programming

---

## `for` loops

- `for` loops are the simplest and most common type of loop in R

- Given a vector, iterate through the elements and evaluate the code block for each

<br>

**Goal:** Scrape info from individual pages of TV shows using iteration with `for` loops. To keep things simple while developing the code, narrow down focus to the first `n = 5` shows only.
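
Before pointing a loop at live pages, the iteration pattern itself can be tried offline. The following is only an illustrative sketch, not part of the scraping workflow: the `runtimes` vector reuses the runtime strings shown earlier for the three shows, and `minutes` is a hypothetical name for the pre-allocated result.

```r
# Offline sketch: input values copied from the runtime output shown earlier.
runtimes <- c("51min", "44min", "57min")

# Pre-allocate the result vector (hypothetical name), one slot per element.
minutes <- rep(NA_real_, length(runtimes))

# Visit each position, strip the "min" suffix, and store the number.
for (i in seq_along(runtimes)) {
  minutes[i] <- as.numeric(sub("min", "", runtimes[i]))
}

minutes
```

The shape is the same as in the scraping case: allocate storage first, then fill one slot per iteration.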
---

## `for` loop:
### (1) Set up a tibble to store results

```r
n <- 5
top_n_shows <- tibble(title = rep(NA,n),
                      runtime = rep(NA,n),
                      genres = rep(NA,n),
                      episodes = rep(NA,n),
                      keywords = rep(NA,n)
                      )
top_n_shows
```

```
## # A tibble: 5 x 5
##   title runtime genres episodes keywords
##   <lgl> <lgl>   <lgl>  <lgl>    <lgl>
## 1 NA    NA      NA     NA       NA
## 2 NA    NA      NA     NA       NA
## 3 NA    NA      NA     NA       NA
## 4 NA    NA      NA     NA       NA
## 5 NA    NA      NA     NA       NA
```

---

## `for` loop:
### (2) Iterate through urls to scrape data and save results

```r
for (i in 1:n){
  top_n_shows[i, ] <- scrape_show_info(urls[i])
}
top_n_shows
```

```
## # A tibble: 5 x 5
##   title       runtime genres            episodes keywords
##   <chr>       <chr>   <chr>                <dbl> <chr>
## 1 The Haunti… 50min   Drama, Horror, M…       10 haunted house, supernatu…
## 2 American H… 1h      Drama, Horror, T…       99 anthology, serial killer…
## 3 The Walkin… 44min   Drama, Horror, S…      132 zombie, survival, post a…
## 4 Titans      45min   Action, Adventur…       24 based on comic, friendsh…
## 5 Daredevil   54min   Action, Crime, D…       39 vigilante, lawyer, super…
```

---

## `map`ping

- `map` functions transform their input by applying a function to each element and returning an object the same length as the input

--

- There are various map functions (e.g. `map_lgl()`, `map_chr()`, `map_dbl()`, `map_df()`), each of which returns a different type of object (logical, character, double, and data frame, respectively)

--

- `map` the `scrape_show_info` function to each element of `urls`

--

- This will hit the `urls` one after another, and grab the info

**Goal:** Scrape info from individual pages of TV shows using functional programming with mapping. To keep things simple while developing the code, narrow down focus to the first `n = 5` shows only.
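
The difference between the typed `map` variants can be seen on a small offline vector before mapping over live pages. This is a minimal sketch, assuming purrr is installed (it is loaded with the tidyverse); `to_minutes` is a hypothetical helper, and the input strings are the runtimes shown earlier.

```r
library(purrr)

# Offline input: runtime strings copied from the output shown earlier.
runtimes <- c("51min", "44min", "57min")

# Hypothetical helper: strip the "min" suffix and convert to a number.
to_minutes <- function(x) as.numeric(sub("min", "", x))

# map_dbl() returns a double vector with one element per input.
minutes <- map_dbl(runtimes, to_minutes)

# map_lgl() returns a logical vector with one element per input.
is_long <- map_lgl(minutes, function(m) m > 50)

minutes
is_long
```

Each variant fails loudly if the function returns something of the wrong type, which is one reason to prefer it over a generic loop when the output type is known.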
---

## `map`ping:

```r
top_n_shows <- map_df(urls[1:n], scrape_show_info)
top_n_shows
```

```
## # A tibble: 5 x 5
##   title       runtime genres            episodes keywords
##   <chr>       <chr>   <chr>                <dbl> <chr>
## 1 The Haunti… 50min   Drama, Horror, M…       10 haunted house, supernatu…
## 2 American H… 1h      Drama, Horror, T…       99 anthology, serial killer…
## 3 The Walkin… 44min   Drama, Horror, S…      132 zombie, survival, post a…
## 4 Titans      45min   Action, Adventur…       24 based on comic, friendsh…
## 5 Daredevil   54min   Action, Crime, D…       39 vigilante, lawyer, super…
```

---

## Slow your roll

- If you get `HTTP Error 429 (Too many requests)` you might want to slow down your hits

- You can add a `Sys.sleep()` call to slow down your function:

```r
scrape_show_info <- function(x){

* Sys.sleep(runif(1))

  ...

}
```
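
One way to add the pause without editing the scraper itself is to wrap it. This is a minimal sketch in base R, assuming a random pause of up to `max_pause` seconds is acceptable; `with_delay`, `max_pause`, and `slow_square` are hypothetical names, and a toy function stands in for `scrape_show_info` so the example runs offline.

```r
# Hypothetical helper: returns a version of f that sleeps before each call.
with_delay <- function(f, max_pause = 1) {
  function(...) {
    # Pause for a random duration between 0 and max_pause seconds.
    Sys.sleep(runif(1, min = 0, max = max_pause))
    f(...)
  }
}

# Toy stand-in for the scraper so the sketch runs without network access.
slow_square <- with_delay(function(x) x^2, max_pause = 0.1)
slow_square(3)
```

Under the same assumptions, `with_delay(scrape_show_info)` could be passed to `map_df()` in place of the original function.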