class: center, middle, inverse, title-slide

# Advanced Visualization Techniques
## Statistical Computing & Programming
### Shawn Santo

---

## Supplementary materials

Full video lecture available in Zoom Cloud Recordings

Additional resources

- [Extend ggplot2](https://ggplot2.tidyverse.org/articles/extending-ggplot2.html) by creating your own stat, geom, and theme
- [Network visualization with `ggraph`](https://ggraph.data-imaginist.com)
- [Plotly ggplot2 library](https://plotly.com/ggplot2/)
- [Template themes with `ggthemes`](https://github.com/jrnold/ggthemes)

---

## Packages

For these slides we will use the following packages.

.normal[
```r
library(tidyverse)
library(gapminder)   # some data
library(ggpol)       # parliament plots and more
library(patchwork)   # combining plots
library(gganimate)   # animations
library(ggiraph)     # interactive plots
```
]

Install any CRAN packages you do not have with `install.packages("package_name")`.

<br><br><br><br>

**Code not shown for plots is available in the presentation notes. Press `P`.**

---
class: inverse, center, middle

# Annotate plots

---

## Annotation

`annotate()` adds geoms to your plot space. Because the corresponding geom is not mapped to variables of a data frame, it is very convenient for adding text or highlighting a point. You can use a single `annotate()` call or many.

```r
ggplot() +
  annotate(geom = "text", x = 10, y = 10,
           label = "Text at (10, 10)", size = 10) +
  annotate(geom = "point", x = 20, y = -20,
           color = "red", size = 4) +
  annotate(geom = "segment", x = 0, xend = 10, y = 0, yend = 0,
           size = 3, color = "blue") +
  annotate(geom = "curve", x = 10, xend = 15, y = -10, yend = -5,
           size = 3, color = "green") +
  annotate(geom = "curve", x = 10, xend = 15, y = -5, yend = -10,
           size = 3, color = "orange")
```

Use the `...` argument of `annotate()` to pass geom-specific arguments and values.

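A minimal sketch of passing geom-specific arguments through `...`. The coordinates and labels are made up for illustration; `fontface` and `hjust` belong to the text geom, and `curvature` belongs to the curve geom.

```r
library(ggplot2)

# Hypothetical example: positions and labels are illustrative only.
p <- ggplot() +
  # `fontface` and `hjust` are text-specific, so they travel through `...`
  annotate(geom = "text", x = 0, y = 1, label = "Bold, left-aligned",
           fontface = "bold", hjust = 0) +
  # `curvature` is curve-specific and controls how strongly the line bends
  annotate(geom = "curve", x = 0, xend = 1, y = 0, yend = 1,
           curvature = -0.3, color = "purple")

p
```

Each `annotate()` call adds one layer, so `p` holds two layers and no data frame.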
---

## Annotation

<img src="lec_09_files/figure-html/unnamed-chunk-4-1.png" style="display: block; margin: auto;" />

---

## Annotate to enhance

Depending on your data, annotations can be a great way to enhance the reader's visual understanding.

```r
url <- str_c("http://www2.stat.duke.edu/~sms185/",
             "data/sports/events.csv")
events <- read_csv(url)
```

Let's only look at the shots that occurred during this match.

```r
shots <- events %>%
  filter(event_name == "Shot") %>%
  select(team_id, start_x, start_y)
```

---

## Annotate to enhance

.tiny[
```r
ggplot(shots, mapping = aes(x = start_x, y = start_y)) +
  geom_point(mapping = aes(color = factor(team_id)), size = 3) +
  labs(color = "Team ID") +
  theme(legend.position = "bottom")
```

<img src="lec_09_files/figure-html/unnamed-chunk-7-1.png" style="display: block; margin: auto;" />
]

--

Not very interesting or informative!

--

With a few annotations, we can enhance this a lot.

---

## Annotate to enhance

.tiny[
```r
ggplot(shots, mapping = aes(x = start_x, y = start_y)) +
  geom_point(mapping = aes(color = factor(team_id)), size = 3) +
  labs(color = "Team ID") +
  theme_void() +
  theme(legend.position = "bottom")
```

<img src="lec_09_files/figure-html/unnamed-chunk-8-1.png" style="display: block; margin: auto;" />
]

--

Setting `theme_void()` doesn't help much either.

---

## Annotate to enhance

Create a function called `fc_annotate_pitch()` that has all the pitch field markings.

.tiny.pull-left[
```r
fcn_list <- list(
  boundary        = "annotate",
  center_circle   = "annotate",
  ⋮
  lower_right_arc = "annotate",
  upper_right_arc = "annotate"
)
```
]

.tiny.pull-right[
```r
arg_list <- list(
  boundary = list(
    geom  = "rect",
    xmin  = x_min,
    xmax  = x_max,
    ymin  = y_min,
    ymax  = y_max,
    color = palette_color,
    fill  = palette_fill,
    size  = 1.5
  ),
  ⋮
  upper_right_arc = list(
    geom  = "curve",
    x     = ifelse(coord_flip, x_max, x_max - 2 * x_adj),
    xend  = ifelse(coord_flip, x_max - 2 * x_adj, x_max),
    y     = ifelse(coord_flip, y_max - 2 * y_adj, y_max),
    yend  = ifelse(coord_flip, y_max, y_max - 2 * y_adj),
    color = palette_color
  )
)
```
]

Combine the functions and their parameters with `invoke_map(fcn_list, arg_list)`.

---

## Annotate to enhance

.tiny[
```r
ggplot(shots, mapping = aes(x = start_x, y = start_y)) +
  fc_annotate_pitch(dimensions = c(100, 100)) +
  geom_point(mapping = aes(color = factor(team_id)), size = 3) +
  labs(color = "Team ID") +
  theme_void() +
  theme(legend.position = "bottom")
```

<img src="lec_09_files/figure-html/unnamed-chunk-12-1.png" style="display: block; margin: auto;" />
]

--

Much more informative!

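The step that combines `fcn_list` and `arg_list` can be sketched with `map2()` and `do.call()`, which may be useful because `invoke_map()` is deprecated as of purrr 1.0.0. This is a cut-down illustration, not the full `fc_annotate_pitch()`: the two markings, the name `center_spot`, and all coordinates and colors are assumptions.

```r
library(ggplot2)
library(purrr)

# Hypothetical two-element versions of the lists; the real function
# defines many more pitch markings.
fcn_list <- list(boundary = "annotate", center_spot = "annotate")

arg_list <- list(
  boundary = list(geom = "rect", xmin = 0, xmax = 100, ymin = 0, ymax = 100,
                  color = "grey40", fill = NA),
  center_spot = list(geom = "point", x = 50, y = 50, color = "grey40")
)

# Call each function with its matching argument list.
# The result is a list of ggplot2 layers.
pitch_layers <- map2(fcn_list, arg_list, ~ do.call(.x, .y))

# A list of layers can be added to a plot with `+`.
p <- ggplot() + pitch_layers + theme_void()
p
```

Returning a list of layers is what lets a helper like `fc_annotate_pitch()` be added to a plot with a single `+`.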
Full function: [fc_annotate_pitch()](http://www2.stat.duke.edu/~sms185/R/fc_annotate_pitch.R)

---
class: inverse, center, middle

# Organizing plots: package `patchwork`

---
class: inverse, center, middle

# Aside: parliament plots with `ggpol`

---

## Data: Congressional seats

```r
url <- str_c("http://www2.stat.duke.edu/~sms185/",
             "data/politics/congress_long.csv")
congress <- read_csv(url)
congress
```

```
#> # A tibble: 432 x 5
#>    year_start year_end party  branch seats
#>         <dbl>    <dbl> <chr>  <chr>  <dbl>
#>  1       1913     1915 dem    house    290
#>  2       1913     1915 dem    senate    51
#>  3       1913     1915 gop    house    127
#>  4       1913     1915 gop    senate    44
#>  5       1913     1915 other  house     18
#>  6       1913     1915 other  senate     1
#>  7       1913     1915 vacant house     NA
#>  8       1913     1915 vacant senate    NA
#>  9       1915     1917 dem    house    231
#> 10       1915     1917 dem    senate    56
#> # … with 422 more rows
```

---

## Parliament plot

.tiny[
```r
ggplot(data = congress[congress$year_start == 1913 & congress$branch == "house", ]) +
* geom_parliament(aes(seats = seats, fill = factor(party)), show.legend = TRUE, color = "black") +
  scale_fill_manual(values = c("#3A89CB", "#D65454", "#BF6FF0", "Grey"),
                    labels = c("Dem", "GOP", "Other", "Vacant")) +
  labs(fill = "Party") +
  coord_fixed() +
  theme_void(base_size = 20)
```

<img src="lec_09_files/figure-html/unnamed-chunk-14-1.png" style="display: block; margin: auto;" />
]

---

## Package `ggpol`

- Package `ggpol` supports a few other `geom` functions:
  - `geom_arcbar()`,
  - `geom_bartext()`,
  - `geom_circle()`,
  - `geom_tshighlight()`,
  - `geom_boxjitter()`.
- See https://github.com/erocoar/ggpol

---

## My function: `plot_congress()`

.tiny[
```r
plot_congress <- function(data, year, leg_branch, legend = TRUE, text_size = 8) {
  data %>%
    filter(year_start == year, branch == leg_branch) %>%
    ggplot() +
    geom_parliament(aes(seats = seats, fill = factor(party)),
                    show.legend = legend, color = "black") +
    scale_fill_manual(values = c("#3A89CB", "#D65454", "#BF6FF0", "Grey"),
                      labels = c("Dem", "GOP", "Other", "Vacant")) +
    annotate("text", x = 0, y = .5, label = paste(year, leg_branch),
             size = text_size) +
    labs(fill = "Party") +
    coord_fixed() +
    theme_void(base_size = 20)
}
```
]

Use package `patchwork` to organize multiple plots in a single window. No need to facet.

```r
my_plot <- ggplot()
class(my_plot)
```

```
#> [1] "gg"     "ggplot"
```

---

## Plot creation

```r
ph_1993 <- plot_congress(congress, 1993, "house")
ph_2001 <- plot_congress(congress, 2001, "house", legend = FALSE)
ph_2009 <- plot_congress(congress, 2009, "house", legend = FALSE)
ph_2017 <- plot_congress(congress, 2017, "house", legend = FALSE)
```

<br/>

Object `ph_1993` has a legend, the rest do not.

---

## Horizontal patchwork

```r
ph_1993 + ph_2017
```

<img src="lec_09_files/figure-html/unnamed-chunk-18-1.png" style="display: block; margin: auto;" />

---

## Vertical patchwork

```r
ph_1993 + ph_2017 + plot_layout(ncol = 1)
```

<img src="lec_09_files/figure-html/unnamed-chunk-19-1.png" style="display: block; margin: auto;" />

---

## Group patchwork

```r
ph_1993 + (ph_2001 + ph_2009) + ph_2017 + plot_layout(ncol = 1, widths = 1)
```

<img src="lec_09_files/figure-html/unnamed-chunk-20-1.png" style="display: block; margin: auto;" />

---

```r
(ph_1993 | ph_2001) / (ph_2009 | ph_2017)
```

<img src="lec_09_files/figure-html/unnamed-chunk-21-1.png" style="display: block; margin: auto;" />

---

```r
(ps_1993 | ps_2001 | ps_2009) / ps_2017 + plot_layout(widths = 1)
```

<img src="lec_09_files/figure-html/unnamed-chunk-23-1.png" style="display: block; margin: auto;" />

???

.tiny[
```r
ps_1993 <- plot_congress(congress, 1993, "senate", legend = FALSE, text_size = 6)
ps_2001 <- plot_congress(congress, 2001, "senate", legend = FALSE, text_size = 6)
ps_2009 <- plot_congress(congress, 2009, "senate", legend = TRUE, text_size = 6)
ps_2017 <- plot_congress(congress, 2017, "senate", legend = FALSE, text_size = 6)
```
]

---

## Package `patchwork`

- Supports operators `+`, `-`, `|` (beside), `/` (over)

- Specify layouts with `plot_layout()` and spacing with `plot_spacer()`

- Add grouping with `{ }` or `( )`

- Use `&` or `*` to add elements to all subplots; `*` only affects the current nesting level

- See https://github.com/thomasp85/patchwork

---
class: inverse, center, middle

# Exercise

---

## Flint water data

Create a set of visualizations based on tibble object `flint`. Use patchwork to combine these in a single plot window.

```r
url <- str_c("http://www2.stat.duke.edu/~sms185/",
             "data/health/flint.csv")
flint <- read_csv(url)
flint
```

```
#> # A tibble: 271 x 6
#>       id   zip  ward  draw1  draw2 draw3
#>    <dbl> <dbl> <dbl>  <dbl>  <dbl> <dbl>
#>  1     1 48504     6  0.344  0.226 0.145
#>  2     2 48507     9  8.13  10.8   2.76
#>  3     4 48504     1  1.11   0.11  0.123
#>  4     5 48507     8  8.01   7.45  3.38
#>  5     6 48505     3  1.95   0.048 0.035
#>  6     7 48507     9  7.2    1.4   0.2
#>  7     8 48507     9 40.6    9.73  6.13
#>  8     9 48503     5  1.1    2.5   0.1
#>  9    12 48507     9 10.6    1.04  1.29
#> 10    13 48505     3  6.2    4.2   2.3
#> # … with 261 more rows
```

---
class: inverse, center, middle

# Animation: `gganimate`

---

## Data: `gapminder`

```r
library(gapminder)
gapminder
```

```
#> # A tibble: 1,704 x 6
#>    country     continent  year lifeExp      pop gdpPercap
#>    <fct>       <fct>     <int>   <dbl>    <int>     <dbl>
#>  1 Afghanistan Asia       1952    28.8  8425333      779.
#>  2 Afghanistan Asia       1957    30.3  9240934      821.
#>  3 Afghanistan Asia       1962    32.0 10267083      853.
#>  4 Afghanistan Asia       1967    34.0 11537966      836.
#>  5 Afghanistan Asia       1972    36.1 13079460      740.
#>  6 Afghanistan Asia       1977    38.4 14880372      786.
#>  7 Afghanistan Asia       1982    39.9 12881816      978.
#>  8 Afghanistan Asia       1987    40.8 13867957      852.
#>  9 Afghanistan Asia       1992    41.7 16317921      649.
#> 10 Afghanistan Asia       1997    41.8 22227415      635.
#> # … with 1,694 more rows
```

---

## Nothing new

.tiny[
```r
ggplot(gapminder, aes(x = gdpPercap, y = lifeExp,
                      size = pop, colour = country)) +
  geom_point(alpha = 0.7, show.legend = FALSE) +
  scale_colour_manual(values = country_colors) +
  scale_size(range = c(2, 12)) +
  scale_x_log10() +
  facet_wrap(~continent) +
  theme_bw(base_size = 16)
```

<img src="lec_09_files/figure-html/unnamed-chunk-27-1.png" style="display: block; margin: auto;" />
]

---

## Animate with `gganimate`

<img src="lec_09_files/figure-html/unnamed-chunk-28-1.gif" style="display: block; margin: auto;" />

---

## What did we add?

Base plot

.tiny[
```r
ggplot(gapminder, aes(x = gdpPercap, y = lifeExp,
                      size = pop, colour = country)) +
  geom_point(alpha = 0.7, show.legend = FALSE) +
  scale_colour_manual(values = country_colors) +
  scale_size(range = c(2, 12)) +
  scale_x_log10() +
  facet_wrap(~continent) +
  theme_bw(base_size = 16)
```
]

--

Transform to animation

.tiny[
```r
ggplot(gapminder, aes(x = gdpPercap, y = lifeExp,
                      size = pop, colour = country)) +
  geom_point(alpha = 0.7, show.legend = FALSE) +
  scale_colour_manual(values = country_colors) +
  scale_size(range = c(2, 12)) +
  scale_x_log10() +
  facet_wrap(~continent) +
  theme_bw(base_size = 16) +
* labs(title = 'Year: {frame_time}', x = 'GDP per capita', y = 'Life expectancy') +
* transition_time(year) +
* ease_aes('linear')
```
]

---

## Another example

First, reshape the data.

```r
flint_long <- flint %>%
  pivot_longer(cols = draw1:draw3,
               names_to = "draw", values_to = "pb_level")
```

--

```r
p <- flint_long %>%
  filter(zip == 48507, pb_level < 75) %>%
  mutate(flush_time = case_when(
    draw == "draw1" ~ 0,
    draw == "draw2" ~ 45,
    draw == "draw3" ~ 120
  )) %>%
  ggplot(mapping = aes(x = flush_time, y = pb_level, group = id)) +
  geom_line() +
  geom_point(aes(group = seq_along(flush_time)), size = 3) +
  geom_point(color = "blue", size = 3) +
  scale_x_continuous(breaks = c(0, 45, 120),
                     labels = c("Draw 1", "Draw 2", "Draw 3")) +
  geom_line(color = "grey90") +
  labs(y = "Lead level (ppb)", x = "") +
  theme_bw(base_size = 16)
```

---
class: center, middle

<img src="lec_09_files/figure-html/unnamed-chunk-33-1.png" style="display: block; margin: auto;" />

---

```r
p <- p + transition_reveal(flush_time)
animate(p, end_pause = 30)
```

<img src="lec_09_files/figure-html/unnamed-chunk-34-1.gif" style="display: block; margin: auto;" />

---

## Package `gganimate`

- Core functions
  - `transition_*()` defines how the data should be spread out and how it relates to itself across time.
  - `view_*()` defines how the positional scales should change along the animation.
  - `shadow_*()` defines how data from other points in time should be presented in the given point in time.
  - `enter_*()` / `exit_*()` define how new data should appear and how old data should disappear during the course of the animation.
  - `ease_aes()` defines how different aesthetics should be eased during transitions.
- Label variables
  - These depend on the transition function; use `{ }` to access their values.
- See https://gganimate.com

---
class: inverse, center, middle

# Interactive plots: `ggiraph`

---

## Data: NC births and SID

```r
nc <- read_csv("http://www2.stat.duke.edu/~sms185/data/health/nc_birth_sid.csv")
nc
```

```
#> # A tibble: 100 x 4
#>    NAME         AREA BIR74 SID74
#>    <chr>       <dbl> <dbl> <dbl>
#>  1 Ashe        0.114  1091     1
#>  2 Alleghany   0.061   487     0
#>  3 Surry       0.143  3188     5
#>  4 Currituck   0.07    508     1
#>  5 Northampton 0.153  1421     9
#>  6 Hertford    0.097  1452     7
#>  7 Camden      0.062   286     0
#>  8 Gates       0.091   420     0
#>  9 Warren      0.118   968     4
#> 10 Stokes      0.124  1612     1
#> # … with 90 more rows
```

---

## Standard scatter plot

.tiny[
<img src="lec_09_files/figure-html/unnamed-chunk-36-1.png" style="display: block; margin: auto;" />
]

---

## Make it interactive

- **using a `tooltip`**

```r
gg_name <- ggplot(nc, mapping = aes(x = AREA, y = BIR74)) +
* geom_point_interactive(aes(tooltip = NAME), size = 4, alpha = .5) +
  theme_minimal()

girafe(ggobj = gg_name)
```

--

- **using hover functionality**

```r
gg_hover <- ggplot(nc, mapping = aes(x = AREA, y = BIR74)) +
* geom_point_interactive(aes(data_id = NAME, tooltip = NAME), size = 4, alpha = .5) +
  theme_minimal()

girafe(ggobj = gg_hover)
```

---

- **using on-click functionality**

```r
nc$wiki <- paste0('window.open(\"',
                  "https://www.ncpedia.org/geography/",
                  tolower(nc$NAME), '\")')

gg_name <- ggplot(nc, mapping = aes(x = AREA, y = BIR74)) +
* geom_point_interactive(aes(tooltip = NAME, onclick = wiki), size = 4, alpha = .5) +
  theme_minimal()

girafe(ggobj = gg_name)
```

---

## Package `ggiraph`

- Add tooltips, animations, and JavaScript actions to ggplot graphics

- In general, instead of `geom_<plot_type>()` use `geom_<plot_type>_interactive()`

- Interactivity is added to ggplot geometries, legends and theme elements, via the following aesthetics:
  - tooltip: tooltips to be displayed when mouse is over elements
  - onclick: JavaScript function to be executed when elements are clicked
  - data_id: id to be associated with elements (used for hover and
    click actions)

- Function `girafe()` translates the graphic into an interactive web-based graphic

- See https://github.com/davidgohel/ggiraph

---

## References

1. A Grammar of Animated Graphics. (2021). https://gganimate.com/
2. Create GIFs with gifski in knitr Documents - Yihui Xie | 谢益辉. (2021). https://yihui.org/en/2018/08/gifski-knitr/
3. davidgohel/ggiraph. (2021). https://github.com/davidgohel/ggiraph
4. erocoar/ggpol. (2021). https://github.com/erocoar/ggpol
5. Extending ggplot2. (2021). https://ggplot2.tidyverse.org/articles/extending-ggplot2.html
6. thomasp85/patchwork. (2021). https://github.com/thomasp85/patchwork
7. Top 50 ggplot2 Visualizations - The Master List (With Full R Code). (2020). http://r-statistics.co/Top50-Ggplot2-Visualizations-MasterList-R-Code.html