Visualize the relationship between life expectancy and GDP per capita across countries in 2007. Also create a visualization
In the following exercises we’ll use the dplyr (for data wrangling) and ggplot2 (for visualization) packages.
Make sure the packages are installed.
Load these packages in your markdown file:
library(dplyr)
library(ggplot2)
gapminder <- read.csv("https://stat.duke.edu/~mc301/data/gapminder.csv")
Start with the gapminder dataset
Filter for cases (rows) where year is equal to 2007
Save this new subsetted dataset as gap07
gap07 <- gapminder %>%
  filter(year == 2007)
Task: Visualize the relationship between gdpPercap and lifeExp.
qplot(x = gdpPercap, y = lifeExp, data = gap07)
Task: Color the points by continent.
qplot(x = gdpPercap, y = lifeExp, color = continent, data = gap07)
Stage
Commit (with a message)
Push
The %>% operator in dplyr functions is called the pipe operator. This means you “pipe” the output of the previous line of code as the first input of the next line of code.
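To make the pipe concrete, here is a minimal sketch that runs the same filter twice, once as an ordinary function call and once with %>%. The data frame `toy` and its values are invented for illustration; they are not gapminder figures.

```r
library(dplyr)

# Toy stand-in for gapminder; values are made up for illustration only.
toy <- data.frame(
  country = c("A", "A", "B", "B"),
  year    = c(1952, 2007, 1952, 2007),
  lifeExp = c(50, 70, 60, 80)
)

# Without the pipe: the data frame is written as the first argument of filter().
nested <- filter(toy, year == 2007)

# With the pipe: toy is passed along as the first argument of filter().
piped <- toy %>%
  filter(year == 2007)

# Compare the two results; both forms select the same rows.
identical(nested, piped)
```

The piped form reads top to bottom in the order the steps happen, which is why it is convenient once several wrangling steps are chained.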
The + operator in ggplot2 functions is used for “layering”. This means you create the plot in layers, separated by +.
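The layering idea is easiest to see with ggplot() itself rather than qplot(), which has been deprecated in recent ggplot2 releases. The sketch below builds a scatterplot one layer at a time. The data frame `toy07` and its values are hypothetical placeholders, not real gapminder data.

```r
library(ggplot2)

# Toy stand-in for gap07; values are made up for illustration only.
toy07 <- data.frame(
  continent = c("X", "X", "Y", "Y"),
  gdpPercap = c(1000, 5000, 20000, 40000),
  lifeExp   = c(55, 65, 75, 80)
)

# Start with the data and the aesthetic mappings; nothing is drawn yet.
p <- ggplot(toy07, aes(x = gdpPercap, y = lifeExp, color = continent))

# Add a layer of points, then add axis labels, each joined with +.
p <- p +
  geom_point() +
  labs(x = "GDP per capita", y = "Life expectancy")

# Render the finished plot.
print(p)
```

Each + adds one component to the plot object, so a layer can be added, swapped, or dropped without touching the rest.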
What if you now wanted to change your analysis to
subset for 1952
plot life expectancy (lifeExp) vs. population (pop)
size the points by GDP per capita (gdpPercap)
Hint: add size = gdpPercap to your plotting code.
Once you’re done, commit and push all your changes with a meaningful message.