HW 04 - Gotta catch ’em all

Individual assignment

Due: Sep 18 at 10:05am

Photo by David Grandmougin on Unsplash.

A key part of Pokémon Go is using evolutions to get stronger Pokémon, and a deeper understanding of evolutions is essential to becoming the greatest Pokémon Go player of all time. The data set you will be working with for this assignment covers 75 Pokémon evolutions spread across four species. A wide set of variables is provided, allowing a deeper dive into which characteristics are important in predicting a Pokémon’s final combat power (CP).

Getting started

Go to your email and accept the repo for this homework assignment. The name of the repo is hw-04-pokemon-GITHUBNAME, where GITHUBNAME is your GitHub username. This repo contains a template R Markdown file that you can build on to complete your assignment.

Clone your repo to RStudio Cloud, and then configure your name and email address for Git. To do so, run the following in the Terminal, replacing the quoted text with your own email address and name:

git config --global user.email "your email"
git config --global user.name "your name"
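
To check that both settings were saved, you can read them back in the Terminal. The sketch below is one way to do that. It only reads your configuration, and show_setting is a hypothetical helper name introduced here for illustration.

```shell
# Print the value of a global Git setting, or a note if it is unset.
# show_setting is an illustrative helper, not part of Git itself.
show_setting() {
  value=$(git config --global --get "$1" 2>/dev/null) || value="(not set)"
  printf '%s = %s\n' "$1" "$value"
}

show_setting user.name
show_setting user.email
```

If either line shows "(not set)", rerun the matching git config command above.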

Before you move further, also update your project name from Untitled Project to HW 04 - Pokemon.

In the R Markdown file in your project, update your name, knit the document, commit, and push.

At this point you might be getting tired of having to type your username and password every time you push to GitHub. If you would like your Git password cached for a week for this project, type the following in the Terminal:

60 seconds per minute * 60 minutes per hour * 24 hours per day * 7 days per week = 604800 seconds

git config --global credential.helper 'cache --timeout 604800'
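
The timeout value can be double-checked with shell arithmetic. The sketch below only prints the number and the resulting command, so it does not change any settings. The variable name week_seconds is chosen here for illustration.

```shell
# Seconds in one week: 60 s/min * 60 min/hour * 24 hours/day * 7 days/week.
week_seconds=$((60 * 60 * 24 * 7))
echo "$week_seconds"   # prints 604800

# Print the matching command instead of running it, so nothing is modified.
echo "git config --global credential.helper 'cache --timeout $week_seconds'"
```

To cache your password for a different length of time, change the factors in the arithmetic, for example replacing 7 with 1 for a single day.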

Analysis

Data

The data set for this assignment is available as a CSV file here. The variable descriptions are as follows:

Packages

In this assignment we will work with the tidyverse and scales packages. These packages have already been installed for you and are loaded in your R Markdown file as well.

If you’d like to run your code in the Console as well, you’ll also need to load the packages there. To do so, run the following in the Console.

library(tidyverse)
library(scales)

Exercises

  1. Calculate the difference in heights pre- and post-evolution and save this as a new variable, height_diff. Calculate the percentage of Pokémon that grew during evolution. Also visualize the distribution of change in height by species and discuss how change in height varies across species.

This is a good place to pause, commit changes with an informative commit message, and push. Make sure to commit and push all changed files.

  2. Pick two categorical variables and make a bar plot that depicts the relationship between them. These can be variables from the original data or ones that you create based on the given data.

This is another good place to pause, commit changes with an informative commit message, and push. Make sure to commit and push all changed files.

  3. Pick a numerical and a categorical variable, and construct side-by-side box plots depicting the relationship between them.

Don’t forget to commit and push your changes regularly while working on your homework assignment.

  4. Learn something new: violin plots! Read about them at http://ggplot2.tidyverse.org/reference/geom_violin.html, and convert your side-by-side box plots from the previous exercise to violin plots. What do the violin plots reveal that box plots do not? What features are apparent in the box plots but not in the violin plots?

  5. Recreate the following plot, and interpret what you see in the context of the data.

  6. Recreate the following plot. Note that height_diff refers to the difference in height between pre- and post-evolution, and weight_diff is calculated similarly.

Hint: The colors are from the viridis color palette. Take a look at the ggplot2 functions named scale_*_viridis_*, for example scale_color_viridis_d() or scale_fill_viridis_c().

  7. Rework the previous plot using the principles of effective data visualization we have learned about in class. Then, describe the relationship between the changes in heights and weights of Pokémon and their species.

  8. Describe what the following code is doing in each line, and interpret the numbers in the output.

Hint: We use the :: operator to explicitly indicate which package a function comes from: PKG::FUN refers to the function FUN in the package PKG.

## # A tibble: 9 x 4
## # Groups:   species [4]
##   species  attack_weak_type_new     n prop 
##   <chr>    <chr>                <int> <chr>
## 1 Caterpie Bug                      3 30.0%
## 2 Caterpie Normal                   7 70.0%
## 3 Eevee    Electric                 1 16.7%
## 4 Eevee    Fire                     3 50.0%
## 5 Eevee    Water                    2 33.3%
## 6 Pidgey   Flying                  15 38.5%
## 7 Pidgey   Steel                   24 61.5%
## 8 Weedle   Bug                     10 50.0%
## 9 Weedle   Poison                  10 50.0%

  9. What characteristics correspond to an evolved Pokémon with a high combat power? You do not need to come up with an exhaustive list, but you should walk us through your reasoning for answering this question and include all relevant summary statistics and visualizations.

Getting help

Use the #questions channel on Slack.

You are also welcome to discuss the homework with each other broadly (no sharing code!) as well as ask questions at office hours.