A key part of Pokémon Go is using evolutions to get stronger Pokémon, and a deeper understanding of evolutions is key to being the greatest Pokémon Go player of all time. The data set you will be working with for this assignment covers 75 Pokémon evolutions spread across four species. A wide set of variables are provided, allowing a deeper dive into what characteristics are important in predicting a Pokémon’s final combat power (CP).
Here are the steps for getting started:
and voila, you’re done! Once you push your changes back you do not need to do anything else to “submit” your work. And you can of course push multiple times throughout the assignment. At the time of the deadline we will take whatever is in your repo and consider it your final submission, and grade the state of your work at that time (which means even if you made mistakes before then, you wouldn’t be penalized for them as long as the final state of your work is correct).
Go to the codebook and find out which variabkes are included in this dataset, and what they represent.
For each exercise, write down an interpretation of your visualization (and what it reveals about the data) in 1-2 sentences max.
Calculate the diference in heights pre and post evolution and save this as a new variable. Calculate the percentage of pokemon that grew during evolution. Also visualize the distribution of change in height by species and provide a discussion of how change in height varies across species.
Recreate the following plot.
Calculate the relative frequency of attack_strong_new
becased on levels of attack_strong
. That is, the relative frequency of post-evolution stronger attack based on the pre-evolution stronger attack. Do Pokemon tend to change their strong attack post-evolution?
Pick two categorical variables and make a bar plot that depicts the relationship between them. These can be variables from the original data or ones that you create based on the given data.
Pick a numerical and a categorical variable, and construct side-by-side box plots depicting the relationship between them.
Learn something new: violin plots! Read about them at http://ggplot2.tidyverse.org/reference/geom_violin.html, and convert your side-by-side box plots from the previous task to violin plots. What, do the violin plots reveal that box plots do not? What features are apparent in the box plots but not in the violin plots?
What characteristics correspond to an evolved Pokémon with a high combat power? You do not need to come up with an exhaustive list, but you should walk us through your reasoning for answering this question and include all relevant summary statistics and visualizations.
When you’re done, review the .md document on GitHub to make sure you’re happy with the final state of your work. Then go get some rest!
Use the #questions channels on Slack to ask questions or posting on the GitHub Communiy. If your question is about an error you’re getting, make sure to clearly explain what generated the error as well as what the error says.
You are also welcomed to discuss the homework with each other broadly (no sharing code!) as well as ask questions at office hours.
Something new on this homework, and going forward on class assignments. Each time you push to GitHub a continuous integration tool (called wercker) will check to make sure your Rmd knits successfully. If it does, you’ll be able to see a badge on the README of your repo that says “build passing” (in green). If it fails, it’ll say “build failing” (in orange). If the build is failing you will need to fix your Rmd so that it knits without any errors and then push again. Some tips:
This is an individual assignment. You are welcomed to exchange ideas with classmates and ask questions on the getting help channels discussed above however you may not share your text or code answers directly with classmates.
The Duke Community Standard applies and course academic integrity policies apply. Please review them here. Specifically, the note on sharing / reusing code.
Total | 100 pts |
---|---|
Questions 1-6 | 60 pts |
Question 7 | 20 pts |
Informatively named code chunks | 5 pts |
Commit after each question (at a minimum, more commits ok) | 2 pts |
Informative commit messages | 3 pts |
Document organization | 4 pts |
Quality of writing and grammar | 6 pts |