To receive credit, you must submit to Gradescope a .pdf file knitted from a Quarto Markdown file. Be sure to “associate” questions appropriately on Gradescope. As a reminder, late homework is not accepted after the 24-hour grace period.

The Quarto template for this assignment may be found in the repository at the following link: https://classroom.github.com/a/vaPwqU8b

These data contain pocket measurements for 80 pairs of jeans from popular US brands, as described in the Pudding article available here. Please read the article before starting this assignment (it’s short and pretty interesting!). For a description of the variables, check out the data dictionary here.

Important: Part of your grade on this assignment will be based on meaningful commit messages. For this assignment, you must commit and push your changes after Exercise 2 and again after Exercise 4 (of course, you’re welcome to commit and push more often than that!). Also, don’t forget to change your name in the Quarto template.

  1. In this exercise we will create three new variables. First, create a binary variable that indicates whether the style of a pair of jeans is skinny/slim or boot cut/regular/straight. Next, create a new variable corresponding to the "maximum rectangle" of the front pocket. This variable is defined as the max height of the front pocket multiplied by the max width of the front pocket. Finally, create a similar variable corresponding to the "maximum rectangle" of the back pocket. This variable is defined as the max height of the back pocket multiplied by the max width of the back pocket. Give all three variables meaningful names.
  2. Create a visualization that summarizes the relationship between the "maximum rectangle" of the front pocket and the "maximum rectangle" of the back pocket. In this visualization, color code the observations by whether the pair of jeans is marketed toward men or women, and facet your plot by the binary style variable you created. The facets should be side by side (i.e., one row, two panels). Make sure your plot has strong labels throughout, including axes, legends, and facet titles (i.e., don’t use the defaults). Give the plot a meaningful title and subtitle that convey interesting data insights; do not simply describe which variables you are plotting (e.g., a title along the lines of "x vs. y vs. z").
  3. Given the variables created in Exercise 1 and the visualization constructed in Exercise 2, what can you say about pockets in jeans marketed to men vs. to women? What about differences by style? Do such differences themselves appear to vary between male-coded and female-coded jeans? Does your visualization support the storyline from the Pudding article?
  4. Create a visualization that simply plots the maximum width of the back pocket against the maximum height of the back pocket (no need to separate by men’s vs. women’s-marketed jeans, but do label and title the plot meaningfully). How many points appear to be plotted? How many observations are there in the dataset? With these two things in mind, what are the potential dangers of displaying this plot? Suggest a strategy that might mitigate these issues.
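
To make the "maximum rectangle" definition from Exercise 1 and the point-counting question from Exercise 4 concrete, here is a minimal sketch in Python. It is only an illustration, not a solution template: the rows are invented toy values rather than real data, and every column name and style label is hypothetical, so check the data dictionary for the actual ones and adapt the idea to whatever language your Quarto document uses.

```python
from collections import Counter

# Toy rows invented for illustration; these are NOT values from the real dataset.
# All column names and style labels are hypothetical (see the data dictionary).
toy_jeans = [
    {"style": "skinny", "max_height_front": 14.5, "max_width_front": 16.5,
     "max_height_back": 15.0, "max_width_back": 13.5},
    {"style": "straight", "max_height_front": 22.0, "max_width_front": 16.0,
     "max_height_back": 15.0, "max_width_back": 13.5},
    {"style": "slim", "max_height_front": 13.0, "max_width_front": 15.0,
     "max_height_back": 14.0, "max_width_back": 12.5},
    {"style": "boot cut", "max_height_front": 23.5, "max_width_front": 17.0,
     "max_height_back": 16.0, "max_width_back": 14.0},
]

# Styles treated as the "skinny/slim" group; everything else falls in the other group.
SLIM_STYLES = {"skinny", "slim"}


def add_derived_variables(row):
    """Return a copy of the row with the three derived variables added."""
    new_row = dict(row)
    new_row["is_slim_fit"] = row["style"] in SLIM_STYLES
    # "Maximum rectangle" = max height * max width, per pocket.
    new_row["front_max_rect"] = row["max_height_front"] * row["max_width_front"]
    new_row["back_max_rect"] = row["max_height_back"] * row["max_width_back"]
    return new_row


def count_overlaps(rows, x_key, y_key):
    """Return (number of rows, number of distinct (x, y) positions)."""
    positions = Counter((row[x_key], row[y_key]) for row in rows)
    return len(rows), len(positions)


jeans = [add_derived_variables(row) for row in toy_jeans]
n_rows, n_distinct = count_overlaps(jeans, "max_width_back", "max_height_back")
print(f"{n_rows} rows, {n_distinct} distinct (width, height) positions")
```

In the toy rows, two pairs of jeans share the same back pocket width and height, so the number of distinct positions is smaller than the number of rows. A gap like that in the real data means some observations would be drawn on top of one another in a plain scatterplot.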