You must turn in a knitted file to Gradescope from a Quarto Markdown file in order to receive credit. Be sure to “associate” questions appropriately on Gradescope. As a reminder, late work is not accepted outside of the 24-hour grace period for homework assignments.

The Quarto template for this assignment may be found in the repository at the following link: https://classroom.github.com/a/8t_6Z4C9

In this week's homework, you will be asked to interpret published literature regarding interaction models, as well as revisit HW 1 with a more sophisticated analysis. Essentially, we are reviewing class material from Feb. 22 (I hope you took good notes!).

Important: Please continue to make regular commits and follow good coding practices (e.g., do not let code run off the page). Also, suppress warnings and messages in your R code chunks.
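One way to suppress warnings and messages for every chunk at once is through the `execute` options in the YAML header of the Quarto file. This is a minimal sketch, and it assumes the provided template does not already set these options:

```yaml
execute:
  warning: false   # hide warnings from all code chunks
  message: false   # hide messages (e.g., package startup notes)
```

The same options can instead be set per chunk by placing `#| warning: false` and `#| message: false` at the top of an individual R chunk.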

  1. Consider Example 1 on the slides from Feb. 22. Quantify the relationship between PRS and FEV1/FVC according to the linear model the authors fit. Be careful in your interpretation; use only what is available on slide 4. PRS and FEV1/FVC are both unitless.
  2. Consider Example 2 on the slides from Feb. 22. Quantify the relationship between availability of fruit and daily travel distance. You may assume daily travel distance is measured in km, observation time is measured in hours, and the other variables are unitless. Assume the authors fit a linear model. Use only what is available on slide 7.
  3. In Example 2 from the slides, interpret the estimate corresponding to "Frugivory" in the table (-0.041) in the context of the model.
  4. Consider Example 3 on the slides from Feb. 22, using only the title of the paper on slide 8 and the figure from slide 9, which shows 95% confidence intervals of various estimates. Suppose there are no variables in the dataset that are not depicted in the figure. Explain how the relationship between income and the predicted prevalence of current smoking compares before and after Proposition 56. Tie your explanation to what you might expect from a formal hypothesis test at the 0.05 significance level for a specific slope term in an interaction model.
  5. We will return to the jeans dataset from Homework 1, which is available in the repository provided. Consider only the "maximum rectangle" (max height times max width) of the front pocket, whether the jeans are skinny jeans (an indicator variable corresponding to whether the style of the jean is skinny), and the gender to which the jeans are marketed. Is there evidence that the relationship between maximum rectangle area and the style of the jeans (skinny vs. non-skinny) depends on the marketed gender? Explain, referencing a relevant hypothesis test at the 0.05 level.
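
As a reminder of the general form referenced in questions 4 and 5, a two-predictor interaction model can be sketched as follows. The symbols here are generic placeholders rather than notation taken from the slides: $Y$ is the response, $X$ and $Z$ are the two predictors, and $\beta_3$ is the slope on their product.

$$
E[Y \mid X, Z] = \beta_0 + \beta_1 X + \beta_2 Z + \beta_3 (X \times Z)
$$

Under this form, a one-unit increase in $X$ is associated with a change of $\beta_1 + \beta_3 Z$ in the mean of $Y$, so the relationship between $X$ and $Y$ depends on $Z$ only when $\beta_3 \neq 0$. The corresponding test is

$$
H_0: \beta_3 = 0 \quad \text{versus} \quad H_A: \beta_3 \neq 0 .
$$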