To receive credit, you must submit to Gradescope a file knitted from a
Quarto Markdown file. Be sure to “associate”
questions appropriately on Gradescope. As a reminder, late work
is not accepted outside of the 24-hour grace period for homework
assignments.
The Quarto template for this assignment may be found in the
repository at the following link: https://classroom.github.com/a/8t_6Z4C9
In this week's homework, you will be asked to interpret published
literature regarding interaction models and to revisit HW 1 with a
more sophisticated analysis. Essentially, we are reviewing class
material from Feb. 22 (I hope you took good notes!).
Important: Please continue to make regular commits
and follow good coding practices (e.g., do not let code run off the
page). Also, suppress warnings and messages in your R code
chunks.
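One way to suppress warnings and messages is with chunk-level options, sketched below. The two `#|` lines are Quarto execution options placed at the top of an R chunk; the code beneath them is a made-up placeholder chosen only because it would normally print a warning, and it is not part of the assignment.

```r
#| warning: false
#| message: false

# Placeholder code (hypothetical, for illustration only):
# converting non-numeric text to a number yields NA and normally
# emits a coercion warning, which the options above hide in the
# rendered document.
placeholder_value <- as.numeric("not a number")
```

The same two options can instead be set once for the whole document under `execute:` in the YAML header, if you prefer not to repeat them in every chunk.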
- Consider Example 1 on the slides from Feb. 22. Quantify the relationship
between PRS and FEV1/FVC according to the linear model the authors fit. Be
careful in your interpretation; use only what is available on slide 4.
PRS and FEV1/FVC are both unitless.
- Consider Example 2 on the slides from Feb. 22. Quantify the relationship
between availability of fruit and daily travel distance. You may assume
daily travel distance is measured in km, observation time is in hours, and
the other variables are unitless. Assume the authors fit a linear model.
Use only what is available on slide 7.
- In Example 2 from the slides, interpret the estimate corresponding
to "Frugivory" in the table (-0.041) in the context of the model.
- Consider Example 3 on the slides from Feb. 22, using only the title
of the paper on slide 8 and the figure from slide 9, which shows 95%
confidence intervals of various estimates. Suppose there are no variables
in the dataset that are not depicted in the figure. Explain the
relationship between income and pre/post-Proposition 56 on the predicted
prevalence of current smoking. Tie your explanation to what you might
expect from a formal hypothesis test at the 0.05 significance level for a
specific slope term in an interaction model.
- We will return to the jeans dataset from Homework 1, which is available
in the repository provided. Consider only the "maximum rectangle" (max
height times max width) of the front pocket, whether the jeans are skinny
jeans (an indicator variable corresponding to whether the style of the jean
is skinny), and the gender to which the jeans are marketed. Is there
evidence that the relationship between maximum rectangle area and the style
of the jeans (skinny vs. non-skinny) depends on the marketed gender?
Explain, referencing a relevant hypothesis test at the 0.05 level.
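As a reminder of the mechanics for the jeans question, here is a minimal sketch of fitting a model with an interaction between two indicator variables and pulling out the test for the interaction slope. Everything in it is an assumption made for illustration: the data are simulated rather than read from the jeans file, and the names `jeans_sim`, `max_rect`, `skinny`, and `menswear` are hypothetical, so your actual column names and results will differ.

```r
# Simulated stand-in data; this is NOT the real jeans dataset.
set.seed(1)
n <- 80
jeans_sim <- data.frame(
  skinny   = rep(c(0, 1), times = n / 2),  # hypothetical style indicator
  menswear = rep(c(0, 1), each = n / 2)    # hypothetical gender indicator
)
# Made-up response so that the model can be fit.
jeans_sim$max_rect <- 300 + 50 * jeans_sim$menswear + rnorm(n, sd = 20)

# `skinny * menswear` expands to both main effects plus their interaction.
fit <- lm(max_rect ~ skinny * menswear, data = jeans_sim)

# The row named "skinny:menswear" holds the t-test of H0: interaction slope = 0.
coef_table <- summary(fit)$coefficients
interaction_p <- coef_table["skinny:menswear", "Pr(>|t|)"]
interaction_p
```

Comparing `interaction_p` to 0.05 is one way to carry out the test; how you state the hypotheses and interpret the result in context is still up to you.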