To receive credit, you must submit to Gradescope a knitted file generated
from a Quarto Markdown file. Be sure to “associate” each question
with its pages on Gradescope. As a reminder, late work
is not accepted outside of the 24-hour grace period for homework
assignments.
The Quarto template for this assignment may be found in the
repository at the following link: https://classroom.github.com/a/H5_1w2WT
We will be using data from Exam 1 for this homework. The data are
found in the homework repository.
Please continue to make regular commits
and follow good coding practices (e.g., do not let code run off the
page). Also, suppress warnings and messages in your R code
chunks.
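
As a minimal sketch of suppressing warnings and messages, the contents of a single Quarto R chunk might look like the following. The `#|` lines are Quarto chunk options; the assignment to `x` is only a hypothetical line that would otherwise print a warning in the knitted file.

```r
#| warning: false
#| message: false

# Hypothetical example: this coercion normally emits a warning,
# which the chunk options above keep out of the knitted output.
x <- as.numeric("not a number")
```

The same two options can instead be set once for the whole document under `execute:` in the YAML header, which avoids repeating them in every chunk.
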
- Fit a linear model with PM10 (particulate matter 10 micrometers or less in diameter, in micrograms per
cubic meter) as the response variable and with temperature (in degrees Celsius), wind direction,
and month as the predictor variables. Do not forget to treat month as
a categorical variable. Display a well-labeled residual plot and evaluate
the linearity and constant variance assumptions (no need to evaluate normality or independence for this question).
- Fit the same linear model as in Exercise 1, but use a log (base 10)
transformation of PM10. Display a well-labeled residual plot and evaluate the
linearity and constant variance assumptions (no need to evaluate normality or independence for this question).
- Interpret the slope term corresponding to temperature in your model from
Exercise 2 by comparing predicted PM10 levels. Next, interpret the term corresponding to northerly wind direction in your model from Exercise 2 by comparing predicted PM10 levels. Your interpretations should not involve "log10(PM10)," but rather be directly in terms of predicted PM10.
- Suppose you had instead fit a model with the log (base 10) transformation of PM10 as the outcome and with the log (base 2) transformation of temperature, along with wind direction and month, as predictors. What would the slope parameter corresponding to log2(temperature) represent in terms of predicted PM10 levels in this model? Your answer should involve neither "log10(PM10)" nor "log2(temperature)," but rather be directly in terms of predicted PM10 and the temperature itself. Do not actually fit this model; answer in the abstract, without numerical estimates.
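
The mechanics of treating a predictor as categorical and drawing a labeled residual plot can be sketched as follows. This is an illustration only, under stated assumptions: it uses R's built-in `mtcars` data rather than the homework data, and the variable choices (`mpg`, `wt`, `cyl`), the helper name `plot_residuals`, and the axis labels are placeholders that do not come from the assignment.

```r
# Illustration only: built-in mtcars data, not the homework data.
# factor() makes lm() treat cyl as categorical rather than numeric.
fit <- lm(mpg ~ wt + factor(cyl), data = mtcars)

# The same predictors with a log (base 10) transformed response.
fit_log <- lm(log10(mpg) ~ wt + factor(cyl), data = mtcars)

# Hypothetical helper: residuals vs. fitted values with labeled axes.
plot_residuals <- function(model, response_label) {
  plot(
    fitted(model), resid(model),
    xlab = paste("Fitted values:", response_label),
    ylab = "Residuals",
    main = paste("Residuals vs. fitted values for", response_label)
  )
  abline(h = 0, lty = 2)  # dashed reference line at zero
}

plot_residuals(fit, "mpg")
plot_residuals(fit_log, "log10(mpg)")
```

In a plot like this, curvature in the point cloud would suggest a problem with linearity, and a fan or funnel shape would suggest non-constant variance.
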