Must there always be a linear relationship between some predictor \(x_k\) and the outcome \(y\) in a linear regression model? If yes, provide a proof; if no, provide a counterexample of such a model and clearly demonstrate a non-linear relationship between the two.
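No. As a counterexample, the model \(y = \beta_0 + \beta_1 x_1 + \beta_2 x_1^2 + \varepsilon\) is linear in the \(\beta\) terms, so it is a linear model, but it is quadratic in \(x_1\): the partial derivative of \(y\) with respect to \(x_1\) is \(\beta_1 + 2\beta_2 x_1\), which depends on \(x_1\) and is therefore not a constant.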
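The counterexample can also be checked numerically. The sketch below is illustrative only: the data are simulated, and the coefficient values, noise level, and the helper `slope_at` are assumptions introduced here, not part of the exercise. It fits a model that is linear in the coefficients but quadratic in \(x_1\), then evaluates the estimated slope \(\hat\beta_1 + 2\hat\beta_2 x_1\) at several values of \(x_1\).

```r
# Simulated data (assumed values, for illustration only)
set.seed(1)
x1 <- seq(-3, 3, length.out = 200)
y  <- 1 + 2 * x1 + 0.5 * x1^2 + rnorm(length(x1), sd = 0.3)

# Linear in the coefficients, quadratic in x1
fit <- lm(y ~ x1 + I(x1^2))
b   <- coef(fit)

# Estimated slope dy/dx1 = b1 + 2 * b2 * x1, which changes with x1
slope_at <- function(x) unname(b[2] + 2 * b[3] * x)
slopes   <- slope_at(c(-2, 0, 2))
print(slopes)
```

If the relationship between \(x_1\) and \(y\) were linear, the three slopes would be equal; here they should increase with \(x_1\).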
Exercise 2
[n.b. Palmer Penguins dataset]. Create a linear model in R that predicts the body mass of a penguin based on its flipper length, bill length, and which species it is. Provide your design matrix, and clearly label what each column corresponds to. Display the estimated regression coefficients from the lm function, and recover these same estimates from the dataset directly using matrix operations on the underlying data.
library(palmerpenguins)
library(dplyr)

# Some data manipulation: choosing relevant variables, creating
# dummy variables, and vector of 1s for intercept
dat <- penguins %>%
  select(body_mass_g, flipper_length_mm, bill_length_mm, species) %>%
  na.omit() %>%
  mutate(
    chinstrap = ifelse(species == "Chinstrap", 1, 0),
    gentoo = ifelse(species == "Gentoo", 1, 0),
    intercept = 1
  ) %>%
  select(-species)

# Creating vector of response
y <- dat %>%
  select(body_mass_g)

# Creating and displaying first few rows of design matrix. First
# column is vector of 1s for the intercept, next two are the
# data for flipper and bill length, and last two are dummy variables
# corresponding to which species the penguin is (1 is yes, 0 if no).
x <- dat %>%
  select(intercept, flipper_length_mm, bill_length_mm, chinstrap, gentoo)
head(x)
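As a cross-check on the matrix step the exercise asks for, the sketch below solves the normal equations \((X^\top X)\hat\beta = X^\top y\) and compares the result with `lm`. It is a minimal, self-contained variant rather than the solution's own construction: it assumes `model.matrix()` may stand in for the hand-built design matrix (with Adelie as the reference species), and the names `d`, `X`, `Y`, and `beta_hat` are introduced here for illustration.

```r
library(palmerpenguins)

# Keep complete cases for the variables used in the model
vars <- c("body_mass_g", "flipper_length_mm", "bill_length_mm", "species")
d <- na.omit(as.data.frame(penguins[, vars]))

# Reference fit with lm()
fit <- lm(body_mass_g ~ flipper_length_mm + bill_length_mm + species, data = d)

# Design matrix: intercept, two numeric predictors, two species dummies
X <- model.matrix(fit)
Y <- d$body_mass_g

# Solve the normal equations (X'X) b = X'y
beta_hat <- solve(crossprod(X), crossprod(X, Y))

# Side-by-side comparison of the two sets of estimates
cbind(lm = coef(fit), matrix = drop(beta_hat))
```

Using `solve(A, b)` avoids forming the inverse of \(X^\top X\) explicitly; the two columns should agree up to numerical precision.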