\(L_1\) minimization:
- Show that \[
\min_\theta \{ (y-\theta)^2 + \lambda |\theta| \} =
\min_{u,v} \{ (y-uv)^2 + \lambda (u^2 + v^2 )/2 \},
\] where all quantities are scalars and \(|\theta|\) is the absolute value of \(\theta\), and conclude that if \((\hat u, \hat v)\) minimizes the right-hand side, then \(\hat\theta = \hat u \hat v\) minimizes the left-hand side.
- Show that \[
\min_\beta \{ ||y-X \beta||_2^2 + \lambda ||\beta||_1 \} =
\min_{u,v} \{ ||y-X(u\circ v)||_2^2 + \lambda (u^\top u + v^\top v )/2 \},
\] where \(u\circ v\) is the elementwise (Hadamard) product of the vectors \(u\) and \(v\), and conclude that if \((\hat u, \hat v)\) minimizes the right-hand side, then \(\hat\beta = \hat u \circ \hat v\) minimizes the left-hand side. Using this result, can you think of a simple iterative algorithm to compute the Lasso (\(L_1\)-penalized) regression estimate?
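The scalar identity in the first part can be sanity-checked numerically before attempting a proof. The sketch below is only an illustration, not a derivation: it assumes \(\lambda > 0\), and the function names (`soft_threshold`, `alternating_uv`, and the two objective helpers) are hypothetical. It alternates closed-form updates of \(u\) and \(v\) on the right-hand side and compares the product \(uv\) with the minimizer of the left-hand side, which is the soft-thresholding of \(y\) at \(\lambda/2\).

```python
def soft_threshold(y, lam):
    """Minimizer of (y - theta)^2 + lam * |theta| over theta."""
    t = abs(y) - lam / 2.0
    if t <= 0:
        return 0.0
    return t if y > 0 else -t


def alternating_uv(y, lam, u=1.0, v=1.0, n_iter=200):
    """Alternately minimize (y - u*v)^2 + lam * (u^2 + v^2) / 2 in u and v.

    With the other variable held fixed, each update is a one-dimensional
    ridge-type problem with a closed-form solution. Assumes lam > 0.
    """
    c = lam / 2.0
    for _ in range(n_iter):
        u = v * y / (v * v + c)  # minimizes over u with v fixed
        v = u * y / (u * u + c)  # minimizes over v with u fixed
    return u, v


def lhs_objective(theta, y, lam):
    """Left-hand side objective: squared error plus absolute-value penalty."""
    return (y - theta) ** 2 + lam * abs(theta)


def rhs_objective(u, v, y, lam):
    """Right-hand side objective: squared error plus ridge penalties on u and v."""
    return (y - u * v) ** 2 + lam * (u * u + v * v) / 2.0


if __name__ == "__main__":
    lam = 2.0
    for y in (3.0, -2.5, 0.3):  # arbitrary illustrative values
        u, v = alternating_uv(y, lam)
        theta = soft_threshold(y, lam)
        print(y, u * v, theta,
              rhs_objective(u, v, y, lam), lhs_objective(theta, y, lam))
```

The starting values `u=1.0, v=1.0` are arbitrary but must be nonzero, since the updates cannot leave \(u = v = 0\). Agreement of the printed values is only numerical evidence for the identity and does not replace the requested argument.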
Conditional effects: Derive the results of Lemma 2 of the testing notes for the simple case that \(p=q=1\).
Projecting out: The \(F\)-statistic for testing \(H:\beta=0\) in the linear model \(y\sim N(W\alpha+X\beta ,\sigma^2I)\) may be computed either by first projecting out \(W\) or by comparing the extra sum of squares \(RSS_H-RSS\) to \(RSS\). Show that these two approaches yield the same \(F\)-statistic (you may use the identities in Lemma 2 of the notes).
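A numerical check can make the claim concrete before proving it. The sketch below makes several assumptions that are not taken from the notes. It reads "projecting out \(W\)" as replacing \(y\) and \(X\) by their residuals from a least-squares regression on \(W\). It assumes \(W\) is \(n\times p\), \(X\) is \(n\times q\), and \([W\ X]\) has full column rank. The function names are hypothetical and the data are simulated purely for illustration.

```python
import numpy as np


def rss(y, A):
    """Residual sum of squares from least-squares regression of y on the columns of A."""
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid)


def f_extra_ss(y, W, X):
    """F-statistic comparing RSS_H (W only) with RSS (W and X together)."""
    n, p = W.shape
    q = X.shape[1]
    rss_h = rss(y, W)
    rss_full = rss(y, np.hstack([W, X]))
    return ((rss_h - rss_full) / q) / (rss_full / (n - p - q))


def f_project_out(y, W, X):
    """F-statistic after replacing y and X by their residuals from regression on W."""
    n, p = W.shape
    q = X.shape[1]
    # Residual-maker matrix for W: R = I - W W^+, the projection onto the
    # orthogonal complement of the column space of W.
    R = np.eye(n) - W @ np.linalg.pinv(W)
    y_r, X_r = R @ y, R @ X
    rss_r = rss(y_r, X_r)
    fitted_ss = float(y_r @ y_r) - rss_r
    return (fitted_ss / q) / (rss_r / (n - p - q))


if __name__ == "__main__":
    rng = np.random.default_rng(0)  # simulated data, for illustration only
    n, p, q = 30, 2, 3
    W = np.column_stack([np.ones(n), rng.normal(size=n)])
    X = rng.normal(size=(n, q))
    y = W @ np.array([1.0, -0.5]) + rng.normal(size=n)
    print(f_extra_ss(y, W, X), f_project_out(y, W, X))
```

The two printed values should agree up to floating-point error. Note that the denominator degrees of freedom are \(n-p-q\) in both versions, which is one of the points the written argument needs to justify.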
Exercise: The file “hw6data.txt” contains a data matrix from an experiment on the effects of exercise on oxygen uptake, performed on 12 study participants. The first column gives the change in oxygen uptake over a one-week period, the second column gives each participant’s age, and the third column gives the average number of minutes per day spent performing a specific exercise during the course of the week.
- In R or otherwise, fit a linear regression model \(y_i=\mu+\beta_{age}age_i+\epsilon_i\) and obtain a \(p\)-value for testing \(H:\beta_{age}=0\). Describe the evidence for an age effect.
- Fit a linear regression model \(y_i=\mu+\beta_{exmin}exmin_i+\epsilon_i\) and obtain a \(p\)-value for testing \(H:\beta_{exmin}=0\). Describe the evidence for an effect of the number of minutes exercised.
- Fit a linear regression model with both predictors, and report the \(p\)-values from the output for each of the regression coefficients (excluding the intercept). That is, for each coefficient, report the \(p\)-value for testing that the coefficient is zero while the other coefficient might not be zero.
- Fit a model with no predictors (except the intercept) and, using an appropriate \(F\)-test, compare it to the model with both predictors from part c to evaluate the evidence for an effect of either predictor.
- Describe how the relationships among the three variables (y, age and exmin) explain the different modeling and testing output from parts a-d.
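For readers taking the "or otherwise" route, the fits in parts a-d could be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a reference solution. The helper names (`ols_summary`, `nested_f_test`, `fit_all`) are hypothetical. The commented-out loading line assumes the file is whitespace-delimited with no header row, which the problem statement does not confirm. SciPy is used only for the \(t\) and \(F\) tail probabilities.

```python
import numpy as np
from scipy import stats


def ols_summary(y, X):
    """Least-squares fit of y on X (X must already contain an intercept column).

    Returns coefficients, standard errors, t-statistics, two-sided p-values,
    the residual sum of squares and the residual degrees of freedom.
    """
    n, k = X.shape
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    rss = float(resid @ resid)
    df = n - k
    se = np.sqrt(rss / df * np.diag(np.linalg.inv(X.T @ X)))
    t = coef / se
    p = 2 * stats.t.sf(np.abs(t), df)
    return {"coef": coef, "se": se, "t": t, "p": p, "rss": rss, "df": df}


def nested_f_test(y, X_small, X_big):
    """F-statistic and p-value comparing a submodel with a larger model."""
    small, big = ols_summary(y, X_small), ols_summary(y, X_big)
    q = small["df"] - big["df"]  # number of extra coefficients
    F = ((small["rss"] - big["rss"]) / q) / (big["rss"] / big["df"])
    return F, stats.f.sf(F, q, big["df"])


def fit_all(y, age, exmin):
    """Collect the p-values asked for in parts a-d."""
    one = np.ones_like(y, dtype=float)
    X_age = np.column_stack([one, age])
    X_ex = np.column_stack([one, exmin])
    X_both = np.column_stack([one, age, exmin])
    X_null = one[:, None]
    return {
        "a": ols_summary(y, X_age)["p"][1],      # p-value for beta_age alone
        "b": ols_summary(y, X_ex)["p"][1],       # p-value for beta_exmin alone
        "c": ols_summary(y, X_both)["p"][1:],    # both p-values, joint model
        "d": nested_f_test(y, X_null, X_both),   # (F, p) for the overall test
    }


# Assumed file layout (not confirmed by the problem statement):
# data = np.loadtxt("hw6data.txt")
# results = fit_all(data[:, 0], data[:, 1], data[:, 2])
```

One built-in consistency check: when the larger model adds a single predictor, the \(F\)-statistic from `nested_f_test` equals the square of that predictor's \(t\)-statistic, so the two \(p\)-values coincide.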
TBA