1. L1 minimization:

    1. Show that \[ \min_\theta \{ (y-\theta)^2 + \lambda |\theta| \} = \min_{u,v} \{ (y-uv)^2 + \lambda (u^2 + v^2 )/2 \}, \] where all items are scalars and \(|\theta|\) is the absolute value of \(\theta\), and conclude that if \((\hat u, \hat v)\) minimizes the right-hand side, then \(\hat\theta = \hat u \hat v\) minimizes the left-hand side.
    2. Show that \[ \min_\beta \{ ||y-X \beta||_2^2 + \lambda ||\beta||_1 \} = \min_{u,v} \{ ||y-X(u\circ v)||_2^2 + \lambda (u^\top u + v^\top v )/2 \}, \] where \(u\circ v\) is the elementwise (Hadamard) product of the vectors \(u\) and \(v\), and conclude that if \((\hat u, \hat v)\) minimizes the right-hand side, then \(\hat\beta = \hat u \circ \hat v\) minimizes the left-hand side. Using this result, can you think of a simple iterative algorithm to compute the Lasso (\(L_1\)-penalized) regression estimate?
  2. Conditional effects: Derive the results of Lemma 2 of the testing notes for the simple case that \(p=q=1\).

  3. Projecting out: The \(F\)-statistic for testing \(H:\beta=0\) in the linear model \(y\sim N(W\alpha+X\beta ,\sigma^2I)\) may be computed either by first projecting out \(W\) or by comparing the extra sum of squares \(RSS_H-RSS\) to \(RSS\). Show that these two approaches yield the same \(F\)-statistic (you may use the identities in Lemma 2 of the notes).

  4. Exercise: The file “hw6data.txt” contains a data matrix from an experiment on the effects of exercise on oxygen uptake, performed on 12 study participants. The first column gives the change in oxygen uptake over a one-week period, the second column gives each participant’s age, and the third column gives the average number of minutes per day spent performing a specific exercise during the week.

    1. In R or otherwise, fit a linear regression model \(y_i=\mu+\beta_{age}age_i+\epsilon_i\) and obtain a \(p\)-value for testing \(H:\beta_{age}=0\). Describe the evidence for an age effect.
    2. Fit a linear regression model \(y_i=\mu+\beta_{exmin}exmin_i+\epsilon_i\) and obtain a \(p\)-value for testing \(H:\beta_{exmin}=0\). Describe the evidence for an effect of the number of minutes exercised.
    3. Fit a linear regression model with both predictors, and report the \(p\)-values from the output for each of the regression coefficients (excluding the intercept). That is, for each coefficient, report the \(p\)-value for testing that the coefficient is zero while the other coefficient might not be zero.
    4. Fit a model with no predictors (except the intercept) and, using an appropriate \(F\)-test, compare it to the model in part 3 to evaluate the evidence for an effect of either of these predictors.
    5. Describe how the relationships among the three variables (y, age and exmin) explain the different modeling and testing output from parts 1-4.
  5. TBA
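
Before attempting the proof in the first part of Problem 1, it can help to check the scalar identity numerically. The following is only a sanity check under stated assumptions, not a derivation: the values `y = 2.0` and `lam = 1.0`, the grid ranges, and the function names are all arbitrary illustrative choices. Both sides are minimized by brute-force grid search, so the two minima should agree only up to grid resolution.

```python
# Numerical sanity check (not a proof) of the scalar identity
#   min_theta (y - theta)^2 + lam*|theta|
#     = min_{u,v} (y - u*v)^2 + lam*(u^2 + v^2)/2.
# All names, grid ranges, and test values are illustrative assumptions.

def make_grid(half_width, step):
    """Evenly spaced points on [-half_width, half_width]."""
    n = int(round(2 * half_width / step)) + 1
    return [-half_width + i * step for i in range(n)]

def lhs_grid_min(y, lam, half_width=5.0, step=0.001):
    """Grid minimum of the absolute-value-penalized objective over theta."""
    best_val, best_theta = float("inf"), 0.0
    for theta in make_grid(half_width, step):
        val = (y - theta) ** 2 + lam * abs(theta)
        if val < best_val:
            best_val, best_theta = val, theta
    return best_val, best_theta

def rhs_grid_min(y, lam, half_width=3.0, step=0.01):
    """Grid minimum of the product-form objective over (u, v)."""
    grid = make_grid(half_width, step)
    best_val, best_u, best_v = float("inf"), 0.0, 0.0
    for u in grid:
        for v in grid:
            val = (y - u * v) ** 2 + lam * (u * u + v * v) / 2.0
            if val < best_val:
                best_val, best_u, best_v = val, u, v
    return best_val, best_u, best_v

y, lam = 2.0, 1.0
lhs, theta_hat = lhs_grid_min(y, lam)
rhs, u_hat, v_hat = rhs_grid_min(y, lam)

print("left-hand minimum :", lhs, "at theta =", theta_hat)
print("right-hand minimum:", rhs, "at u*v   =", u_hat * v_hat)
```

The two printed minima should be close, and the product `u_hat * v_hat` should be close to `theta_hat`. Trying other values of `y` and `lam`, including ones where the minimizing `theta` is exactly zero, is a useful way to build intuition before writing the argument.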