1. Simple Gauss-Markov: Let \(E[y] = X \beta\) and \(V[y] = \sigma^2 I\) for some known \(X\in \mathbb R^{n\times p}\) of full column rank, unknown \(\beta\in \mathbb R^{p}\) and unknown \(\sigma^2>0\).

    1. Let \(\ell\in \mathbb R^{n}\) and \(m \in C(X)\) be vectors such that \(E[\ell^\top y] = E[m^\top y]\) for all \(\beta\in \mathbb R^p\). How does the space in which \(u=\ell- m\) lives relate to \(C(X)\)? Specifically, what can you say about \(d^\top u\) for \(d \in C(X)\)?
    2. Derive an expression for the variance of \(\ell^\top y\) in terms of the variance of \(m^\top y\), and determine conditions under which the latter variance is smaller.
    3. Let \(\hat{\beta} = (X^\top X)^{-1} X^\top y\). Using the results from part (b), show that for known \(c\in \mathbb R^p\), \(c^\top \hat{\beta}\) is the best linear unbiased estimator (BLUE) of \(c^\top \beta\). (A numerical sketch of the comparison in parts (b) and (c) appears after this exercise.)
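
    The sketch below is not part of the exercise; it is a numerical illustration of the comparison in parts (b) and (c), using arbitrary made-up values for \(X\), \(c\) and \(\sigma^2\). It takes \(m^\top y = c^\top\hat\beta\) (so \(m \in C(X)\)), builds a second linear unbiased estimator \(\ell^\top y\) of \(c^\top\beta\), and checks the resulting variance decomposition.

    ```python
    # Numerical illustration only (made-up X, c, sigma^2); not part of the exercise.
    import numpy as np

    rng = np.random.default_rng(0)
    n, p, sigma2 = 20, 3, 2.0
    X = rng.normal(size=(n, p))          # full column rank with probability 1
    c = rng.normal(size=p)

    # m^T y is the estimator from part (c): m = X (X^T X)^{-1} c lies in C(X).
    m = X @ np.linalg.solve(X.T @ X, c)

    # l^T y: another linear unbiased estimator of c^T beta, obtained by adding to m
    # a component orthogonal to C(X), so that X^T l = X^T m = c still holds.
    z = rng.normal(size=n)
    u = z - X @ np.linalg.lstsq(X, z, rcond=None)[0]   # u is orthogonal to C(X)
    l = m + u
    assert np.allclose(X.T @ l, c)       # unbiasedness: E[l^T y] = c^T beta for all beta

    # Under V[y] = sigma^2 I:  Var(l^T y) = sigma^2 ||l||^2 and
    # Var(m^T y) = sigma^2 ||m||^2; since m is orthogonal to l - m,
    #   Var(l^T y) = Var(m^T y) + sigma^2 ||l - m||^2  >=  Var(m^T y).
    var_l = sigma2 * (l @ l)
    var_m = sigma2 * (m @ m)
    print(var_l, var_m, var_m + sigma2 * ((l - m) @ (l - m)))
    assert np.isclose(var_l, var_m + sigma2 * ((l - m) @ (l - m)))
    ```
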
2. Linear estimators of linear estimands: Consider a linear model \(E[y]=X\beta\), \(\beta\in \mathbb R^p\), with \(V[y]=\sigma^2 I\), \(\sigma^2>0\), where \(X\in \mathbb R^{n\times p}\) again has full column rank.

    1. Let \(W\in \mathbb R^{q\times p}\) be a known matrix and let \(Cy\) be a linear unbiased estimator of \(W\beta\). Show that \(Cy= W \check \beta\) for some linear unbiased estimator \(\check \beta\) of \(\beta\).
    2. Let \(\check \beta\) be a linear unbiased estimator of \(\beta\) and let \(\hat\beta\) be the OLS estimator. Show by example that it is possible that \(\check \beta \neq \hat\beta\) but \(W\check \beta = W \hat\beta\). (One way to construct such an example is sketched numerically after this exercise.)
    3. Show that if \(\mathrm{MSE}[W\check\beta] = \mathrm{MSE}[W \hat\beta]\) then \(W\check \beta = W \hat\beta\) for all \(y\in \mathbb R^n\).
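
    The sketch below is not part of the exercise; it illustrates, with made-up values of \(X\), \(W\) and \(y\), one way to build the example requested in part (b): perturb the OLS coefficient matrix by \(v w^\top\) with \(X^\top w = 0\) (preserving unbiasedness) and \(W v = 0\) (leaving \(W\check\beta\) unchanged).

    ```python
    # Numerical illustration only (made-up X, W, y); not part of the exercise.
    import numpy as np

    rng = np.random.default_rng(1)
    n, p = 10, 3
    X = rng.normal(size=(n, p))
    W = np.array([[1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0]])       # W has a nontrivial null space (third axis)

    # OLS: beta_hat = B_ols @ y.
    B_ols = np.linalg.solve(X.T @ X, X.T)

    # Perturb B_ols by A = v w^T with X^T w = 0 (so (B_ols + A) X = I still holds)
    # and W v = 0 (so W beta_check = W beta_hat for every y).
    v = np.array([0.0, 0.0, 1.0])                      # v in the null space of W
    z = rng.normal(size=n)
    w = z - X @ np.linalg.lstsq(X, z, rcond=None)[0]   # w orthogonal to C(X)
    B_check = B_ols + np.outer(v, w)
    assert np.allclose(B_check @ X, np.eye(p))         # beta_check is unbiased for beta

    y = rng.normal(size=n)
    beta_hat, beta_check = B_ols @ y, B_check @ y
    print(beta_hat)                      # the two estimates differ in the third coordinate ...
    print(beta_check)
    print(W @ beta_hat, W @ beta_check)  # ... but agree after applying W
    assert np.allclose(W @ beta_hat, W @ beta_check)
    ```
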
3. OLS and GLS: With \(E[y]=X\beta\) and \(X\) as above, let \(V\) be an \(n\times n\) positive definite covariance matrix, let \(\hat\beta_V = (X^\top V^{-1} X)^{-1} X^\top V^{-1} y\) be the corresponding GLS estimator, and let \(\hat\beta\) be the OLS estimator. Show that \(\hat\beta_V=\hat\beta\) for all \(y\) iff \(V\) can be written \[ V= X\Psi X^\top + H \Phi H^\top \] for some \(H\) such that \(H^\top X = 0\) and some positive definite matrices \(\Psi\) and \(\Phi\) of the appropriate dimensions. (A numerical check of the “if” direction is sketched below.)
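
    The sketch below is not part of the exercise; it is a numerical check of the “if” direction only, with made-up \(X\), \(y\), \(\Psi\) and \(\Phi\), and with \(H\) taken to be an orthonormal basis for the orthogonal complement of \(C(X)\).

    ```python
    # Numerical check of the "if" direction only; all inputs are made-up values.
    import numpy as np

    rng = np.random.default_rng(2)
    n, p = 8, 3
    X = rng.normal(size=(n, p))          # full column rank with probability 1
    y = rng.normal(size=n)

    # H: columns form an orthonormal basis for the complement of C(X), so H^T X = 0.
    Q, _ = np.linalg.qr(X, mode='complete')
    H = Q[:, p:]                         # n x (n - p)

    def random_pd(k):
        """Return an arbitrary k x k positive definite matrix."""
        A = rng.normal(size=(k, k))
        return A @ A.T + k * np.eye(k)

    Psi, Phi = random_pd(p), random_pd(n - p)
    V = X @ Psi @ X.T + H @ Phi @ H.T    # positive definite, of the stated form

    beta_ols = np.linalg.solve(X.T @ X, X.T @ y)
    Vinv_X = np.linalg.solve(V, X)       # V^{-1} X
    beta_gls = np.linalg.solve(X.T @ Vinv_X, Vinv_X.T @ y)

    print(beta_ols)
    print(beta_gls)                      # identical to beta_ols up to rounding
    assert np.allclose(beta_ols, beta_gls)
    ```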

4. Centering: Most regression models used in practice include an “intercept” term, that is, a column of ones in the design matrix, whose coefficient represents the expected outcome when the values in the other columns are zero. Such a model can alternatively be written as \(y_{i} = \alpha + x_i^\top \beta + \epsilon_i\), or in matrix form as \(y= \alpha 1_n + X \beta + \epsilon\), so that \(\alpha\) is the expected value of \(y_i\) when \(x_i\) is zero. In the remainder of this exercise, assume \(E[y] = 1_n \alpha + X\beta\) and \(V[y] = \sigma^2 I_n\), where \(X\in \mathbb R^{n\times p}\) and \([1_n \; X]\) has full column rank.

    1. Obtain an expression for the OLS estimator \((\hat\alpha,\hat\beta)\) of the \((p+1)\)-vector \((\alpha, \beta)\) in terms of \(X\), \(y\) and \(\bar x = X^\top 1_n/n\).

    2. Obtain an expression for the orthogonal projection matrix onto the one-dimensional subspace spanned by the vector \(1_n\), and also find the complementary projection matrix onto its orthogonal complement, the null space of \(1_n^\top\). Call this latter matrix “\(C\)” and let \(y_c = C y\). What linear model does \(y_c\) follow, i.e., what are the mean and variance of \(y_c\)?

    3. Let \(H\) be an \(n\times (n-1)\) matrix whose columns are an orthonormal basis for the null space of \(1_n^\top\). Show that \(H H^\top = C\).
      Let \(y_h = H^\top y\). What linear model does \(y_h\) follow, i.e., what are the mean and variance of \(y_h\)?

    4. Find the OLS estimator \(\hat\beta_c\) of \(\beta\) based on \(y_c\) and the OLS estimator \(\hat\beta_h\) of \(\beta\) based on \(y_h\). Is \(\hat\beta_h\) the BLUE among estimators based on \(y_h\)? Is \(\hat\beta_c\) the BLUE among estimators based on \(y_c\)?

    5. Describe any differences among \(\hat\beta\), \(\hat\beta_c\) and \(\hat\beta_h\). Is any precision in estimating \(\beta\) lost by using \(y_c\) or \(y_h\) instead of the “full” data \(y\)? (The sketch following this exercise compares the three slope estimates on a single simulated data set.)
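
    The sketch below is not part of the exercise; it uses made-up values of \(X\), \(\alpha\), \(\beta\) and the errors to set up the matrices from parts (b) and (c) and to compare the slope estimates \(\hat\beta\), \(\hat\beta_c\) and \(\hat\beta_h\) on one simulated data set (a single numerical agreement, not the general argument parts (d) and (e) ask for).

    ```python
    # Numerical illustration only (made-up X, alpha, beta, errors); not the exercise.
    import numpy as np

    rng = np.random.default_rng(3)
    n, p = 12, 2
    ones = np.ones(n)
    X = rng.normal(size=(n, p))
    y = 1.5 * ones + X @ np.array([2.0, -1.0]) + rng.normal(size=n)

    # Projection onto span(1_n), and the complementary projection C = I - 1 1^T / n.
    P1 = np.outer(ones, ones) / n
    C = np.eye(n) - P1

    # H: n x (n-1), columns an orthonormal basis for the orthogonal complement of
    # span(1_n); then H H^T = C, as part (c) asserts.
    Q, _ = np.linalg.qr(ones.reshape(n, 1), mode='complete')
    H = Q[:, 1:]
    assert np.allclose(H @ H.T, C)

    # Full OLS fit of y on [1_n, X]: intercept alpha_hat and slope vector beta_hat.
    coef = np.linalg.lstsq(np.column_stack([ones, X]), y, rcond=None)[0]
    alpha_hat, beta_hat = coef[0], coef[1:]

    # OLS based on the centered data y_c = C y (design C X) and on y_h = H^T y
    # (design H^T X).
    beta_c = np.linalg.lstsq(C @ X, C @ y, rcond=None)[0]
    beta_h = np.linalg.lstsq(H.T @ X, H.T @ y, rcond=None)[0]

    print(beta_hat)
    print(beta_c)                     # same as beta_hat
    print(beta_h)                     # same again: the three estimates coincide for this y
    assert np.allclose(beta_hat, beta_c) and np.allclose(beta_hat, beta_h)
    ```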