Orthogonal spaces: Let \(X\in \mathbb R^{n\times p}\) have linearly independent columns, with \(p\leq n\).
- Show that \(\mathcal M\equiv \{X\beta : \beta\in \mathbb R^{p} \}\) is a subspace of \(\mathbb R^n\).
- Show that \(\mathcal N \equiv \{ n\in\mathbb R^n : n^\top m = 0 \ \forall m\in \mathcal M\}\) is a subspace of \(\mathbb R^n\).
- Show that \(\mathbb R^n = \{ m+n: m\in \mathcal M, n\in \mathcal N\}\).
- Show that if \(m_1,m_2 \in \mathcal M\) and \(n_1,n_2 \in \mathcal N\) and \(m_1+n_1 = m_2+n_2\) then \(m_1= m_2\) and \(n_1= n_2\).
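Before attempting the proofs, it can help to see the decomposition numerically. A minimal sketch in R (the seed, dimensions, and random \(X\) are arbitrary choices, not part of the problem), using the least-squares projection \(P = X(X^\top X)^{-1}X^\top\) to split a vector into its \(\mathcal M\) and \(\mathcal N\) components:

```r
# Numerical illustration (not a proof): split y into m in M = C(X)
# and v in N, the set of vectors orthogonal to everything in M.
set.seed(1)
n <- 10; p <- 3
X <- matrix(rnorm(n * p), n, p)        # columns are independent (a.s.)
P <- X %*% solve(t(X) %*% X) %*% t(X)  # least-squares projection onto C(X)
y <- rnorm(n)
m <- P %*% y                           # m = X beta for beta = (X'X)^{-1} X'y
v <- y - m                             # candidate element of N
max(abs(t(X) %*% v))                   # ~0, so v is orthogonal to all of M
all.equal(as.vector(m + v), y)         # TRUE: y = m + v
```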
Projection matrices: Let \(P\in \mathbb R^{n\times n}\) be a projection matrix, that is, \(P^2=P\).
- Show that \(P\) is an orthogonal projection matrix (using the definition in the notes) if and only if \(P^\top (I-P) = 0\).
- Show that \(P\) is an orthogonal projection matrix if and only if \(P\) is symmetric.
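As a quick numeric sanity check (again, not a proof), the hat matrix satisfies all three conditions; a self-contained sketch with an arbitrary small \(X\):

```r
# The hat matrix P = X (X'X)^{-1} X' is idempotent, symmetric, and
# satisfies P'(I - P) = 0, consistent with both characterizations.
set.seed(2)
n <- 8
X <- matrix(rnorm(n * 3), n, 3)
P <- X %*% solve(t(X) %*% X) %*% t(X)
max(abs(P %*% P - P))                 # ~0: P^2 = P
max(abs(P - t(P)))                    # ~0: P = P^T
max(abs(t(P) %*% (diag(n) - P)))      # ~0: P^T (I - P) = 0
```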
Models and matrices: Suppose an experiment is going to be run \(n\) times (with \(n\) even), \(n/2\) times under condition \(A\) and \(n/2\) times under condition \(B\). Two statisticians are going to use a linear model to estimate the population means under the two conditions. The first statistician will use the model \[ (M1) \ \ y_i = \mu + a w_i + \epsilon_i \] where \(w_i=-1\) for runs under condition \(A\) and \(w_i=+1\) for runs under condition \(B\). The second statistician will use the model \[ (M2) \ \ y_i = \mu_A s_i + \mu_B t_i +\epsilon_i \] where \(s_i=1\) if run \(i\) is under condition \(A\) and zero otherwise, and \(t_i = 1\) if run \(i\) is under condition \(B\) and zero otherwise.
- Write each of these models in the form \(y=X_k\beta_k+\epsilon\) for \(k=1,2\), so that you have two model matrices: \(X_1\) for \(M1\) and \(X_2\) for \(M2\). (A small construction is sketched after this problem.)
- Are the model spaces the same or different? If the same, prove it. Otherwise, find a vector that is in \(C(X_1)\) but not in \(C(X_2)\), or vice versa.
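To make the two parametrizations concrete, here is a small sketch in R constructing both model matrices, assuming \(n = 6\) with the condition-\(A\) runs listed first (the ordering is an assumption; any ordering works):

```r
# Model matrices for M1 and M2 with n = 6: runs 1-3 under A, 4-6 under B.
n <- 6
w  <- rep(c(-1, 1), each = n / 2)  # w_i = -1 under A, +1 under B
X1 <- cbind(1, w)                  # columns: intercept, contrast (model M1)
sA <- as.numeric(w == -1)          # s_i: indicator of condition A
tB <- as.numeric(w == +1)          # t_i: indicator of condition B
X2 <- cbind(sA, tB)                # columns: s, t (model M2)
# Hint for the column-space question: note sA + tB = 1 and tB - sA = w.
```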
Overparametrization: Suppose \(X\in \mathbb R^{n\times p}\) but is of rank \(r<p\). Describe how to construct a rank-\(r\) matrix \(\tilde X\in \mathbb R^{n\times r}\) for which \(C(\tilde X) = C(X)\).
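One standard construction keeps a maximal linearly independent subset of the columns of \(X\); a rank-revealing QR decomposition identifies such a subset. A sketch in R, with a small rank-deficient \(X\) invented for illustration:

```r
# Build Xtilde with r independent columns and C(Xtilde) = C(X) by
# keeping the first r pivot columns of a pivoted QR decomposition.
set.seed(3)
X <- matrix(rnorm(12), 6, 2)
X <- cbind(X, X[, 1] + X[, 2])     # add a redundant column: rank 2, p = 3
qrX <- qr(X)                       # base R's qr() pivots by default
r <- qrX$rank
Xtilde <- X[, qrX$pivot[1:r]]      # these r columns span C(X)
qr(cbind(X, Xtilde))$rank == r     # TRUE, so C(Xtilde) = C(X)
```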
Denoising: The data vector \(y\) from the file yHW1.rds represents a signal that has been corrupted by both high- and low-frequency noise. The signal is a “blip” that occurs somewhere in this noisy time series, and it is your job to locate it.
- Low-frequency noise can be represented by linear combinations of low-frequency sine and cosine functions. Try to identify the main frequencies of the low-frequency noise in \(y\) as follows: construct an \(n\times 30\) matrix \(X_s\) whose \(i,j\)th entry is \(\sin(2\pi ij/n)\). Similarly, construct the \(n\times 30\) matrix \(X_c\) whose \(i,j\)th entry is \(\cos(2\pi ij/n)\). Column-bind these to construct the \(n\times 60\) model matrix \(X\). Find the least-squares approximation \(X\hat\beta\) of \(y\) and plot it; also plot the coefficients \(\hat\beta\) and the residual vector \(\hat\epsilon\). Can you detect the signal? (An R sketch of the full workflow appears after this list.)
- For each \(j\in \{1,\ldots,30\}\) compute \(a_j = \sqrt{ \hat\beta_{j}^2 + \hat\beta_{j+30}^2 }\), the estimated amplitude of the oscillation at frequency \(j\). Plot the amplitudes \(a_1,\ldots,a_{30}\). What are the main frequencies of the low-frequency noise?
- Fit another linear approximation to \(y\) using only the pairs of sine and cosine vectors whose estimated amplitude is large compared to the bulk of the estimated amplitudes. For this approximation, plot \(X\hat\beta\), \(\hat\beta\), and the vector of residuals. Can you detect the signal now?
- Describe a way to remove the high-frequency noise as well.
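A possible end-to-end sketch in R of the steps above. The file name yHW1.rds is from the problem; the amplitude cutoff and the smoothing window in the last step are assumptions to be tuned by eye.

```r
# Denoising workflow: fit 30 sine/cosine pairs, inspect amplitudes,
# refit with only the dominant frequencies, then smooth the residuals.
y <- readRDS("yHW1.rds")
n <- length(y)
i <- 1:n

Xs <- sapply(1:30, function(j) sin(2 * pi * i * j / n))  # n x 30 sines
Xc <- sapply(1:30, function(j) cos(2 * pi * i * j / n))  # n x 30 cosines
X  <- cbind(Xs, Xc)                                      # n x 60 model matrix

fit  <- lm(y ~ X - 1)              # least squares, no intercept
beta <- coef(fit)
plot(i, fitted(fit), type = "l")   # X beta-hat
plot(beta)                         # all 60 coefficients
plot(i, resid(fit), type = "l")    # residual vector

a <- sqrt(beta[1:30]^2 + beta[31:60]^2)  # amplitude at frequency j
plot(1:30, a, type = "h")

# Keep only high-amplitude frequencies; the cutoff is a guess to be
# adjusted after looking at the plot of a.
big  <- which(a > 5 * median(a))
X2   <- cbind(Xs[, big, drop = FALSE], Xc[, big, drop = FALSE])
fit2 <- lm(y ~ X2 - 1)
plot(i, resid(fit2), type = "l")   # low-frequency noise removed

# One way to also remove high-frequency noise: a centered moving
# average (window width 11, chosen arbitrarily) acts as a low-pass filter.
low <- stats::filter(resid(fit2), rep(1 / 11, 11), sides = 2)
plot(i, low, type = "l")           # the blip should now stand out
```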