1. Orthogonality: Recall the two-way ANOVA decomposition \(y_{i,j,k} = \hat\mu + \hat a_j + \hat b_k + \hat c_{j,k} + \hat\epsilon_{i,j,k}\), using the sum-to-zero identifiability constraints.
    1. Write the decomposition for the full data vector as \(y = 1 \hat\mu + \hat a^* + \hat b^* + \hat c^* + \hat\epsilon\) where each term is an \(rp_1p_2\)-dimensional vector. Show that the vectors on the right-hand side of the equation are orthogonal to each other.
    2. Let \(\hat\sigma^2\) and \(\hat\sigma^2_0\) be the unbiased estimates of \(\sigma^2\) based on the \(SSE\)s of the full and additive models respectively. Compute the variance of each under the assumption that the additive model is correct, and comment on which is a better estimate in this case.
    3. Suppose the additive model is not correct, but we erroneously assume it is. How does this affect our \(F\)-tests of additive effects, \(H_A: a_1= \cdots = a_{p_1} =0\) and \(H_B: b_1= \cdots = b_{p_2} =0\)?
  2. Split-plot: The file potato.txt contains data from an experiment that evaluates effects of two factors on potato growth. The experiment was carried out on four plots of land. Two plots were randomly assigned to receive a low level of sulfur additive, and the remaining two received a high level of sulfur additive. Then, each plot was divided into four subplots, with two subplots per plot randomly selected to be planted with potato type A, with the other two being planted with potato type B.
    1. Examine the data and evaluate the evidence for non-additivity using some descriptive plots and an \(F\)-test that compares a model with additive effects for sulfur and type to a model that includes a distinct mean for each treatment combination. Fit the additive model and comment on the evidence for effects due to type and sulfur. Provide ANOVA tables for both model fits.
    2. Based on the estimate of error variance from the additive model fit, compute a \(t\)-statistic and \(p\)-value for evaluating differences between the two levels of type. Also compute a “non-ANOVA” two-sample \(t\)-test for differences between A and B that ignores potential effects of sulfur.
    3. Based on the data and assumptions of the additive model, report the evidence for effects of sulfur in terms of the \(F\)-test and its \(p\)-value.
    4. Suppose we suspect the plots of land might have an effect on the potato yield. Describe any difficulties with a linear model that includes additive effects for plot, sulfur and type.
    5. Evaluate the evidence for effects of sulfur using the randomization null distribution as follows: Compare the observed value of the \(F\)-statistic for testing sulfur to several thousand randomly generated \(F\)-statistics, each computed from a different possible assignment of sulfur levels to plots, that could have been observed under the null hypothesis \(H\): sulfur level has no effect on the outcome.
    6. How would you summarize the evidence in these data for sulfur effects?
  3. Kronecker practice: Consider the multivariate regression model for which \(y_i = B^\top x_i + e_i\) for \(i=1,\ldots,n\), where \(y_i\in \mathbb R^q\) is the \(q\)-variate response for case \(i\), \(x_i \in \mathbb R^p\) is the \(p\)-variate predictor for case \(i\), \(B\in \mathbb R^{p\times q}\) is unknown, and \(E[e_i]=0\), \(E[e_i e_j^\top]= 0\) for \(i\neq j\) and \(E[e_ie_i^\top] = \sigma^2 \Psi\), where \(\Psi\) is a known \(q\times q\) covariance matrix.
    1. Write the model as \(Y = X B + E\) for \(Y\in \mathbb R^{n\times q}\) and \(X\in \mathbb R^{n\times p}\), where the rows of \(Y\) and \(X\) are the \(y_i^\top\)’s and \(x_i^\top\)’s respectively. Use the vec-Kronecker identity to express the model as \(y = Z b + e\) where \(y =\text{vec}(Y)\) and \(b=\text{vec}(B)\).
    2. Find the OLS and GLS estimates of \(B\) and compare.
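
As a numerical sanity check before starting Problem 3, the general vec-Kronecker identity \(\text{vec}(AMC) = (C^\top \otimes A)\,\text{vec}(M)\) can be verified on random matrices. The sketch below is only an illustration and assumes the column-stacking convention for \(\text{vec}\); the matrix names `A`, `M`, `C`, the helper `vec`, and the dimensions are hypothetical and are not part of the exercise.

```python
import numpy as np


def vec(mat):
    """Stack the columns of mat into one long vector (column-major order)."""
    return mat.reshape(-1, order="F")


rng = np.random.default_rng(0)

# Hypothetical small dimensions, chosen only for illustration.
n, p, q, s = 5, 3, 2, 4
A = rng.normal(size=(n, p))
M = rng.normal(size=(p, q))
C = rng.normal(size=(q, s))

# Left side: vectorize the matrix product directly.
lhs = vec(A @ M @ C)

# Right side: Kronecker product applied to the vectorized middle matrix.
rhs = np.kron(C.T, A) @ vec(M)

identity_holds = np.allclose(lhs, rhs)
print("vec(AMC) == (C^T kron A) vec(M):", identity_holds)
```

The `order="F"` argument matters here: NumPy's default reshape stacks rows rather than columns, and under that convention the Kronecker factors appear in a different arrangement.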