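1. Let \(y = [ y_a^\top \ y_b^\top ]^\top\) be a random vector with finite second moments, let \(f\) be a constant vector, and let \[ \tilde{y} = f + \begin{pmatrix} G & 0 \\ 0 & H \end{pmatrix} y, \]
    where \(G\) and \(H\) are non-singular square matrices conformable with \(y_a\) and \(y_b\), respectively. Partition \(\tilde{y} = [ \tilde{y}_a^\top \ \tilde{y}_b^\top ]^\top\) conformably with \(y\). Show that the canonical correlations between \(y_a\) and \(y_b\) are equal to those between \(\tilde{y}_a\) and \(\tilde{y}_b\).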

  2. The dataset nhanes.RData contains demographic, diet and health information on a sample of the US population. Analyze the relationship between the 10 dietary variables “DXXXX” and the health variables “BXXXX”.

    1. Identify the first few canonical variates based on the entire sample and describe the coefficients of the linear combinations that define them.
    2. Perform one or more tests of independence of these two sets of variables.
    3. Evaluate whether the results from parts 1 and 2 are consistent across levels of the demographic variables “RXXXX”.
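
Before attempting the proof in Question 1, it can help to check the claim numerically. The following is a minimal sketch, not a proof: it uses synthetic data, and the helper `canonical_correlations`, the dimensions, and the random choices of \(G\), \(H\) and \(f\) are all illustrative assumptions. It computes sample canonical correlations as the singular values of \(Q_a^\top Q_b\), where \(Q_a\) and \(Q_b\) are orthonormal bases of the centered data blocks (this assumes each block has full column rank).

```python
import numpy as np


def canonical_correlations(ya, yb):
    """Sample canonical correlations between two data blocks (rows = observations).

    Hypothetical helper for illustration; assumes both blocks have full column rank.
    """
    ya = ya - ya.mean(axis=0)
    yb = yb - yb.mean(axis=0)
    # Orthonormal bases of the centered column spaces; the singular values
    # of Qa^T Qb are the sample canonical correlations.
    qa, _ = np.linalg.qr(ya)
    qb, _ = np.linalg.qr(yb)
    return np.linalg.svd(qa.T @ qb, compute_uv=False)


rng = np.random.default_rng(0)
n, p, q = 500, 3, 2  # assumed sample size and block dimensions

# Synthetic correlated data: mix independent normals with a random matrix.
z = rng.standard_normal((n, p + q)) @ rng.standard_normal((p + q, p + q))
ya, yb = z[:, :p], z[:, p:]

# Random square matrices are non-singular with probability one.
G = rng.standard_normal((p, p))
H = rng.standard_normal((q, q))
f = rng.standard_normal(p + q)

# Block-diagonal affine transformation, applied row by row.
ya_tilde = f[:p] + ya @ G.T
yb_tilde = f[p:] + yb @ H.T

rho = canonical_correlations(ya, yb)
rho_tilde = canonical_correlations(ya_tilde, yb_tilde)

print("original:   ", rho)
print("transformed:", rho_tilde)
print("equal up to rounding:", np.allclose(rho, rho_tilde))
```

The shift \(f\) disappears under centering, and multiplying a block by a non-singular matrix leaves its column space unchanged, which is the idea the proof needs to make precise at the population level.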

There are a variety of ways to address Question 2, so I expect a lot of heterogeneity in the data analyses (but hopefully not in the overall conclusions).
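
As one possible starting point for the independence test in Question 2, here is a hedged sketch on synthetic data. It does not load nhanes.RData, and the arrays `diet` and `health` are stand-ins, not the actual variables. It computes Wilks' lambda from the sample canonical correlations, Bartlett's chi-square approximation \(-(n - 1 - (p + q + 1)/2)\log\Lambda\) with \(pq\) degrees of freedom, and a permutation p-value that avoids distributional assumptions. All function names are hypothetical.

```python
import numpy as np


def canonical_correlations(x, y):
    """Sample canonical correlations (assumes full-column-rank blocks)."""
    x = x - x.mean(axis=0)
    y = y - y.mean(axis=0)
    qx, _ = np.linalg.qr(x)
    qy, _ = np.linalg.qr(y)
    rho = np.linalg.svd(qx.T @ qy, compute_uv=False)
    return np.clip(rho, 0.0, 1.0)  # guard against rounding slightly above 1


def wilks_lambda(x, y):
    """Wilks' lambda: product of (1 - rho_i^2); small values suggest dependence."""
    rho = canonical_correlations(x, y)
    return float(np.prod(1.0 - rho**2))


def bartlett_statistic(x, y):
    """Bartlett's approximation: statistic and chi-square degrees of freedom."""
    n, p = x.shape
    q = y.shape[1]
    stat = -(n - 1 - (p + q + 1) / 2) * np.log(wilks_lambda(x, y))
    return stat, p * q


def permutation_pvalue(x, y, n_perm=999, seed=0):
    """Permutation p-value: shuffle the rows of y to break any x-y dependence."""
    rng = np.random.default_rng(seed)
    observed = wilks_lambda(x, y)
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(x.shape[0])
        if wilks_lambda(x, y[perm]) <= observed:
            count += 1
    return (count + 1) / (n_perm + 1)


# Synthetic stand-ins for the two variable sets (NOT the NHANES data).
rng = np.random.default_rng(1)
n = 300
diet = rng.standard_normal((n, 4))
health = rng.standard_normal((n, 3))
health[:, 0] += 0.5 * diet[:, 0]  # inject dependence for illustration

stat, df = bartlett_statistic(diet, health)
pval = permutation_pvalue(diet, health)
print(f"Bartlett statistic = {stat:.2f} on {df} degrees of freedom")
print(f"permutation p-value = {pval:.3f}")
```

To study consistency across demographic levels, the same functions could be applied to the rows belonging to each level separately, keeping in mind that smaller subsamples give less precise canonical correlations.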