2-way ANOVA

Three different varieties of tomato (Harvester, Pusa Early Dwarf, and Ife No. 1) and four different plant densities (10, 20, 30 and 40 thousand plants per hectare) are being considered for planting in a particular region. The goal of the experiment is to determine whether variety and density affect yield ("Effects of Plant Density on Tomato Yields in Western Nigeria," Experimental Agriculture, 1976: 43-47). The yield data are given in the file yield.txt.

Column 1: Yield
Column 2: Planting Density
Column 3: Variety (H, P, or Ife)

Note that this will be treated as a two factor experiment, so we are interested in the effects of each factor as well as joint effects.

Splus tip: Be sure to change density and variety to "factor" types prior to running ANOVA models.

Check whether this is a balanced design.
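A design is balanced when every density/variety combination has the same number of observations. A minimal sketch of that check follows, assuming the rows have been read into (yield, density, variety) tuples; the numbers and the function names cell_counts and is_balanced are hypothetical placeholders, not the contents of yield.txt.

```python
from collections import Counter

# Hypothetical rows in the column order of yield.txt: (yield, density, variety).
# The numbers are placeholders for illustration, not the experimental data.
rows = [
    (10.0, 10, "H"), (12.0, 10, "H"),
    (11.0, 10, "P"), (13.0, 10, "P"),
    (14.0, 20, "H"), (16.0, 20, "H"),
    (12.0, 20, "P"), (14.0, 20, "P"),
]


def cell_counts(rows):
    """Count observations in each (density, variety) cell."""
    return Counter((density, variety) for _, density, variety in rows)


def is_balanced(rows):
    """True if every density/variety combination occurs equally often."""
    counts = cell_counts(rows)
    densities = {d for _, d, _ in rows}
    varieties = {v for _, _, v in rows}
    # Every combination of levels must appear at least once ...
    all_cells_present = len(counts) == len(densities) * len(varieties)
    # ... and all cells must share a single count.
    return all_cells_present and len(set(counts.values())) == 1


print(cell_counts(rows))
print(is_balanced(rows))
```

Dropping any one row from the placeholder data makes the check fail, which is the situation the later note about unbalanced designs refers to.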
Produce a table of means and standard deviations for each variety/density combination. (Do this by fitting the model, yield~density*variety, and clicking the box for displaying the means.) In a couple of sentences, comment on these statistics -- the relative sizes of means and variability in different cells of the table, and how yield might be related to density and variety.
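Outside the Splus dialog, the same table can be built by grouping the observations by cell. The sketch below assumes the data are held as (yield, density, variety) tuples; the values and the helper name cell_summary are hypothetical, chosen only to show the mechanics.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical placeholder rows: (yield, density, variety). Not the real data.
rows = [
    (10.0, 10, "H"), (12.0, 10, "H"),
    (11.0, 10, "P"), (13.0, 10, "P"),
    (14.0, 20, "H"), (16.0, 20, "H"),
    (12.0, 20, "P"), (14.0, 20, "P"),
]


def cell_summary(rows):
    """Map each (density, variety) cell to (mean, sample standard deviation)."""
    cells = defaultdict(list)
    for y, density, variety in rows:
        cells[(density, variety)].append(y)
    # stdev uses the n - 1 divisor, so each cell needs at least two values.
    return {cell: (mean(ys), stdev(ys)) for cell, ys in cells.items()}


summary = cell_summary(rows)
for cell, (m, s) in sorted(summary.items()):
    print(cell, round(m, 2), round(s, 2))
```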
Produce two boxplots, yield vs. density and yield vs. variety. Do these confirm your answers to the previous problem?
Produce an interaction plot. How do the yields for density and variety differ and how do they relate to each other? Comment on whether an additive or non-additive model might be appropriate based on this plot.
Fit the saturated model to the data, then perform a test of the significance of the ANOVA model; that is, test the claim that the interaction model fit is an improvement over the equal means (null) model in terms of explaining the variability in the data. To do so, use the ANOVA table to compute an F-statistic that compares the saturated model to the equal means model (the model that says yield is a function of a single parameter, the mean; this is also called the null model or reduced model). Write out the null hypothesis in words. The numerator for your F-statistic should be the SS for the model (add up the components) divided by the d.f. for the model. The denominator for your F-statistic should be the SS for Residual divided by its degrees of freedom. A large p-value for the F-statistic would indicate that there is no evidence that the interaction model fit is an improvement over the equal means (null) model in terms of explaining the variability in the data.
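The arithmetic described above can be sketched as follows. The ANOVA table entries are hypothetical placeholders rather than the tomato results, and the function name overall_f is an assumption made for illustration.

```python
# Hypothetical ANOVA table entries (placeholders, not the tomato results):
# term -> (sum of squares, degrees of freedom)
model_terms = {
    "density": (12.5, 1),
    "variety": (0.5, 1),
    "density:variety": (4.5, 1),
}
residual = (8.0, 4)  # (residual sum of squares, residual d.f.)


def overall_f(model_terms, residual):
    """F-statistic comparing the saturated model with the equal means model."""
    # Numerator: add up the model components, then divide by the model d.f.
    ss_model = sum(ss for ss, _ in model_terms.values())
    df_model = sum(df for _, df in model_terms.values())
    # Denominator: residual mean square.
    ss_resid, df_resid = residual
    return (ss_model / df_model) / (ss_resid / df_resid)


f_stat = overall_f(model_terms, residual)
print(f_stat)
```

The p-value is then the upper tail area of an F distribution with (model d.f., residual d.f.) degrees of freedom, which can be read from a table or from the software output.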
Produce a plot of residuals vs. fitted values for the interaction model to determine whether a transformation is needed. Comment.
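For the interaction model, each fitted value is simply the mean of the observation's cell, and the residual is the observation minus that mean. A minimal sketch under that fact, with hypothetical placeholder data and a hypothetical helper name:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical placeholder rows: (yield, density, variety). Not the real data.
rows = [
    (10.0, 10, "H"), (12.0, 10, "H"),
    (11.0, 10, "P"), (13.0, 10, "P"),
    (14.0, 20, "H"), (16.0, 20, "H"),
    (12.0, 20, "P"), (14.0, 20, "P"),
]


def fitted_and_residuals(rows):
    """Return (fitted, residual) pairs for the saturated (cell means) model."""
    cells = defaultdict(list)
    for y, density, variety in rows:
        cells[(density, variety)].append(y)
    cell_means = {cell: mean(ys) for cell, ys in cells.items()}
    return [(cell_means[(d, v)], y - cell_means[(d, v)]) for y, d, v in rows]


pairs = fitted_and_residuals(rows)
for fitted, resid in pairs:
    print(fitted, resid)
```

Plotting the second element against the first gives the residual plot; a spread that widens as the fitted value grows is the usual sign that a transformation such as the log may help.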
If the overall test of the saturated model against the equal means model above is found to be significant, we can consider refining the model. We will do backward elimination to choose the best model for the data. Test for the significance of the interaction effect in the presence of main effects. Give the null hypothesis, the test statistic, and p-value. What is your conclusion written in terms of the problem?
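One way to view the interaction test is as an extra-sum-of-squares comparison between the additive (reduced) model and the interaction (full) model. The sketch below assumes that framing; the residual sums of squares and degrees of freedom are hypothetical placeholders, and extra_ss_f is an illustrative name.

```python
def extra_ss_f(ss_resid_reduced, df_resid_reduced, ss_resid_full, df_resid_full):
    """Extra-sum-of-squares F-statistic for two nested models."""
    # Drop in residual SS when the extra terms are added, per extra parameter.
    extra_ss = ss_resid_reduced - ss_resid_full
    extra_df = df_resid_reduced - df_resid_full
    # Scale by the residual mean square of the full model.
    return (extra_ss / extra_df) / (ss_resid_full / df_resid_full)


# Hypothetical values: additive model (SS 12.5 on 5 d.f.)
# versus interaction model (SS 8.0 on 4 d.f.).
f_interaction = extra_ss_f(12.5, 5, 8.0, 4)
print(f_interaction)
```

In a balanced design this matches the interaction line of the ANOVA table, since the extra sum of squares is the interaction SS.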
Now consider the two main effects by fitting an additive model. Using this output, perform two tests of
1. whether there are significant variety effects,
2. whether there are significant density effects
Use careful statistical notation for each of the tests.

NOTE: If this were an unbalanced design, you would need to fit both yield~variety+density and yield~density+variety to perform these tests. More on this in class.
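One possible notation for the two main-effect tests is sketched here. The symbols are assumptions chosen for illustration: alpha_i is the effect of variety i, beta_j the effect of density j, and n the total number of observations, so the additive model has 1 + 2 + 3 = 6 parameters.

```latex
% Additive model (symbols are illustrative assumptions)
y_{ijk} = \mu + \alpha_i + \beta_j + \varepsilon_{ijk},
\qquad \varepsilon_{ijk} \sim N(0, \sigma^2),
\quad i = 1, 2, 3, \quad j = 1, \dots, 4

% Variety effects
H_0 : \alpha_1 = \alpha_2 = \alpha_3 = 0
\quad \text{vs.} \quad
H_a : \text{at least one } \alpha_i \neq 0,
\qquad
F = \frac{MS_{\text{variety}}}{MS_{\text{residual}}} \sim F_{2,\, n-6} \text{ under } H_0

% Density effects
H_0 : \beta_1 = \beta_2 = \beta_3 = \beta_4 = 0
\quad \text{vs.} \quad
H_a : \text{at least one } \beta_j \neq 0,
\qquad
F = \frac{MS_{\text{density}}}{MS_{\text{residual}}} \sim F_{3,\, n-6} \text{ under } H_0
```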
Use an appropriate multiple comparison procedure to identify significant differences among types of tomatoes. Give associated confidence intervals to quantify these differences, and comment on your results.
The bias-variance tradeoff in model selection. The Bayesian Information Criterion is a criterion that attempts to balance the tradeoff between models with too many parameters (overfit) and models with too few parameters (underfit). Of course, there are other criteria such as the AIC and Cp that attempt to do the same thing. Using BIC, the best models are the ones with the lowest value of BIC, meaning small sigma² (MSE) and small p. By hand, calculate the BIC (Section 12.4.2 on pages 356-7 as well as Display 12.8) as well as the R² (SS model/SS total) for the following models:
1. yield~variety*density
2. yield~variety+density
3. yield~variety
4. yield~density
5. yield~mu (null model) (p=1 here. sigma² (MSE) is found by dividing SStotal by (n-1).)
Which model has the fewest parameters? Which one has the lowest SS_residual? Which one has the lowest MSE? Which model has the lowest BIC? Which model has the highest R²? What is the problem with using R² for model selection?
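The hand calculation can be checked with a short script. This sketch assumes the form BIC = n log(sigma²) + p log(n) with sigma² taken as the MSE, SS_residual/(n - p); that form is an assumption here and should be compared against Display 12.8 before use. The sample size, parameter counts, and sums of squares are hypothetical placeholders, not the tomato results.

```python
import math

# Hypothetical placeholders, not the tomato results.
n = 8            # number of observations
ss_total = 25.5  # total sum of squares

# model -> (number of mean parameters p, residual sum of squares)
models = {
    "variety*density": (4, 8.0),
    "variety+density": (3, 12.5),
    "variety": (2, 25.0),
    "density": (2, 13.0),
    "mu (null)": (1, 25.5),
}


def bic(n, p, ss_resid):
    """Assumed form: n * log(MSE) + p * log(n), with MSE = ss_resid / (n - p)."""
    mse = ss_resid / (n - p)
    return n * math.log(mse) + p * math.log(n)


def r_squared(ss_resid, ss_total):
    """SS model / SS total, written as 1 - SS residual / SS total."""
    return 1.0 - ss_resid / ss_total


results = {
    name: (bic(n, p, ss), r_squared(ss, ss_total))
    for name, (p, ss) in models.items()
}
for name, (b, r2) in results.items():
    print(name, round(b, 3), round(r2, 3))
```

For the null model the MSE reduces to SS total divided by (n - 1), in line with the note on that model above. Adding terms can never increase the residual sum of squares, so R² can only rise as the model grows, while BIC charges log(n) for each added parameter.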

Last modified: Sun Nov 17 12:44:46 EST 2002