Sta 242 / Env 255
Homework 5
Not to be turned in but strongly suggested
It is strongly suggested that you work through the homework as preparation for the midterm exam.
Conceptual exercises in Chapter 10 are recommended as review. The answers are at the end of the chapter.
Topics:
Old Faithful. Problem 15 of Chapter 10.
Data in CASE0724.ASC.
Additional instructions:
Provide an interpretation and confidence interval for each coefficient in the Full model and the Reduced model. (In the Full model there are 9 coefficients; make sure you can provide an interpretation and CI for the intercept, the coefficient for duration, and the first date coefficient, "date2".) Are the confidence intervals consistent with the results of the Extra Sum of Squares test?
Splus instructions:
In this exercise we will conduct an Extra Sum of Squares F-test to see if the mean intervals, adjusted for duration, also depend on the date of observation. We will have 3 ways to get the F-statistic: 1) by hand from the ANOVA tables (you should know this for use with other packages that do not calculate the test statistic directly, and for exams), 2) from an ANOVA table with sequential Sums of Squares, and 3) using the Model Comparison option (easy to use in S-Plus, but not available in all statistics packages). You should make sure that you understand 1). We will also show how to add separate regression lines to the plot.
It is important that you put date last so that it appears last in the sequential sum of squares in the ANOVA table. In the box for Save Model Object as, enter Full.lm (i.e., this is the full model). Under the Results page/tab, check the box for ANOVA and save the fitted values in the dataframe. Click OK to run the regression. (Verify that the residuals are OK, then discard the plots.)
In the Report Window, you will have the ANOVA table for the Full Model. Verify that the df for date are 7 and not 1. Because we are treating date as a factor, we should have 1 df for each dummy variable, or 8 - 1 = 7. If date is treated as a continuous variable, the df would be 1, and something likely went wrong with the options() command earlier.
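The df count above can be checked with a small sketch. This is Python rather than S-Plus, and it assumes indicator (treatment) coding with the first date as the reference level; the function name `dummy_columns` and the short list of dates are hypothetical and are not taken from CASE0724.ASC.

```python
def dummy_columns(values, prefix="date"):
    """Build 0/1 indicator columns for a factor, omitting the first level.

    Assumption: treatment coding with the smallest level as the reference,
    so a factor with k levels yields k - 1 columns.
    """
    levels = sorted(set(values))
    reference = levels[0]
    return {
        f"{prefix}{level}": [1 if v == level else 0 for v in values]
        for level in levels
        if level != reference
    }


# Hypothetical dates 1..8 (illustrative only, not the homework data).
dates = [1, 2, 3, 4, 5, 6, 7, 8, 1, 2]
cols = dummy_columns(dates)

print(sorted(cols))  # names of the indicator columns
print(len(cols))     # number of columns = df used by the factor
```

With 8 distinct dates the sketch produces 7 indicator columns (date2 through date8), which matches the 7 df expected for date in the ANOVA table.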
Save the Model Object as Reduced.lm. Under the Results menu, click the ANOVA box, but not the fitted values box. Click OK. The ANOVA table for the reduced model will be in the Report Window.
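Once both ANOVA tables are in the Report Window, the "by hand" version of the Extra Sum of Squares F-statistic uses only the residual sum of squares and residual df from each table. The following is a minimal Python sketch of that arithmetic; the function name `extra_ss_f` is hypothetical, and the numbers are made up so that the df differ by 7. They are not the Old Faithful results.

```python
def extra_ss_f(ss_res_reduced, df_res_reduced, ss_res_full, df_res_full):
    """Extra-sum-of-squares F-statistic from the residual lines of two ANOVA tables.

    Returns (F, numerator df, denominator df).
    """
    extra_ss = ss_res_reduced - ss_res_full   # drop in residual SS from the added terms
    extra_df = df_res_reduced - df_res_full   # number of added coefficients
    if extra_df <= 0 or df_res_full <= 0:
        raise ValueError("the full model must have fewer residual df than the reduced model")
    mse_full = ss_res_full / df_res_full      # estimate of sigma^2 from the full model
    return (extra_ss / extra_df) / mse_full, extra_df, df_res_full


# Made-up values for illustration only (not from CASE0724.ASC).
f_stat, df_num, df_den = extra_ss_f(
    ss_res_reduced=462.0, df_res_reduced=105,
    ss_res_full=392.0, df_res_full=98,
)
print(f_stat, df_num, df_den)
```

The resulting statistic is compared to an F distribution with (numerator df, denominator df) degrees of freedom; the p-value lookup is left to a table or to the package.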
Galileo's Data Problem 16 of Chapter 10. You can skip part (b).
Data in CASE1001.ASC.
Additional instructions: See point #3 below.
Splus instructions:
From the statistics menu, bring up the regression dialog. For the model formula enter:
The I() function serves to "protect" the meaning of the expressions inside the parentheses. Some symbols have different meanings in an Splus model formula than they do normally; e.g., X1*X2 means fit the model with X1 + X2 + X1:X2, or the main effects + interaction. In the model formula, "height - 1" would mean to fit the model with height and no intercept; using I(height - 1) would fit the model with an intercept and subtract one from each height value. So if we want to do the transformations on the fly, rather than creating a transformed variable in the dataframe, use I() to be careful. Note that I() is a function, which is not the same as the indicator variable we created for the class example.
Under the Predict Tab, check off the values for Predictions, Confidence Intervals and Standard Errors (standard errors for the means at the observed height values). Specify Case1001 for the New data and the dataframe for saving results. Under the Results tab, check off the box for correlation Matrix of Estimates. Click OK.
Galileo's Data (again). This exercise looks at the effect of adding variables on R-squared and adjusted R-squared.
Splus Instructions:
Save the fitted values and rename them as fit3.
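To see why R-squared and adjusted R-squared can move in opposite directions when a variable is added, the two quantities can be computed from the residual and total sums of squares. This is a Python sketch under stated assumptions: the function names are hypothetical, p counts all regression coefficients including the intercept, and the sums of squares are made up rather than taken from CASE1001.ASC.

```python
def r_squared(ss_res, ss_total):
    """Proportion of total variation explained by the regression."""
    return 1.0 - ss_res / ss_total


def adjusted_r_squared(ss_res, ss_total, n, p):
    """Adjusted R-squared: compares mean squares instead of sums of squares.

    n = number of observations
    p = number of regression coefficients, including the intercept
    """
    return 1.0 - (ss_res / (n - p)) / (ss_total / (n - 1))


# Made-up values for illustration only.
n, ss_total = 7, 100.0
r2_a = r_squared(10.0, ss_total)                     # smaller model, p = 3
adj_a = adjusted_r_squared(10.0, ss_total, n, p=3)
r2_b = r_squared(9.5, ss_total)                      # one extra term, p = 4
adj_b = adjusted_r_squared(9.5, ss_total, n, p=4)

print(r2_a, adj_a)
print(r2_b, adj_b)
```

In this made-up case the extra term lowers the residual sum of squares slightly, so R-squared rises, but the loss of one residual df outweighs the gain and adjusted R-squared falls.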