1) The first few questions refer to the lioness example we talked about in class.
Generate an artificial dataset with 20 rows:
a) Generate "age" as round((sum of two uniform(0,1) draws) * 15/2). Generate "agesq" = "age"*"age". Calculate "range" = -2*"agesq" + 30*"age" + 15*(standard normal draw), so that sigma = 15. Delete any rows with a negative range.
b) Fit a simple linear model predicting range from age. Examine the residuals of this model.
c) Fit the quadratic model. Check the residuals of this model.
d) Report what you have learned.
Repeat the above with 100 rows and with 1000 rows. Repeat with sigma = 30.
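For anyone who wants to check their JMP results against a script, here is a minimal Python sketch of steps a) through c). It assumes the uniforms are uniform(0,1) and the normal is standard normal. The names `ols`, `simulate`, `compare` and `home_range` are made up for illustration. It uses only the standard library, so the least-squares fit is a small hand-written solver rather than a statistics package.

```python
import random

def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    p = len(X[0])
    # Build the augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(p)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        piv = max(range(c, p), key=lambda k: abs(A[k][c]))
        A[c], A[piv] = A[piv], A[c]
        for k in range(p):
            if k != c:
                f = A[k][c] / A[c][c]
                A[k] = [a - f * b for a, b in zip(A[k], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]

def residuals(X, y, b):
    """Observed minus fitted values."""
    return [yi - sum(bj * xj for bj, xj in zip(b, row)) for row, yi in zip(X, y)]

def simulate(n, sigma, seed=0):
    """Rows of (age, range) following step a); negative ranges are dropped."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        age = round((rng.random() + rng.random()) * 15 / 2)
        home_range = -2 * age * age + 30 * age + rng.gauss(0, sigma)
        if home_range >= 0:
            rows.append((age, home_range))
    return rows

def compare(n, sigma, seed=0):
    """Fit the linear and quadratic models; return rows kept, quadratic
    coefficients, and the residual sum of squares of each model."""
    rows = simulate(n, sigma, seed)
    y = [r for _, r in rows]
    X_lin = [[1.0, a] for a, _ in rows]
    X_quad = [[1.0, a, a * a] for a, _ in rows]
    b_lin, b_quad = ols(X_lin, y), ols(X_quad, y)
    rss_lin = sum(e * e for e in residuals(X_lin, y, b_lin))
    rss_quad = sum(e * e for e in residuals(X_quad, y, b_quad))
    return len(rows), b_quad, rss_lin, rss_quad

for n in (20, 100, 1000):
    for sigma in (15, 30):
        kept, b_quad, rss_lin, rss_quad = compare(n, sigma)
        print(n, sigma, kept, [round(b, 2) for b in b_quad],
              round(rss_lin), round(rss_quad))
```

The residual sums of squares are only a summary. The point of steps b) and c) is still to look at the residual plots, where the curvature missed by the linear model should be visible.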
2) Generate your own data with two continuous predictors and one categorical factor with at least 3 levels, and with some kind of interaction. You get to pick the parameters, and you get to label the variables with some names that are meaningful to you. You also get to choose the error variance. Then you get to experiment with leaving terms out of the model, adding bogus interactions, and including bogus predictors. Report what you learned.
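One possible setup for this question is sketched below in Python. Every name and parameter value in it (`feed`, `temp`, `diet`, `junk`, `weight`, the slopes and offsets, the error standard deviation of 2) is invented for illustration, and you should pick your own. The interaction is a slope on `feed` that differs by `diet`. The script compares the residual sum of squares of the full model, a model without the interaction, and a model with a bogus predictor added.

```python
import random

def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    p = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) for j in range(p)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(p)]
    for c in range(p):  # Gauss-Jordan elimination with partial pivoting
        piv = max(range(c, p), key=lambda k: abs(A[k][c]))
        A[c], A[piv] = A[piv], A[c]
        for k in range(p):
            if k != c:
                f = A[k][c] / A[c][c]
                A[k] = [a - f * b for a, b in zip(A[k], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]

def rss(X, y):
    """Residual sum of squares of the least-squares fit of y on X."""
    b = ols(X, y)
    return sum((yi - sum(bj * xj for bj, xj in zip(b, row))) ** 2
               for row, yi in zip(X, y))

rng = random.Random(1)
# Hypothetical parameters: the slope on feed depends on diet (the interaction).
slope_by_diet = {"grass": 1.0, "grain": 2.0, "mixed": 3.0}
offset_by_diet = {"grass": 0.0, "grain": 5.0, "mixed": -5.0}
data = []
for _ in range(300):
    feed = rng.uniform(0, 10)
    temp = rng.uniform(10, 30)
    diet = rng.choice(list(slope_by_diet))
    junk = rng.gauss(0, 1)  # bogus predictor, unrelated to weight
    weight = (20 + offset_by_diet[diet] + slope_by_diet[diet] * feed
              - 0.5 * temp + rng.gauss(0, 2))
    data.append((feed, temp, diet, junk, weight))

def design(interaction=True, bogus=False):
    """Design matrix with dummy coding; "grass" is the baseline level."""
    X = []
    for feed, temp, diet, junk, _ in data:
        d1, d2 = float(diet == "grain"), float(diet == "mixed")
        row = [1.0, feed, temp, d1, d2]
        if interaction:
            row += [d1 * feed, d2 * feed]
        if bogus:
            row.append(junk)
        X.append(row)
    return X

y = [row[-1] for row in data]
b_full = ols(design(), y)
rss_full = rss(design(), y)
rss_main = rss(design(interaction=False), y)
rss_bogus = rss(design(bogus=True), y)
print("full:", round(rss_full), "no interaction:", round(rss_main),
      "with bogus predictor:", round(rss_bogus))
```

Adding a predictor can never increase the residual sum of squares, so the bogus predictor will always appear to help a little. That is why the individual tests and adjusted measures matter when you write up what you learned.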
3) An exercise exploring multicollinearity. Generate x1 however you please. Generate x2 as a linear function of x1 with a small amount of noise. Generate a y value as a linear function of x1 and x2 along with some noise. Predict y with x1 and x2 individually and then together. Look at plots of x1 versus x2. Look at the Variance Inflation Factors. Look at the whole model tests and the individual variable tests. Make the relationship between x1 and x2 weaker (that is, use more noise when generating x2 from x1) and repeat the analysis. What are your conclusions?
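As a cross-check on the Variance Inflation Factors, recall that the VIF of predictor j is 1/(1 - R_j^2), where R_j^2 comes from regressing predictor j on the other predictors. With only two predictors, R_j^2 is the squared correlation between x1 and x2. The Python sketch below uses that shortcut. The slope of 2, the noise levels, and the function names are arbitrary choices for illustration.

```python
import random

def corr(u, v):
    """Sample Pearson correlation of two equal-length lists."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / (suu * svv) ** 0.5

def vif_two_predictors(noise_sd, n=500, seed=2):
    """VIF shared by x1 and x2 when x2 = 2*x1 + noise (hypothetical setup)."""
    rng = random.Random(seed)
    x1 = [rng.gauss(0, 1) for _ in range(n)]
    x2 = [2 * a + rng.gauss(0, noise_sd) for a in x1]
    r = corr(x1, x2)
    return 1 / (1 - r * r)

vif_strong = vif_two_predictors(noise_sd=0.1)  # x2 is nearly a copy of x1
vif_weak = vif_two_predictors(noise_sd=2.0)    # much weaker relationship
print(round(vif_strong, 1), round(vif_weak, 2))
```

The VIF is always at least 1 and grows without bound as the correlation between x1 and x2 approaches 1 in absolute value.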
Also, using the stockavg.jmp data set included in the Data folder with JMP, try to predict the closing Dow Jones Industrial Average using a day index (1, 2, 3, ...). Save the residuals and create three or four lagged copies of them (lag 1, lag 2, and so on). Plot the residuals (lagged and unlagged) against each other, and compute their sample correlation matrix. Observe that autocorrelation is present. Now add the closing DJIA lagged by one day as a predictor. I know that the day index is not significant, but for now leave it in. What has happened to the autocorrelation?
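The lagging step can be sketched in Python as follows. This does not read stockavg.jmp. As a stand-in it simulates a random walk with drift, which is an assumption made only so that the script runs on its own, and the names `close`, `day`, `lag_corr` and so on are hypothetical. It fits the trend model, computes the correlation of the residuals with their own lags, then adds the lag-one series value as a predictor and repeats.

```python
import random

def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    p = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) for j in range(p)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(p)]
    for c in range(p):  # Gauss-Jordan elimination with partial pivoting
        piv = max(range(c, p), key=lambda k: abs(A[k][c]))
        A[c], A[piv] = A[piv], A[c]
        for k in range(p):
            if k != c:
                f = A[k][c] / A[c][c]
                A[k] = [a - f * b for a, b in zip(A[k], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]

def fit_residuals(X, y):
    """Residuals of the least-squares fit of y on X."""
    b = ols(X, y)
    return [yi - sum(bj * xj for bj, xj in zip(b, row)) for row, yi in zip(X, y)]

def corr(u, v):
    """Sample Pearson correlation of two equal-length lists."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / (suu * svv) ** 0.5

def lag_corr(e, k):
    """Correlation between a series and itself lagged by k steps."""
    return corr(e[k:], e[:-k])

rng = random.Random(3)
# Stand-in series: a random walk with drift, NOT the stockavg.jmp data.
close = [100.0]
for _ in range(299):
    close.append(close[-1] + 0.1 + rng.gauss(0, 1))
day = list(range(1, len(close) + 1))

# Model 1: closing value on day index only.
res1 = fit_residuals([[1.0, d] for d in day], close)
# Model 2: day index plus the closing value lagged by one day.
X2 = [[1.0, day[t], close[t - 1]] for t in range(1, len(close))]
res2 = fit_residuals(X2, close[1:])

acf1 = [lag_corr(res1, k) for k in (1, 2, 3)]
acf2 = [lag_corr(res2, k) for k in (1, 2, 3)]
print([round(r, 2) for r in acf1])
print([round(r, 2) for r in acf2])
```

Note that the first row is lost when the lagged predictor is added, because the first day has no previous closing value.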