Statistics 110E -- Statistical and Data Analysis-Psychology/Biological Sciences

Statistics 110 Lab 10

  1. Quiz
  2. Regression and One-way Anova using JMP
Note: there is no work to be turned in this week, but you should be able to carry out regression and one-way anova, as well as interpret output from them (you may see this on the final :-)

Regression and One-way Anova

Soy-bean Example

  1. Save the soy-bean data to your directory.

  2. Start up JMP.
  3. Use the File menu to Import the text data file. This file has JMP header information that contains the column names, so you will need to select JMP header. Also specify that columns are separated by tabs and spaces (select all).
  4. Use JMP to reproduce the regression output on the handout from class: in the Analyze menu, select Fit Y by X, then click on the "Fitting" arrow and select "Fit line". To create a residual plot, click on the arrow by "Linear Fit" and select "Plot Residuals". Is there a significant SO2 effect? Do the residuals indicate any problems? If you have any questions about the output or its interpretation, ask the TA. You should be able to interpret p-values from either a t-ratio or an F-ratio, create confidence intervals, and determine whether the residuals indicate any problems with the model.
  5. Repeat the One-way Anova analysis discussed in class. You will need to change the variable type of the SO2 concentration to a categorical (ordinal) type. Click on the square box with a "C" and change it to "O" for Ordinal. Now when you fit the model, the output will be based on the One-Way Anova, instead of the linear regression.
  6. The initial output shows the data points, the overall mean (all observations) and the group means. Click on the "Analysis" arrow, and select "Means, ANOVA/t-test" to get the ANOVA table and group means with Standard errors. If there are significant differences in the means, then you can explore where the differences exist. Is there a significant SO2 effect? Which SO2 concentrations lead to significant effects on yield?
  7. Which model seems more appropriate? Can we conclude that SO2 causes a decrease in soy-bean yields for plants grown under similar conditions?
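
For anyone who wants to check the arithmetic behind the JMP output in steps 4-6, here is a rough Python sketch of the same two analyses. The numbers are made up for illustration and are NOT the soy-bean data from class; the names so2, yields, regression, and one_way_anova are hypothetical helpers, not anything from JMP. It computes the slope, its standard error, the t-ratio and F-ratio for the line fit, and the F-ratio for the one-way Anova that treats each SO2 level as a separate group.

```python
from math import sqrt

# Made-up illustration values; NOT the soy-bean data from class.
so2 = [0.0, 0.0, 0.0, 0.1, 0.1, 0.1, 0.2, 0.2, 0.2]      # SO2 concentration
yields = [4.1, 4.3, 4.2, 3.9, 3.8, 4.0, 3.4, 3.6, 3.5]   # yield per plot


def mean(values):
    return sum(values) / len(values)


def regression(x, y):
    """Least-squares line y = intercept + slope * x with basic summaries."""
    n = len(x)
    xbar, ybar = mean(x), mean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    sse = sum(r * r for r in residuals)      # error sum of squares
    mse = sse / (n - 2)                      # error mean square, n - 2 df
    se_slope = sqrt(mse / sxx)
    return {
        "slope": slope,
        "intercept": intercept,
        "se_slope": se_slope,
        "t_ratio": slope / se_slope,         # tests slope = 0, n - 2 df
        "f_ratio": (slope * sxy) / mse,      # model MS / error MS, (1, n - 2) df
        "residuals": residuals,
    }


def one_way_anova(groups):
    """F-ratio for equal group means; groups maps a level to its observations."""
    all_obs = [v for g in groups.values() for v in g]
    grand = mean(all_obs)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    df_between = len(groups) - 1
    df_within = len(all_obs) - len(groups)
    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "df": (df_between, df_within),
        "f_ratio": (ss_between / df_between) / (ss_within / df_within),
    }


# Treat each SO2 level as a category, as in step 5.
groups = {}
for level, obs in zip(so2, yields):
    groups.setdefault(level, []).append(obs)

fit = regression(so2, yields)
aov = one_way_anova(groups)

print("slope:", fit["slope"], "se:", fit["se_slope"])
print("regression t-ratio:", fit["t_ratio"], "F-ratio:", fit["f_ratio"])
print("Anova F-ratio:", aov["f_ratio"], "df:", aov["df"])
```

Note that in a regression with one explanatory variable the F-ratio is the square of the slope's t-ratio, so the two p-values agree. A 95% confidence interval for the slope is slope plus or minus t* times se_slope, where t* comes from a t table with n - 2 degrees of freedom.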

Residual Plots

The following data set illustrates how residuals and scatter plots are important in regression, and how using just the numerical summaries can lead to trouble.

Save the data set to a file, and then open it in JMP (use Import again, but this time with "data only"). This approach is useful for importing data from Excel or other spreadsheets.

Use the first column as the explanatory variable, X. Fit 3 regressions, using each of the remaining 3 columns in turn as the Y variable. Create the residual plot for each regression. Verify that the estimates and summary measures are similar for all 3 regressions. In which cases is linear regression appropriate, and in which is it not? Discuss in lab.
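
The same exercise can be sketched outside JMP. The lab's data file is not reproduced here, so as a stand-in the Python sketch below uses the first three data sets of Anscombe's quartet, a well-known example with the same layout (one shared X column and several Y columns). Whether the lab file contains these same values is an assumption, and fit_line is a hypothetical helper name.

```python
# Stand-in data: first three data sets of Anscombe's quartet (shared x).
# That the lab's file matches these values is an assumption.
x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
ys = {
    "Y1": [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68],
    "Y2": [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74],
    "Y3": [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73],
}


def fit_line(x, y):
    """Least-squares fit; returns slope, intercept, R^2, and the residuals."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    syy = sum((yi - ybar) ** 2 for yi in y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    r2 = sxy ** 2 / (sxx * syy)
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    return slope, intercept, r2, residuals


fits = {name: fit_line(x, y) for name, y in ys.items()}

for name, (slope, intercept, r2, residuals) in fits.items():
    print(f"{name}: slope={slope:.3f} intercept={intercept:.3f} R^2={r2:.3f}")
    # Residuals listed in order of x, a text version of a residual plot.
    for xi, ri in sorted(zip(x, residuals)):
        print(f"   x={xi:2d}  residual={ri:+.2f}")
```

The summary line is nearly the same for all three fits, so any difference between the data sets has to be read from the pattern of the residuals against x (a curve, or a single point far from the rest), which is what the residual plots in JMP show.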