Master List of Practice Problems

  1. Practice Problems for Prerequisite topics:
  2. One-way ANOVA
  3. Simple Linear Regression, Chapter 7
  4. Assumptions of simple linear regression, anova tables for regression, lack of fit test, logarithmic transformations, examination of residuals, Chapter 8
  5. Multiple linear regression and inference, the use of specially designed explanatory variables, Chapters 9 and 10.
  6. Chapter 11 Practice Problems
    Practice Problems for Prerequisite topics:

    All below are from Moore and McCabe, 3rd Edition. TAs have solutions to even-numbered problems.

    Review: Quiz on Prerequisite Topics from Fall 2002

    Answers: 1(a) 3 +/- (1/sqrt(100))*z-star, where z-star is the quantile that cuts off a probability equal to 0.95 (to the left on the table). Look up in table. (b) False. This is the correct definition of a CI, but the interval in (a) does not guarantee a Type I error rate of 5%. (c) False (d) Ho: mu>3.2 Ha: mu<3.2 (e) P(Z>2) is approx 2.5%. You should not need a table for this. (f) False. 2. True 3. False 4. True
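The table lookups in answers 1(a) and 1(e) can be checked numerically. The following is a minimal sketch in Python (not Splus), assuming only the quantities stated in the answers above: center 3, standard error 1/sqrt(100), and a z-star with probability 0.95 to its left. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

# 1(a): z-star is the quantile with probability 0.95 to its left,
# which leaves 5% in each tail of the two-sided interval.
z_star = std_normal.inv_cdf(0.95)
center = 3.0
std_error = 1 / sqrt(100)
ci_low = center - z_star * std_error
ci_high = center + z_star * std_error

# 1(e): upper-tail probability beyond z = 2.
upper_tail = 1 - std_normal.cdf(2.0)

print(f"z-star = {z_star:.3f}")
print(f"interval = ({ci_low:.3f}, {ci_high:.3f})")
print(f"P(Z > 2) = {upper_tail:.4f}")
```

The exact tail probability comes out slightly below the 2.5% rule of thumb quoted in 1(e), which is why no table is needed for an approximate answer.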

    Review: A Practice Final Exam


    One-way ANOVA


    Simple Linear Regression
    1. Sleuth, Chapter 7, all conceptual exercises.
    2. Computational exercises: #12,13.
    3. Ch 7 Sleuth Exercises #19-22 concern meat processing data. Columns are "time" and "ph". Note that in order to do this problem you need to LOG TRANSFORM the "time" variable. To do this take a look at this Splus help topic. Confirm #21 by hand. Answers here.
    4. Sample problem from past final exam
    5. Spring 2001 midterm (skip problems 9 and 10 for now). Solutions
    6. For additional review, read Chapter 10 of Moore and McCabe. Suggested exercises:
      1. 10.7 (use Splus),
      2. 10.6 (use Splus). Follow-on questions: 10.11, 10.12, 10.13
      3. 10.14, 10.20, 10.21, 10.23 (use calculations on pages 686+).
      TAs will post solutions to even-numbered exercises to the newsgroup if requested.

    Chapter 8 practice problems
    1. All conceptual exercises.
    2. You can now do all problems in the Spring 2001 midterm. Solutions
    3. 8.22, Ecosystem Decay data. Assume that you have done the model exploration for this case and have chosen the model log(species)~log(area).
      1. Provide the fitted regression line.
      2. Give a one sentence interpretation of the slope on the original scale of measurement.
      3. Give a CI for this slope.
      4. Give an estimate and CI for the median number of species as a function of area when area=1.
      5. Although the residuals may not indicate significant lack of fit, you decide to perform a lack-of-fit test to test the claim that the simple linear fit of log(species) on log(area) is inadequate.
    4. Adapted from "Crab claw size and force", #25 Sleuth, page 194, Ch. 7. These data come from: Behrens Yamada, S. and E.G. Boulding. 1998. Claw morphology, prey size selection, and foraging efficiency in generalist and specialist shell-breaking crabs. Journal of Experimental Marine Biology and Ecology 220: 191-211 (available on e-journals). Data in crabclaw.txt. Variables: Mean closing force (Newtons) and height (mm). Results for Hemigrapsus nudus are:


      Call: lm(formula = log(force) ~ log(height))
      Residuals:
           Min      1Q  Median     3Q    Max
       -0.5903 -0.2775  -0.082 0.2517 0.8882

      Coefficients:
                   Value Std. Error t value Pr(>|t|)
      (Intercept) 0.5191     1.1147  0.4657   0.6497
      log(height) 0.4083     0.5426  0.7525   0.4663

      Residual standard error: 0.4825 on 12 degrees of freedom
      Multiple R-Squared: 0.04506
      F-statistic: 0.5662 on 1 and 12 degrees of freedom, the p-value is 0.4663

      1. What is the effect of a tripling of the height for the Hemigrapsus nudus? Give a 95% confidence interval for this multiplicative factor in the median. What are the units of b1 (the slope of the regression line)?
      2. For each crab, closing forces were measured repeatedly, by measuring the force as the claws pulled two wires together. Could this be considered an ecological regression? Why or why not? Are there any other statistics besides the mean that might be appropriate to measure closing force?
      3. Perform a test of whether closing force is an increasing function of height. Give hypotheses, test statistic and conclusion of your test. (Remember, the log transform is an order-preserving transformation. Thus, you can perform a test on the log scale and interpret it on the original scale. Effectively, you are performing a test regarding the multiplicative factor in the median.)
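For parts 1 and 3 of the crab claw problem, the arithmetic can be set up directly from the regression output. This is a sketch in Python rather than Splus, not an official solution: the slope, standard error, degrees of freedom, and two-sided p-value are copied from the output above, while the t quantile is typed in from a t table (in Splus it would be qt(.975, 12)). Variable names are illustrative.

```python
# Values copied from the Splus output for Hemigrapsus nudus.
b1 = 0.4083           # slope for log(height)
se_b1 = 0.5426        # standard error of the slope
resid_df = 12         # residual degrees of freedom
two_sided_p = 0.4663  # reported p-value for the slope

# Assumption: 0.975 quantile of the t distribution on 12 df, from a t table.
t_crit = 2.1788

# Part 1: with log(force) ~ log(height), tripling height multiplies the
# median force by 3**b1. Back-transform the CI endpoints the same way.
slope_low = b1 - t_crit * se_b1
slope_high = b1 + t_crit * se_b1
factor = 3 ** b1
factor_low = 3 ** slope_low
factor_high = 3 ** slope_high

# Part 3: one-sided test of H0: beta1 = 0 against Ha: beta1 > 0.
# Because the t statistic is positive, the one-sided p-value is half
# of the reported two-sided p-value.
t_stat = b1 / se_b1
one_sided_p = two_sided_p / 2

print(f"median multiplies by {factor:.2f}, "
      f"95% CI ({factor_low:.2f}, {factor_high:.2f})")
print(f"t = {t_stat:.4f} on {resid_df} df, one-sided p = {one_sided_p:.4f}")
```

The computed t statistic should reproduce the 0.7525 shown in the coefficient table, which is a quick check that the numbers were copied correctly.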

    Solutions to Ch 8 problems:

    1. Ecosystem Decay
      1. Fitted regression line: Estimated mean of log(species) = 3.60 + 0.18 log(area). On the log scale, a one unit increase in log(area) is associated with a 0.18 unit (additive) increase in the estimated mean of log(species).
      2. A 10-fold increase in the area is associated with an estimated [10^(0.18) = 1.51] 51% increase in the median number of species (or a 1.51-fold increase).
      3. CI for beta1 on log scale is 0.18 +/- qt(.975,16-2)*(0.05). Let's say this interval is (e1,e2). To find the CI for the increase, you need to take the endpoints (e1,e2) and calculate: (10^(e1),10^(e2)) to find the interval.
      4. Lack-of-fit test (part 5 of the problem): The F-statistic for lack of fit is compared to an F distribution on 2 and 12 df. The calculated F-statistic is 0.1053, with p-value .9009. We do not have convincing evidence to reject the hypothesis that the linear regression model is adequate.
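The back-transformation and the lack-of-fit p-value in the Ecosystem Decay solutions can be reproduced with a few lines. This is a minimal sketch in Python rather than Splus, using only numbers stated in the solutions (slope 0.18, standard error 0.05, n = 16, F = 0.1053 on 2 and 12 df). Two things are assumptions on my part: the t quantile is typed in from a t table (the value of qt(.975, 14)), and the helper `f_upper_tail_2_num_df` is a hypothetical name for the closed-form upper tail of an F distribution, which holds only when the numerator has exactly 2 df.

```python
# Values stated in the solutions.
b1 = 0.18      # slope of log(species) on log(area)
se_b1 = 0.05   # standard error of the slope
n = 16         # number of observations

# Assumption: 0.975 quantile of t on n - 2 = 14 df, from a t table.
t_crit = 2.1448

# CI for the slope on the log scale: (e1, e2).
e1 = b1 - t_crit * se_b1
e2 = b1 + t_crit * se_b1

# Effect of a 10-fold increase in area on the median number of species.
fold_change = 10 ** b1
fold_ci = (10 ** e1, 10 ** e2)


def f_upper_tail_2_num_df(f_stat, denom_df):
    """P(F > f_stat) for an F distribution with 2 numerator df.

    Uses the closed form (1 + 2 f / m) ** (-m / 2), valid only when the
    numerator degrees of freedom equal 2.
    """
    return (1 + 2 * f_stat / denom_df) ** (-denom_df / 2)


lack_of_fit_p = f_upper_tail_2_num_df(0.1053, 12)

print(f"fold change = {fold_change:.2f}, "
      f"95% CI ({fold_ci[0]:.2f}, {fold_ci[1]:.2f})")
print(f"lack-of-fit p-value = {lack_of_fit_p:.4f}")
```

The computed p-value should agree with the .9009 reported for the lack-of-fit test. For other numerator degrees of freedom, use the F distribution function in Splus instead of this shortcut.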

    Chapter 9:

    1. All conceptual exercises.
    2. Multiple regression with continuous X variables: Pace of life and heart disease. These data are described on page 260 of Sleuth, problem 14.

      Data in EX0914.ASC. Variables: "bank", "walk", "talk", "heart".

      • The model statement you will enter in Splus is: heart~bank+walk+talk
      • Follow the exercises in the book in addition to the supplement below.
      • Make a scatterplot matrix for part (a). (Splus won't let you put heart on the vertical axis, I don't think.) Put all plots on a separate page.
      • Additional part (e): Give one-sentence interpretations for each regression parameter, using careful language. Holding "bank clerk speed (bank)" and "postal clerk talking speed (talk)" constant, what is the effect of a one-unit (what units are they?) increase in pedestrian walking speed on mean death rate due to heart disease?

    3. Moore and McCabe Exercises 11.1, 11.2, 11.9

    Chapter 10:

    1. If you haven't already, work through all research questions/results for the bat data in Case Study 2 of Chapter 10. Computational exercise #13 is good practice. In 13b, report the confidence intervals for slopes of each of the 3 species and write 1-sentence interpretations of each (use careful language). Under the parallel regression lines model, for *each* of the three species, how would you use the computer centering trick to calculate a prediction interval for the median energy expenditure for a future observation of median body mass=200g? How would you use the computer centering trick to calculate a confidence interval for the median energy expenditure for a median body mass of 200g? How do these intervals differ? How do they change as median body mass is increased to 400g?
    2. All conceptual exercises
    3. Computational exercises: #9, 10, 11
    4. Sample quiz (long) from 2001 course
    5. Added 3/20: Use the analyses for the pollen data handed out in class for this problem. Assuming the parallel lines model is true, is there evidence that, after accounting for the amount of time on the flower, queens tend to remove a smaller proportion of pollen than workers? Perform a hypothesis test, giving test statistic and p-value. Give a confidence interval for the difference in the logit of the proportion of pollen removed.
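The computer centering trick mentioned in item 1 can be illustrated on a simple linear regression. The sketch below is in Python rather than Splus and makes several assumptions: the data are made-up toy values (not the bat data), the function `fit_line` is a hypothetical helper, and the model has a single explanatory variable rather than the parallel lines model. The idea carries over: refit with x replaced by (x - x0), and the intercept and its standard error become the estimated mean at x0 and the standard error of that estimate.

```python
from math import sqrt


def fit_line(x, y):
    """Least-squares fit of y on x.

    Returns (intercept, slope, SE of intercept, residual SD).
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    sigma = sqrt(rss / (n - 2))
    se_intercept = sigma * sqrt(1 / n + x_bar ** 2 / sxx)
    return intercept, slope, se_intercept, sigma


# Made-up illustrative values, NOT the bat data.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.1, 1.9, 3.2, 3.8, 5.1, 5.9]
x0 = 4.5  # value of x at which an interval is wanted

# Original fit, then the same fit with x centered at x0.
a, b, se_a, s = fit_line(x, y)
a_c, b_c, se_mean_x0, s_c = fit_line([xi - x0 for xi in x], y)

# After centering, the intercept is the fitted mean at x0 and its SE is
# the SE of the estimated mean there (used for a confidence interval).
# A prediction interval adds the residual variance for one new observation.
se_pred_x0 = sqrt(s_c ** 2 + se_mean_x0 ** 2)

print(f"fitted mean at x0: {a_c:.3f} (same as a + b*x0 = {a + b * x0:.3f})")
print(f"SE for confidence interval: {se_mean_x0:.3f}")
print(f"SE for prediction interval: {se_pred_x0:.3f}")
```

In the bat problem the variables are on the log scale, so x0 would be log(200), and the interval endpoints would be back-transformed to give statements about the median. Rerunning with a different x0 shows how both standard errors change as x0 moves away from the mean of x.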

    Chapter 11
    1. All conceptual exercises.
    2. Make sure you can go through ALL steps of the lab using Splus and answer related questions. If you have any questions, please plan to set up an appt. with Shannon or Sandra for the week of March 24th.