STA242/ENV255 Assignment 4

Due Friday, February 27 at Noon to S. McBride's mailbox in LSRC Faculty Mailroom. Late homework will not be accepted.

You may consult with colleagues as you prepare your homework, but what you turn in must be your own. This includes computations, Splus output, graphs, answers to questions, and discussion of results.
Assignment: Factors Affecting Extinction. 150 points. This assignment has 2 parts.
  1. Writeup of data analysis 80%
  2. Supplemental questions 20%

Written Assignment, 3 pages maximum:

  1. Research questions: After accounting for the number of nesting pairs, does size or migratory status have an effect on the average extinction time? Does the effect of size differ depending on the number of nesting pairs?
  2. Overview of steps to take in model selection and analysis (see also Display 9.9):
    1. Fit the largest possible model and look at residual plots to determine whether time and/or pairs should be transformed. The largest possible model is: time~pairs+size+status+size:status+pairs:size+pairs:status+pairs:size:status, where status and size are indicator variables. If this were a take-home midterm, you would spend time exploring transformations of both "time" and "pairs" to determine the best transformation.
      For this problem, consider only transforming "time", and consider log(time), sqrt(time), and 1/time. After looking at residual plots, proceed with the reciprocal transformation of time and do not transform the number of nesting pairs for this assignment.
    2. Do the relationships between the transformed time variable and pairs, for all four combinations of size and migratory status, appear linear? Create separate scatterplots of the transformed time variable vs. pairs for the four combinations and make an informal assessment before proceeding.
    3. You will work with the model in step 1 and perform backward elimination to arrive at a suitable model for the data, creating residual plots and performing hypothesis tests at each stage below to examine the fit. For your tests below, use alpha=0.05.
      1. Examine whether the 3-factor interaction belongs in the model. Often, when these high-level interactions are difficult to interpret, they are not included.
      2. Examine whether the set of 2-factor interactions belongs in the model; that is, is there evidence of unequal slopes in the four combinations of size and migratory status? Use an extra-sum-of-squares (ESS) F-test to answer this question. Fit two models: 1/time~pairs+size+status+size:status+pairs:size+pairs:status and 1/time~pairs+size+status. Save these "Model Objects" as ModelA and ModelB. Compare the models by doing an ESS F-test.
      3. Now determine whether there is evidence of unequal intercepts in the four groups, by running an ESS F-test on the models 1/time~pairs+size+status and 1/time~pairs.
      4. If you keep both the "size" and "status" terms, confirm that each is significant in the presence of the other. (Your result from testing them jointly could actually reflect the significance of just one.)
    4. Once you have arrived at your final model, summarize regression results in the format at the bottom of page 185. You'll put these in a table on page 2 of your writeup. You'll need separate equations for each of the four combinations of migratory status and size. (You will need to recode indicator variables more than once to do this. The Splus menu item "Data" - "Recode" will be helpful.)
    5. Answer the research questions by interpreting model coefficients from the models fit above. Provide confidence intervals for coefficients used to answer your research questions.
    6. Produce 2 coded scatterplots with the final model superimposed. The first will give the fitted model for resident birds, the second for migratory birds.
    7. In the model selection steps above, keep track of any unusual observations. Which ones are they? Does transformation deal with them? Are your final model results sensitive to the inclusion of particular observations?
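The extra-sum-of-squares F-tests in step 3 compare a full and a reduced model through their residual sums of squares. The following is a minimal Python sketch of that arithmetic only, assuming you have already read the residual sum of squares and residual degrees of freedom from each fitted model. The function name and the numbers in the usage lines are hypothetical placeholders, not results from the extinction data.

```python
def ess_f_statistic(rss_reduced, df_reduced, rss_full, df_full):
    """Extra-sum-of-squares F statistic for two nested linear models.

    rss_*: residual sum of squares of each model.
    df_*:  residual degrees of freedom of each model.
    The reduced model must be nested in the full model.
    """
    extra_df = df_reduced - df_full
    if extra_df <= 0 or df_full <= 0:
        raise ValueError("reduced model must have more residual df than the full model")
    extra_ss = rss_reduced - rss_full
    # Numerator: extra sum of squares per extra parameter.
    # Denominator: estimate of sigma^2 from the full model.
    f_stat = (extra_ss / extra_df) / (rss_full / df_full)
    return f_stat, extra_df, df_full


# Hypothetical inputs, for illustration only.
f_stat, num_df, den_df = ess_f_statistic(
    rss_reduced=12.0, df_reduced=20, rss_full=8.0, df_full=16
)
print(f_stat, num_df, den_df)
```

The statistic is then referred to an F distribution with (num_df, den_df) degrees of freedom to obtain the p-value that is compared with alpha=0.05.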
    Supporting figures, tables, equations, Page 3: Refer to these in your writeup.
    1. Provide a Table of Summary Statistics that you will briefly discuss in your writeup.
    2. Summarize regression results in the format at the bottom of page 185. You'll need separate equations for each of the four combinations of migratory status and size.
    3. Provide coded scatterplots with regression lines superimposed as stated above.
    4. Provide plots of residuals vs. fitted values and a normal probability plot with caption for the final model you choose.
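The separate equations for the four size-by-status groups follow from the indicator coding. As a rough sketch, assume the final model happens to be the parallel-lines model 1/time~pairs+size+status with size and status coded 0/1 (your own final model may differ). The function name and coefficient values below are hypothetical and are not fitted values.

```python
def group_equations(b0, b_pairs, b_size, b_status):
    """Return {(size, status): (intercept, slope)} for the model
    1/time = b0 + b_pairs*pairs + b_size*size + b_status*status,
    where size and status are 0/1 indicator variables."""
    equations = {}
    for size in (0, 1):
        for status in (0, 1):
            # Each indicator that equals 1 shifts the intercept by its coefficient.
            intercept = b0 + b_size * size + b_status * status
            equations[(size, status)] = (intercept, b_pairs)
    return equations


# Hypothetical coefficients, for illustration only.
eqs = group_equations(b0=0.5, b_pairs=-0.25, b_size=0.125, b_status=0.25)
for (size, status), (intercept, slope) in sorted(eqs.items()):
    print(f"size={size}, status={status}: 1/time = {intercept} + ({slope})*pairs")
```

This gives the point estimates only. The standard error of each group's intercept is most easily obtained by refitting with the indicators recoded so that the group of interest is the reference level, which is why the assignment asks you to recode more than once.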


    Note a mistake in the Sleuth dataset as printed on p. 301: the data presented in Display 10.22 have had 1.0 added to years. There are 7 species for which the average extinction time was 0, meaning that these populations survived without extinction for the entire census. These are called "censored" data: there is an extinction time, and we know it is greater than the greatest value in the dataset, but we do not know how much greater. Thus, Sleuth's choice to record the extinction time as "1" year for these species makes no sense. The dataset link provided above omits these observations. You do not need to discuss them in your writeup. We will discuss how to handle them later.


    Supplemental Exercises: Skim through pages 757-773 of the Pimm et al. paper. The full text of the article is available via E-journals at Duke. The citation is given on page 300. Pimm et al.'s analysis is different from the one you did.
    1. At the bottom of p. 766 and in the "Technical Comments" on pages 767-8, Pimm et al. describe the Y variable used in mathematical terms and in terms of the problem. What was used as the response (Y) variable? Give one of the justifications for its use.
    2. Again under the "Technical Comments" on pages 767-8, why might our analysis be biased by excluding those species for which a time to extinction was not observed?
    3. Read the "Effects Related to Body Size" section on p. 771, particularly the paragraph beginning "The regression analysis also illustrated that...." Use the coded scatterplots you made for migratory and resident birds to examine the predicted time to extinction for 7 mating pairs. Do their comments on "7 pairs" apply to your results? Could you adapt this type of statement for your results? If yes, do so. If no, explain why.
    4. Do Pimm et al.'s results in "Effects of Migratory Status" on page 771 and in Figure 4 mirror your results? Why or why not?
    5. Use your model to give a confidence interval for the mean extinction rate (that is, a CI on the "1/time" scale) for a Wood Pigeon, using the value of average nesting pairs, size and status from Table 10.23 on page 301 of Sleuth.
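For exercise 5, a confidence interval on the "1/time" scale has the usual form estimate plus or minus a t multiplier times the standard error. If both endpoints are positive, taking reciprocals and swapping them gives an interval on the time scale. The sketch below assumes you supply the estimated mean of 1/time, its standard error, and the t multiplier from your own fit. The function name and the numbers are hypothetical.

```python
def ci_rate_and_time(estimate, std_error, t_multiplier):
    """Return ((rate_lo, rate_hi), (time_lo, time_hi)).

    The first pair is the interval on the 1/time scale; the second is
    obtained by inverting the endpoints, which reverses their order.
    """
    rate_lo = estimate - t_multiplier * std_error
    rate_hi = estimate + t_multiplier * std_error
    if rate_lo <= 0:
        # The reciprocal is not a usable interval when the rate CI reaches zero.
        raise ValueError("lower rate endpoint must be positive to invert")
    return (rate_lo, rate_hi), (1.0 / rate_hi, 1.0 / rate_lo)


# Hypothetical inputs, for illustration only.
rate_ci, time_ci = ci_rate_and_time(estimate=0.5, std_error=0.125, t_multiplier=2.0)
print("1/time scale:", rate_ci)
print("time scale:  ", time_ci)
```

Note that the back-transformed interval is not a confidence interval for the mean extinction time, because the reciprocal of a mean rate is not the mean time. It is better read as an interval for a typical extinction time.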

    General format hints:
    Last modified: Mon Feb 23 17:31:39 EST 2004