STA216 Midterm (Due Wednesday, November 17, 1999)

You may use your notes and any books, but please work on it by yourself.

Also, please show your work. This will make it easier to grade the exams.

  1. a) Write out the formula for a 1-parameter exponential family.

     b) Write the binomial distribution in the 1-parameter exponential family form.
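
For reference, one common way of writing these two forms is sketched below. The function names a, b, c, d are an assumed convention, not taken from this handout; the course text may use a different but equivalent parameterization.

```latex
% One-parameter exponential family (assumed notation: a, b, c, d)
f(y;\theta) = \exp\{\, a(y)\, b(\theta) + c(\theta) + d(y) \,\}

% Binomial(n, \pi) with n known, rewritten in that form
f(y;\pi) = \binom{n}{y} \pi^{y} (1-\pi)^{n-y}
         = \exp\left\{ y \log\frac{\pi}{1-\pi} + n \log(1-\pi) + \log\binom{n}{y} \right\}

% so that a(y) = y, \quad b(\pi) = \log\frac{\pi}{1-\pi}, \quad
% c(\pi) = n \log(1-\pi), \quad d(y) = \log\binom{n}{y}
```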
     

  2. The following table comes from a clinical trial that asked, "Does the anti-AIDS drug AZT have any benefit for patients infected with HIV who have not developed symptoms and whose CD4+ cells have not yet dropped?" (AZT is an established drug for treating AIDS patients, and an important measure of immune function among AIDS patients is the number of T-helper (CD4+) cells.) The investigators randomized a group of patients to either AZT or placebo (an inert ineffective compound) and followed these subjects for 3 years. The table shows the numbers of patients progressing to symptomatic AIDS and/or substantial drops in CD4+ counts within 3 years from randomization.
   

               Progression   No Progression   Total
    AZT                 76              399     475
    Placebo            129              332     461

Fit a generalized linear model to the data by hand. In particular,
    1. Set up the auxiliary linear model and the appropriate weights. Please show all intermediate values. (You might want to consult page 200 of the text, in particular.)
    2. Carry out 1 iteration of the Fisher scoring algorithm (iterative weighted least squares) by hand. Show the initial values of the model parameters. (Please do not run a glm program and set it for one iteration. You may check your answer on the computer, but I would like you to set up the appropriate vectors, etc., and carry out the calculations by yourself.)
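
For checking hand calculations, the sketch below shows one way a single Fisher scoring step could be laid out for the table above. It rests on assumptions that the question does not state: a binomial model with logit link, a design with an intercept and an AZT indicator, and starting values of zero for both parameters. The function names are hypothetical, and the course text may recommend different starting values.

```python
import math


def solve_2x2(a, b):
    """Solve the 2x2 linear system a @ x = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [
        (b[0] * a[1][1] - a[0][1] * b[1]) / det,
        (a[0][0] * b[1] - a[1][0] * b[0]) / det,
    ]


def fisher_scoring_step(X, y, n, beta):
    """One IWLS update for grouped binomial data with a logit link.

    X: design rows (two columns), y: event counts, n: group sizes,
    beta: current parameter values. Returns the updated parameters.
    """
    xtwx = [[0.0, 0.0], [0.0, 0.0]]
    xtwz = [0.0, 0.0]
    for xi, yi, ni in zip(X, y, n):
        eta = sum(b * x for b, x in zip(beta, xi))   # linear predictor
        pi = 1.0 / (1.0 + math.exp(-eta))            # fitted probability
        w = ni * pi * (1.0 - pi)                     # IWLS weight
        z = eta + (yi - ni * pi) / w                 # working response
        for j in range(2):
            xtwz[j] += xi[j] * w * z
            for k in range(2):
                xtwx[j][k] += xi[j] * w * xi[k]
    # Weighted least squares: solve (X'WX) beta = X'Wz
    return solve_2x2(xtwx, xtwz)


# Counts from the table: rows are AZT, then placebo.
X = [[1.0, 1.0], [1.0, 0.0]]   # columns: intercept, AZT indicator (assumed coding)
y = [76, 129]                  # patients with progression
n = [475, 461]                 # patients randomized
beta_start = [0.0, 0.0]        # assumed starting values
beta_one = fisher_scoring_step(X, y, n, beta_start)
print(beta_one)
```

Under these assumptions every starting fitted probability is 0.5, so the weights are n/4 (118.75 and 115.25) and the working responses are 4y/n - 2 (-1.36 for AZT and about -0.881 for placebo). Because two parameters fit two groups exactly, the first update gives an intercept of about -0.881 and an AZT coefficient of about -0.479.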
  3. Etoposide (also called VP-16) is a commonly used drug for treating cancer. It is related to the active compound in extracts from the mandrake or May apple plant, long used as a source of folk medicine. Etoposide is available in oral form (capsules) or for intravenous administration. A group of investigators carried out a randomized trial comparing oral etoposide to the more usual intravenously administered formulation as treatment for patients with small cell lung cancer. All patients also received the drug cisplatin. This particular data set evaluates toxicity among the patients who received oral etoposide for 21 days, followed by 7 days without drug. The primary study question: Is there an association between the amount of etoposide in the body and toxicity? For our purposes, we will consider toxicity as each patient’s lowest measured white blood cell count (WBC) during the first few weeks on the study. We also want to look at several covariates to see if they are also associated with toxicity and/or etoposide concentrations. Finally, etoposide binds to protein in the blood, so that only the free or unbound etoposide is available to kill or damage cells.
The data set Question3.data has the following 10 variables:
  1. patient (patient number)
  2. prewbc (WBC prior to starting treatment)
  3. nadirwbc (the lowest recorded WBC during the first 4 weeks of therapy)
  4. avg.conc (average trough concentration [in micrograms per milliliter] of etoposide during the first 4 weeks of therapy)
  5. free.etoposide (average trough concentration of unbound etoposide [in micrograms per milliliter] during the first 4 weeks of therapy)
  6. age (the patient’s age in years)
  7. bsa (the patient’s body-surface area. BSA is a measure of body size that is used for dosing most anticancer drugs.)
  8. albumin (the measured albumin in the patient’s blood prior to starting therapy. Albumin is a measure of circulating protein.)
  9. alkphos (the amount of alkaline phosphatase [Units/deciliter] prior to starting therapy. Alkaline phosphatase levels are associated with disease. For example, liver metastases were associated with higher levels of alkphos in this study.)
  10. marrowinvolved (a 0-1 indicator of whether or not the patient’s disease has spread to involve bone marrow)
Fit a model to the data ("Question3.data") to look at the association between the covariates and the nadir WBC. You might want to consider transforming some of the variables when carrying out your analysis. What do you think is the best fit to the data? Please try to summarize your results as if you were presenting a report to a client.
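
As an illustration of what a transformed fit could look like, the sketch below regresses the log of a response on a single covariate by ordinary least squares. The log transform, the single-covariate model, and the function name fit_log_linear are all assumptions made for illustration; nothing here is meant as the recommended analysis, and no file-loading code is shown because the layout of Question3.data is not described.

```python
import math


def fit_log_linear(x, y):
    """Least-squares fit of log(y) = b0 + b1 * x.

    x: covariate values, y: strictly positive responses.
    Returns the tuple (b0, b1).
    """
    log_y = [math.log(v) for v in y]
    count = len(x)
    x_bar = sum(x) / count
    log_y_bar = sum(log_y) / count
    # Centered sums of squares and cross-products
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (li - log_y_bar) for xi, li in zip(x, log_y))
    b1 = sxy / sxx
    b0 = log_y_bar - b1 * x_bar
    return b0, b1
```

With the columns read into lists, a call such as fit_log_linear(free_etoposide, nadirwbc) would give the intercept and slope on the log scale; the slope is then read as the approximate proportional change in nadir WBC per unit of the covariate. A full analysis would still need the other covariates and some residual checks.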