Homework #2: Due Wednesday, September 29, 1999
  1. Consider the linear model given by . Suppose we have n observations and only one covariate per subject (i.e., row i of the design matrix X is (1, x_i)). Further, let  and  be considered "known." Show that we can estimate the posterior distribution of b as the least-squares solution to the following augmented weighted regression problem.
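The idea behind the augmented regression can be checked numerically. The sketch below is only an illustration under stated assumptions, since the model and prior equations are not reproduced here: it assumes y_i = b0 + b1*x_i + e_i with e_i ~ N(0, sigma2), sigma2 known, and an independent normal prior on b with known mean and variances. The function names and the toy numbers are hypothetical. Under those assumptions, appending one pseudo-observation per coefficient (an identity row, the prior mean as its response, the prior precision as its weight) and running weighted least squares should reproduce the usual conjugate posterior mean.

```python
def solve_2x2(a, b):
    """Solve the 2x2 linear system a z = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [
        (b[0] * a[1][1] - a[0][1] * b[1]) / det,
        (a[0][0] * b[1] - a[1][0] * b[0]) / det,
    ]


def weighted_least_squares(rows, responses, weights):
    """Minimise sum_i w_i * (r_i - row_i . b)^2 for a two-column design."""
    xtwx = [[0.0, 0.0], [0.0, 0.0]]
    xtwr = [0.0, 0.0]
    for row, r, w in zip(rows, responses, weights):
        for j in range(2):
            xtwr[j] += w * row[j] * r
            for k in range(2):
                xtwx[j][k] += w * row[j] * row[k]
    return solve_2x2(xtwx, xtwr)


def posterior_mean_augmented(x, y, sigma2, prior_mean, prior_var):
    """Posterior mean of b via the augmented weighted regression.

    Assumption: the prior is b_j ~ N(prior_mean[j], prior_var[j]), independent.
    """
    # n data rows (1, x_i) with weight 1/sigma2, then one pseudo-row per
    # coefficient carrying the prior mean with weight 1/prior_var[j].
    rows = [[1.0, xi] for xi in x] + [[1.0, 0.0], [0.0, 1.0]]
    responses = list(y) + list(prior_mean)
    weights = [1.0 / sigma2] * len(x) + [1.0 / v for v in prior_var]
    return weighted_least_squares(rows, responses, weights)


def posterior_mean_direct(x, y, sigma2, prior_mean, prior_var):
    """Conjugate formula (X'X/sigma2 + V^-1)^-1 (X'y/sigma2 + V^-1 m)."""
    n = len(x)
    sx, sxx = sum(x), sum(xi * xi for xi in x)
    sy, sxy = sum(y), sum(xi * yi for xi, yi in zip(x, y))
    precision = [
        [n / sigma2 + 1.0 / prior_var[0], sx / sigma2],
        [sx / sigma2, sxx / sigma2 + 1.0 / prior_var[1]],
    ]
    rhs = [
        sy / sigma2 + prior_mean[0] / prior_var[0],
        sxy / sigma2 + prior_mean[1] / prior_var[1],
    ]
    return solve_2x2(precision, rhs)


# Toy numbers, made up purely for illustration.
x_toy = [0.0, 1.0, 2.0, 3.0]
y_toy = [1.1, 2.9, 5.2, 6.8]
augmented = posterior_mean_augmented(x_toy, y_toy, 0.5, [0.0, 1.0], [4.0, 4.0])
direct = posterior_mean_direct(x_toy, y_toy, 0.5, [0.0, 1.0], [4.0, 4.0])
print(augmented, direct)
```

Under the same assumptions, the posterior covariance is the inverse of the weighted cross-product matrix accumulated in weighted_least_squares, so the augmented fit describes the whole normal posterior and not only its mean. A correlated prior would need the pseudo-rows whitened first (for example with a Cholesky factor of the prior covariance), which this sketch does not attempt.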
  2. The file "GHQ.dat" has data from a study carried out by psychiatrists. The investigators were interested in studying the relation between psychiatric diagnosis ("case" or "noncase") and the subject’s score on a 12-item General Health Questionnaire (GHQ), with scores lying between 0 and 12. The study population was 120 patients visiting a general clinic who filled out the questionnaire. Subsequently, a psychiatrist evaluated each patient, without knowing the subject’s GHQ score, and classified each patient as either a "case" (requiring psychiatric treatment) or a "noncase." In addition to the score, we have each subject’s gender.

     The file gives the number of cases and noncases by GHQ score separately for men and women. Please summarize and analyze the data and describe the relationship between GHQ score and "case" status. Does gender matter?
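One possible way to approach grouped case/noncase counts like these is a binomial logistic regression, fitted with and without a gender term and compared by a likelihood-ratio statistic. The sketch below is a minimal illustration, not a prescribed analysis. The counts in PLACEHOLDER_ROWS are invented and are NOT the contents of GHQ.dat; the column order (score, female indicator, cases, noncases) and all function names are assumptions.

```python
import math

# Invented placeholder counts, NOT the real GHQ.dat:
# (ghq score, female indicator, cases, noncases)
PLACEHOLDER_ROWS = [
    (0, 0, 1, 10), (2, 0, 2, 6), (4, 0, 4, 4), (6, 0, 5, 2),
    (0, 1, 2, 12), (2, 1, 4, 8), (4, 1, 6, 3), (6, 1, 7, 1),
]


def solve(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (m[r][n] - tail) / m[r][r]
    return z


def inv_logit(eta):
    return 1.0 / (1.0 + math.exp(-eta))


def score_and_information(design, cases, totals, beta):
    """Gradient and Fisher information of the binomial log-likelihood."""
    p = len(beta)
    grad = [0.0] * p
    info = [[0.0] * p for _ in range(p)]
    for row, y, n in zip(design, cases, totals):
        pi = inv_logit(sum(b * v for b, v in zip(beta, row)))
        for j in range(p):
            grad[j] += row[j] * (y - n * pi)
            for k in range(p):
                info[j][k] += n * pi * (1.0 - pi) * row[j] * row[k]
    return grad, info


def fit_grouped_logit(design, cases, totals, iterations=25):
    """Newton-Raphson (equivalently IRLS) for grouped logistic regression."""
    beta = [0.0] * len(design[0])
    for _ in range(iterations):
        grad, info = score_and_information(design, cases, totals, beta)
        beta = [b + s for b, s in zip(beta, solve(info, grad))]
    return beta


def log_likelihood(design, cases, totals, beta):
    """Binomial log-likelihood, up to a constant that does not involve beta."""
    total = 0.0
    for row, y, n in zip(design, cases, totals):
        pi = inv_logit(sum(b * v for b, v in zip(beta, row)))
        total += y * math.log(pi) + (n - y) * math.log(1.0 - pi)
    return total


cases = [r[2] for r in PLACEHOLDER_ROWS]
totals = [r[2] + r[3] for r in PLACEHOLDER_ROWS]
full = [[1.0, float(g), float(f)] for g, f, _, _ in PLACEHOLDER_ROWS]
reduced = [[1.0, float(g)] for g, _, _, _ in PLACEHOLDER_ROWS]

beta_full = fit_grouped_logit(full, cases, totals)
beta_reduced = fit_grouped_logit(reduced, cases, totals)
# Likelihood-ratio statistic for dropping the gender term (1 degree of freedom).
lr_stat = 2.0 * (log_likelihood(full, cases, totals, beta_full)
                 - log_likelihood(reduced, cases, totals, beta_reduced))
print(beta_full, lr_stat)
```

With the real counts in place of the placeholders, the likelihood-ratio statistic could be compared with a chi-square distribution on 1 degree of freedom (3.84 at the 5% level). A score-by-gender interaction column could be added to the design in the same way if the relationship appears to differ between men and women.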
     
     
  3. The data set "cholesterol.dat" contains data from a study of 1329 men, some of whom suffered from coronary heart disease (CHD). The file gives the number of men with CHD out of the "n" men having the same level of blood pressure ("bp") and serum cholesterol ("chol"). The categories for bp and chol are:

     bp    Blood pressure in mm mercury
     0     < 127
     1     127 – 146
     2     147 – 166
     3     > 166

     chol  Serum cholesterol in mg/100cc
     0     < 200
     1     200 – 219
     2     220 – 259
     3     > 259

How is the risk of CHD related to blood pressure and serum cholesterol levels?
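Because bp and chol are recorded as category codes 0 to 3, a regression for CHD risk has to decide how those codes enter the model. The sketch below shows two common codings as a hedged illustration: indicator (dummy) columns with category 0 as the reference level, and a linear trend that treats the codes as equally spaced scores. The function names are hypothetical, and neither coding is claimed to be the intended one.

```python
def indicators(code, levels=4):
    """Indicator columns for levels 1..levels-1; level 0 is the reference."""
    if not 0 <= code < levels:
        raise ValueError(f"category code {code} outside 0..{levels - 1}")
    return [1.0 if code == level else 0.0 for level in range(1, levels)]


def factor_row(bp, chol):
    """Design row: intercept, 3 bp indicators, 3 chol indicators."""
    return [1.0] + indicators(bp) + indicators(chol)


def trend_row(bp, chol):
    """Design row that treats the category codes as equally spaced scores."""
    return [1.0, float(bp), float(chol)]


# Every (bp, chol) cell of the 4 x 4 table, under both codings.
cells = [(bp, chol) for bp in range(4) for chol in range(4)]
factor_design = [factor_row(bp, chol) for bp, chol in cells]
trend_design = [trend_row(bp, chol) for bp, chol in cells]
print(factor_design[0], trend_design[-1])
```

The factor coding spends three parameters per variable and makes no assumption about the shape of the relationship; the trend coding spends one per variable and assumes the log-odds change by a constant amount from one category to the next. Comparing the two fits is one way to judge whether a simple trend is adequate.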