Consider the linear model given by .
Suppose we have n observations and only one covariate per subject (i.e.,
row i of the design matrix X is $(1, x_i)$). Further,
let and
be considered "known." Show that we can estimate the posterior distribution
of b as the least-squares solution to the following
augmented weighted regression problem.
Let
Carry out weighted least squares with weights equal to the square root
of the diagonal elements of W (i.e., $W^{1/2}$).
Find the mean and variance of ,
the solution to this augmented regression. (You can assume that the
covariates are centered, so that $\sum_{i=1}^{n} x_i = 0$.)
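The idea of treating the prior as extra "data" can be checked numerically. The sketch below is only an illustration under assumptions that are not stated in the text above: the model is taken to be y = Xb + e with e ~ N(0, sigma2 I), the prior is b ~ N(b0, diag(prior_var)), and the augmented system stacks the prior means under y and an identity block under X. The names sigma2, b0, prior_var and all numeric values are hypothetical, and the data are simulated.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data (illustrative only): n observations, one centered covariate.
n = 50
x = rng.normal(size=n)
x = x - x.mean()                      # center so that sum(x) is zero
X = np.column_stack([np.ones(n), x])  # row i is (1, x_i)

sigma2 = 2.0                          # assumed known error variance
b0 = np.array([1.0, -0.5])            # assumed prior mean of b
prior_var = np.array([4.0, 0.25])     # assumed known diagonal prior variances
y = X @ np.array([1.5, 0.7]) + rng.normal(scale=np.sqrt(sigma2), size=n)

# Augmented system: the prior contributes two extra "observations".
y_aug = np.concatenate([y, b0])
X_aug = np.vstack([X, np.eye(2)])
w = np.concatenate([np.full(n, 1.0 / sigma2), 1.0 / prior_var])  # diag of W

# Weighted least squares: scale each row by sqrt(weight), then ordinary LS.
sqrt_w = np.sqrt(w)
b_wls, *_ = np.linalg.lstsq(X_aug * sqrt_w[:, None], y_aug * sqrt_w, rcond=None)
cov_wls = np.linalg.inv(X_aug.T @ (w[:, None] * X_aug))

# Closed-form normal posterior under the same assumptions, for comparison.
prec = X.T @ X / sigma2 + np.diag(1.0 / prior_var)
post_cov = np.linalg.inv(prec)
post_mean = post_cov @ (X.T @ y / sigma2 + b0 / prior_var)

print(b_wls, post_mean)
```

Under these assumptions the two estimates agree, and because the covariate is centered the posterior precision matrix is diagonal, with entries n/sigma2 + 1/prior_var[0] and sum(x**2)/sigma2 + 1/prior_var[1].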
The file "GHQ.dat" has data from a study carried out by psychiatrists.
The investigators were interested in studying the relation between psychiatric
diagnosis ("case" or "noncase") and the subject’s score on a 12-item General
Health Questionnaire (GHQ) with scores lying between 0 and 12. The study
population was 120 patients visiting a general clinic who filled out the
questionnaire. Subsequently, a psychiatrist evaluated each patient, without
knowing the subject’s GHQ score, and classified each patient as either
a "case" (requiring psychiatric treatment) or a "noncase." In addition
to the score, we have each subject’s gender.
The file gives the number of cases and noncases by GHQ score separately
for men and women. Please summarize and analyze the data and summarize
the relationship between GHQ score and "case" status. Does gender matter?
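One possible starting point, offered as a sketch rather than as the intended solution, is a logistic regression for grouped binomial counts with GHQ score and gender as predictors. The function below fits such a model by Newton-Raphson (iteratively reweighted least squares) using only NumPy. The layout of "GHQ.dat" is not described here, so the code does not read the file; the demonstration uses noise-free synthetic counts generated from hypothetical coefficients, not the study data.

```python
import numpy as np

def fit_grouped_logistic(X, cases, totals, n_iter=25, tol=1e-10):
    """Fit logit(p) = X @ beta to grouped binomial counts by Newton-Raphson.

    X      : (groups, p) design matrix, including an intercept column
    cases  : number of "cases" in each group
    totals : number of subjects in each group
    Returns the estimate and its inverse-information covariance matrix.
    """
    X = np.asarray(X, dtype=float)
    cases = np.asarray(cases, dtype=float)
    totals = np.asarray(totals, dtype=float)
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        w = totals * p * (1.0 - p)              # binomial variance weights
        score = X.T @ (cases - totals * p)      # gradient of log-likelihood
        info = X.T @ (w[:, None] * X)           # Fisher information
        step = np.linalg.solve(info, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    # Recompute the information at the final estimate for the covariance.
    p = 1.0 / (1.0 + np.exp(-(X @ beta)))
    info = X.T @ ((totals * p * (1.0 - p))[:, None] * X)
    return beta, np.linalg.inv(info)

# Synthetic check (NOT the GHQ data): expected counts from known coefficients.
score_vals = np.tile(np.arange(13), 2)          # GHQ-like scores 0..12
female = np.repeat([0.0, 1.0], 13)              # hypothetical gender coding
X_demo = np.column_stack([np.ones(26), score_vals, female])
true_beta = np.array([-2.0, 0.4, 0.3])          # hypothetical values
totals_demo = np.full(26, 1000.0)
cases_demo = totals_demo / (1.0 + np.exp(-(X_demo @ true_beta)))

beta_hat, cov_hat = fit_grouped_logistic(X_demo, cases_demo, totals_demo)
print(beta_hat)
print(np.sqrt(np.diag(cov_hat)))                # standard errors
```

Whether gender matters could then be examined by comparing fits with and without the gender column, or by adding a score-by-gender interaction column to the design matrix.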
The data set "cholesterol.dat" contains data from a study of 1329 men,
some of whom suffered from coronary heart disease (CHD). The file gives
the number of men with CHD out of the "n" men having the same level of
blood pressure ("bp") and serum cholesterol ("chol"). The categories for
bp & chol are:

bp     Blood pressure in mm mercury
0      < 127
1      127 – 146
2      147 – 166
3      > 166

chol   Serum cholesterol in mg/100cc
0      < 200
1      200 – 219
2      220 – 259
3      > 259

How is the risk of CHD related to blood pressure and serum cholesterol
levels?
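Because bp and chol are coded as four ordered categories, one option (an assumption, not something the problem prescribes) is to treat each as a factor with category 0 as the reference level. The sketch below builds the corresponding main-effects design matrix for the 16 possible (bp, chol) cells; it contains no counts from "cholesterol.dat", and the helper name `indicators` is hypothetical.

```python
import numpy as np
from itertools import product

# All 16 combinations of the category codes 0..3 for (bp, chol).
cells = np.array(list(product(range(4), range(4))))

def indicators(codes, n_levels=4):
    """Indicator columns for levels 1..n_levels-1; level 0 is the reference."""
    codes = np.asarray(codes)
    return (codes[:, None] == np.arange(1, n_levels)).astype(float)

# Columns: intercept, bp=1, bp=2, bp=3, chol=1, chol=2, chol=3.
X_cells = np.column_stack([
    np.ones(len(cells)),
    indicators(cells[:, 0]),
    indicators(cells[:, 1]),
])
print(X_cells.shape)
```

A matrix like this, paired with the CHD counts and the group sizes "n", could be passed to any binomial regression routine. Treating the codes 0 to 3 as numeric scores instead would give a more parsimonious trend model, and comparing the two is one way to judge whether the risk changes steadily across categories.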