STA114 Homework 4

MTH136/STA114: Statistics

Homework #5

Due: Wednesday, Feb 28, 2001

A criminal suspect undergoing a polygraph test is either guilty (G) or innocent (I). His answer to the question, ``Are you innocent?'' is ``Yes.'' As a result of the test, the expert polygraph examiner declares the suspect to be lying, denoted by X=1, or truth-telling, denoted by X=0. Let theta_I = Pr(X=0|I) and theta_G = Pr(X=1|G) be the probabilities of correct determinations for innocent and guilty suspects, respectively.
To provide estimates of theta_G and theta_I, a pre-trial study is performed. Here the examiner administers the polygraph test to 20 ``guilty'' people and 18 ``innocent'' people, each of whom is asked, ``Are you innocent?'' All participants are instructed to say ``yes,'' so that the guilty ones are lying and the innocent ones are not. The conditions of the test are otherwise exactly as used in testing a real suspect.
Let Y be the number out of the 20 guilty people that the examiner correctly identifies as lying, and let Z be the number out of the 18 innocent people that the examiner correctly identifies as truth-telling.
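To make the study design concrete, here is a minimal Python sketch that simulates one run of the pre-trial study. It is an illustration only: the accuracy values 0.8 and 0.9, the seed, and the function name simulate_study are hypothetical and do not come from the assignment, and the sketch assumes the examiner's determinations are independent across participants.

```python
import numpy as np

# Hypothetical accuracies chosen only for illustration (not from the study).
THETA_G_TRUE = 0.8  # assumed Pr(X=1 | G)
THETA_I_TRUE = 0.9  # assumed Pr(X=0 | I)

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only


def simulate_study(theta_G, theta_I, n_guilty=20, n_innocent=18, rng=rng):
    """Simulate one pre-trial study and return the counts (Y, Z).

    Assumes each participant is judged independently of the others.
    """
    # For a guilty participant, a correct determination means "lying" (X=1).
    correct_guilty = rng.random(n_guilty) < theta_G
    # For an innocent participant, a correct determination means
    # "truth-telling" (X=0).
    correct_innocent = rng.random(n_innocent) < theta_I
    return int(correct_guilty.sum()), int(correct_innocent.sum())


Y, Z = simulate_study(THETA_G_TRUE, THETA_I_TRUE)
print("Y =", Y, "of 20 guilty participants correctly identified as lying")
print("Z =", Z, "of 18 innocent participants correctly identified as truth-telling")
```

Repeating the call many times with fixed accuracies shows how much Y and Z vary from study to study.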

1. State the distribution of Y given theta_G. State the distribution of Z given theta_I.
2. Assuming uniform independent priors theta_G ~ U(0,1) and theta_I ~ U(0,1), what are the corresponding posteriors for theta_G and theta_I based on the observed pre-trial outcomes Y=17 and Z=18? What are the MLEs of theta_G and theta_I? What are the posterior means of theta_G and theta_I? Compare the posterior means with the MLEs as possible point estimates.
3. Simulate large samples (say, k=10,000) from each of the posteriors and use these to compute samples for R = theta_G/theta_I. Summarize the posterior for R and use it to explore whether or not the examiner is in fact better at detecting truth-tellers than liars on the basis of this data. What do you conclude?
4. Now return to the real suspect. We are really interested in the probability p_G that he is in fact guilty when the polygraph examiner declares him to be lying. If we suppose the prior probability he is guilty to be 0.5, then Bayes' theorem gives us p_G=theta_G/(theta_G+1-theta_I). Use the posterior simulations from (3.) to compute and summarize posterior samples for p_G. How likely is it that the suspect is guilty?
5. A non-Bayesian approach would be to simply estimate p_G by its MLE, namely p_G-hat = theta_G-hat / (theta_G-hat+1-theta_I-hat) based on the MLEs theta_G-hat and theta_I-hat. Is this a good idea given our data? Why?
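The simulation steps asked for above can be sketched in Python as follows. This is one possible outline, not a prescribed solution. It relies on the standard conjugacy result that a U(0,1) prior combined with a binomial count of y successes in n trials gives a Beta(y+1, n-y+1) posterior. The helper names posterior_draws and summarize, the seed, and the choice of summaries (mean, median, central 95% interval) are assumptions made for illustration.

```python
import numpy as np


def posterior_draws(successes, trials, k=10_000, rng=None):
    """Draw k samples from Beta(successes + 1, trials - successes + 1).

    This is the posterior for a binomial success probability under a
    U(0,1) prior (standard conjugacy result).
    """
    rng = np.random.default_rng() if rng is None else rng
    return rng.beta(successes + 1, trials - successes + 1, size=k)


def summarize(samples):
    """Return the mean, median, and central 95% interval of the samples."""
    lo, med, hi = np.quantile(samples, [0.025, 0.5, 0.975])
    return {
        "mean": float(np.mean(samples)),
        "median": float(med),
        "interval_95": (float(lo), float(hi)),
    }


rng = np.random.default_rng(114)  # arbitrary seed, for reproducibility only

# Independent posterior samples given the observed outcomes Y=17 and Z=18.
theta_G = posterior_draws(17, 20, rng=rng)
theta_I = posterior_draws(18, 18, rng=rng)

# Transform the draws elementwise to get posterior samples of R and p_G.
R = theta_G / theta_I
p_G = theta_G / (theta_G + 1 - theta_I)

print("R summary:  ", summarize(R))
print("Pr(R < 1 | data) estimate:", float(np.mean(R < 1)))
print("p_G summary:", summarize(p_G))
```

Because the priors are independent and the two groups are tested separately, theta_G and theta_I can be drawn independently and combined draw by draw. A histogram of R or p_G would complement the numerical summaries.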