Lab 3

Objectives: In this lab, there are three problems involving probability.

The Monty Hall Problem or Let's Make a Deal

Before there was Who Wants to be a Millionaire or The Weakest Link, there was a game show called Let's Make a Deal, hosted by Monty Hall. Contestants were offered a choice of three doors. Open the correct one, and you won a car or a grand vacation package. Open either of the others, and you got a donkey or a gag prize. There was a twist to the game: after the contestant had chosen, but before the door was opened, Monty would open one of the other doors to reveal a donkey or a gag gift, and then ask the contestant if he or she wanted to stick with the first choice or switch to the other unopened door.

Question: Is it a good idea to change your mind, a bad idea, or does it make no difference? This is a slippery one! Think about this and try out the game. (Not really a biostatistics problem, but have fun!)

Receiver-Operator Characteristic Curves

Download (or copy from your POB CD) and import the dataset diabetes into S-Plus. Use this to create the ROC curve for Exercise 20 in the HW. Add labels, titles, etc. What's the best cut-off point for declaring a positive test result?

Updating Probabilities

For this part of the lab, refer to the tuberculosis example in POB, pp 138-140. Note the specificity (0.9715) and the sensitivity (0.7333). We'll use S-Plus to construct a plot of "Probability of Disease given a Positive Test Result" versus "Disease Rate (prevalence) in the Population". We will need to create these two variables.

First, create a new Dataframe: go to the File menu, select New, and then Data Set.

To create a sequence of Prevalence values in column 1, go to the Data menu and select Fill.... This will create a sequence of numbers. In the dialog, specify 1 for the column. In the Length field, enter how many prevalence values you would like to plot, say n=25.
Enter the smallest prevalence for Start, and then choose a value for Increment so that the last value in the column, Start + (n-1)*Increment, is the maximum prevalence that you wish to plot. Note that the example in POB reports prevalence = 0.000093, so you may want to pick a range of values around this value. Click on Apply to create the column; if the results are what you want, click on OK. Otherwise, change the options and Apply again. Rename the column Prevalence or something more intuitive than V1.

To create the sequence of probabilities, go to the Data menu and select Transform.... Enter the name for the new column, Probability, in the Target Column field. In the Expression field, enter the formula for updating probabilities as a function of Prevalence (or the name of your column 1), the specificity, and the sensitivity. Use the equation at the top of page 139 for the formula. Make sure that you include parentheses so that the calculations are carried out correctly! Click on Apply, and verify by hand that you have the correct results. Then click on OK.

You should now have a dataframe with two columns, Prevalence and Probability. To create a plot of these, go to the Graph menu and select 2D Plot. Choose Line Plot and then click on OK. In the popup dialog box, select Prevalence for the X column (the X-axis) and Probability for the Y column (the Y-axis). Click on OK. Edit as you see fit. Your figure should be able to stand on its own!

Describe the resulting plot. You should be able to explain why a positive test result does not mean that the person actually has the disease, and why the probability that a person has the disease, given a positive test result, is not the same as the prevalence in the population.
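
If you want a way to check the Transform results by hand, the calculation can be sketched outside S-Plus. The Python snippet below is only an illustration, not part of the lab procedure. It assumes the updating formula is the usual Bayes' theorem form, P(D | T+) = sens*prev / (sens*prev + (1 - spec)*(1 - prev)); compare it against the equation in POB before relying on it. The function names and the Start and Increment values are hypothetical choices made for this sketch.

```python
# Sketch: probability of disease given a positive test, as a function of prevalence.
# Assumes the standard Bayes' theorem form; check it against the equation in POB.

SENSITIVITY = 0.7333  # P(T+ | D), from the tuberculosis example
SPECIFICITY = 0.9715  # P(T- | no D), from the tuberculosis example


def prevalence_sequence(start, increment, n):
    """Mimic the Fill dialog: n values, the last being start + (n-1)*increment."""
    return [start + i * increment for i in range(n)]


def prob_disease_given_positive(prevalence,
                                sensitivity=SENSITIVITY,
                                specificity=SPECIFICITY):
    """Return P(D | T+) for one prevalence value."""
    true_positive = sensitivity * prevalence
    # Note the parentheses: the false-positive rate is (1 - specificity),
    # and it applies to the disease-free fraction (1 - prevalence).
    false_positive = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive / (true_positive + false_positive)


# Hypothetical range around the reported prevalence of 0.000093.
prevalences = prevalence_sequence(start=0.00001, increment=0.00001, n=25)
probabilities = [prob_disease_given_positive(p) for p in prevalences]

for p, q in zip(prevalences, probabilities):
    print(f"{p:.6f}  {q:.6f}")
```

At the reported prevalence of 0.000093 this gives a probability of roughly 0.0024, far below the sensitivity of the test, which is the behavior your plot should help you explain.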