Postoperative sore throat is a painful complication of intubation after surgery. Ruetzler et al. (2013) performed an experiment among patients who required intubation. Prior to anesthesia, patients were randomly assigned to gargle either a licorice-based solution or sugar water (as placebo).

A repository has already been created for your team and is available in the course GitHub organization. The dataset for this assignment can be found as a CSV file in the data folder of your repository. This dataset contains data for the 235 patients in the study and has the following variables:

Exercises

  1. Fit a logistic regression model that predicts whether a patient experiences any throat pain 30 minutes after surgery based on all predictors in the dataset. Show the estimated coefficients and associated p-values for hypothesis tests of whether the associated parameter equals 0.
  2. Is there evidence of a treatment effect? Comprehensively explain using a formal hypothesis test and interpreting a relevant model coefficient on the odds ratio scale.
  3. Suppose two patients have the same sex, BMI, age, and ASA disease severity score. Using the model you created, could a patient who received licorice ever have a higher predicted probability of experiencing throat pain 30 minutes after surgery than a patient who received placebo? Explain why or why not. If we cannot tell for sure, explain why.
  4. Suppose you have two female patients: a 65-year-old with a BMI of 21 and mild disease receiving a large surgery, and a 50-year-old with a BMI of 29 and systemic disease receiving a small surgery, both of whom receive placebo. Which patient is more likely to experience throat pain? Calculate the risk ratio between these two women. Show your code/work.
  5. Use a threshold of 75% predicted probability of experiencing pain to classify a patient as experiencing any throat pain or not. Using this threshold, what are the positive and negative predictive values for your logistic regression model used as a classifier? Over all thresholds, what is the AUC of the ROC curve for your logistic regression model used as a classifier?
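
The quantities named in the exercises above can be illustrated with a minimal Python sketch. It is not a solution and does not use the study data: the probabilities and outcomes below are made-up toy values, and the helper names (inv_logit, odds_ratio, risk_ratio, ppv_npv, auc) are hypothetical, chosen only for this illustration. It assumes a fitted logistic model supplies a linear predictor (log-odds) for each patient.

```python
import math

def inv_logit(eta):
    """Convert a linear predictor (log-odds) to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

def odds_ratio(beta):
    """Odds ratio for a one-unit change in a predictor with coefficient beta."""
    return math.exp(beta)

def risk_ratio(eta_a, eta_b):
    """Ratio of predicted probabilities for two patients, given their log-odds."""
    return inv_logit(eta_a) / inv_logit(eta_b)

def ppv_npv(probs, labels, threshold):
    """Positive and negative predictive values when classifying at a threshold."""
    tp = fp = tn = fn = 0
    for p, y in zip(probs, labels):
        predicted_positive = p >= threshold
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive and y == 0:
            fp += 1
        elif not predicted_positive and y == 0:
            tn += 1
        else:
            fn += 1
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return ppv, npv

def auc(probs, labels):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [p for p, y in zip(probs, labels) if y == 1]
    neg = [p for p, y in zip(probs, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy values for illustration only (NOT from the licorice study).
toy_probs = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]
toy_labels = [1, 1, 0, 1, 0, 0]

toy_ppv, toy_npv = ppv_npv(toy_probs, toy_labels, threshold=0.75)
toy_auc = auc(toy_probs, toy_labels)
print(toy_ppv, toy_npv, toy_auc)
```

In practice the probabilities would come from the fitted model's predictions for the 235 patients, and a statistics package would typically compute the AUC directly. The pairwise version here is only meant to make the definition concrete.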

There should only be one submission per team on Gradescope.