Demonstrate a thorough understanding of logistic regression
Practice using logistic regression models to make predictions
For this assignment you must have at least three meaningful commits and all of your code chunks must have informative names.
All code should follow the tidyverse style guidelines, including the 80-character line limit.
In this assignment you will be working with a dataset containing information on individuals from the Donner party. The Donner party was a group of pioneers traveling to California from Missouri on the Oregon Trail by wagon train. They were trapped in the Sierra Nevada mountains by extremely heavy snowfall during the winter of 1846-1847 and eventually ran out of food supplies. Of the 90 members of the party, only 48 survived. We will use logistic regression to model the probability of survival based on age and sex. The relevant data is contained in donner.csv.
What is the relationship between sex and survival? Effectively visualize the relationship and summarize what you observe in a brief sentence.
What is the relationship between age and survival? Effectively visualize the relationship and summarize what you observe in a brief sentence.
Fit a logistic regression model to predict survival based on sex and age. You do not need to include an interaction. Report the model output in tidy format.
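As a starting point for the model-fitting exercise above, here is a minimal sketch of an additive logistic regression in R. The helper name `fit_survival_model` is hypothetical, and the column names `survived`, `age`, and `sex` are assumptions about donner.csv; check the actual names in your data and adjust the formula accordingly.

```r
library(broom)

# Hypothetical helper. The column names `survived`, `age`, and `sex`
# are assumptions and may differ from those in donner.csv.
fit_survival_model <- function(data) {
  # family = binomial fits a logistic regression (logit link);
  # no interaction term is included
  glm(survived ~ age + sex, data = data, family = binomial)
}

# Usage, assuming donner.csv is in the working directory:
# donner <- readr::read_csv("donner.csv")
# survival_fit <- fit_survival_model(donner)
# tidy(survival_fit)
```

`tidy()` returns one row per model term, with the estimates on the log-odds scale.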
Write out the logistic regression model.
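As a reminder of the general form only (not the fitted model), an additive logistic regression with age and sex models the log-odds of survival as shown below. The 0/1 indicator coding for sex is an assumption here; which level equals 1 depends on the reference level R chooses for your data.

\[
\log\left(\frac{\hat{p}}{1 - \hat{p}}\right) = \hat{\beta}_0 + \hat{\beta}_\text{age} \times \text{age} + \hat{\beta}_\text{sex} \times \text{sex}
\]

Here \(\hat{p}\) is the predicted probability of survival. Your written model should replace each \(\hat{\beta}\) with the corresponding estimate from your tidy output.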
Provide an interpretation of \(e^{\hat{\beta}_0}\) in the context of the problem.
Provide an interpretation of \(e^{\hat{\beta}_\text{age}}\) in the context of the problem.
Provide an interpretation of \(e^{\hat{\beta}_\text{sex}}\) in the context of the problem.
What is the predicted probability of survival for a 60-year-old man? For a 20-year-old man? For a female newborn?
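One way to obtain predicted probabilities like these is with `predict()` and `type = "response"`, which returns probabilities rather than log-odds. The sketch below assumes a fitted `glm` object and a data frame of new individuals. The helper name `predict_survival`, the object name `survival_fit`, the column names, and the sex labels `"Male"` and `"Female"` are all hypothetical and should be matched to your own data.

```r
# Hypothetical helper: predicted survival probabilities from a fitted
# logistic regression. `new_people` must contain the model's predictors.
predict_survival <- function(fit, new_people) {
  # type = "response" puts predictions on the probability scale
  predict(fit, newdata = new_people, type = "response")
}

# Usage, assuming a model `survival_fit` fit with survived ~ age + sex
# and sex coded as "Male"/"Female" (assumptions about donner.csv):
# new_people <- data.frame(
#   age = c(60, 20, 0),
#   sex = c("Male", "Male", "Female")
# )
# predict_survival(survival_fit, new_people)
```

A newborn corresponds to an age of 0, so that prediction depends only on the intercept and the sex term.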
Create a predicted probability plot showing the effect of age and sex on survival. Comment on what you observe.
How young or old must a female member of the Donner party be in order to have a predicted probability of survival greater than 0.75 based on your logistic regression model? Use algebra (not code) to answer.
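The algebra for a threshold question like this can be set up as follows. This is a sketch in terms of the symbolic coefficients, assuming the additive model above and a 0/1 indicator for sex; substitute your own estimates and the indicator value that corresponds to female in your coding. A probability above 0.75 is equivalent to log-odds above \(\log(0.75 / 0.25) = \log 3 \approx 1.0986\):

\[
\hat{p} > 0.75
\iff \log\left(\frac{\hat{p}}{1 - \hat{p}}\right) > \log 3
\iff \hat{\beta}_0 + \hat{\beta}_\text{age} \times \text{age} + \hat{\beta}_\text{sex} \times \text{sex} > \log 3
\]

Solving for age gives

\[
\hat{\beta}_\text{age} \times \text{age} > \log 3 - \hat{\beta}_0 - \hat{\beta}_\text{sex} \times \text{sex}
\]

When you divide both sides by \(\hat{\beta}_\text{age}\), the inequality keeps its direction if \(\hat{\beta}_\text{age} > 0\) and flips if \(\hat{\beta}_\text{age} < 0\). The sign of your estimate therefore determines whether the answer is a minimum or a maximum age.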
What are some limitations of your model given the data? Answer in a brief paragraph.
Knit your document to PDF. Stage and commit all remaining changes, and push your work to GitHub. Make sure all files are updated on your GitHub repo.
Upload only your PDF document to Gradescope. Before you submit the uploaded document, mark the page on which each exercise answer appears. If an answer spans multiple pages, mark all of them. Associate the “Overall” section with the first page.