A study conducted in Whickham, England recorded participants’ age and smoking status at baseline, and then, 20 years later, recorded their health outcome.
The data can be found in the mosaicData package.
In your console, run the following to install this package:
install.packages("mosaicData")
Then load the package with
library(mosaicData)
and load the data with
data(Whickham)
Take a peek at the codebook with
?Whickham
or at https://www.rdocumentation.org/packages/mosaicData/versions/0.14.0/topics/Whickham.
Go to the #assignment-links channel on Slack and click on the link for mini-hw-06, and accept the assignment. This will automatically put you in the teams you created previously. You can confirm this by looking at the name of your repo (it will have your team name on it).
Then, each team member can follow the usual steps to clone the repo and get started with the analysis.
What type of study do you think these data come from: observational or experimental? Why?
How many observations are in this dataset? What does each observation represent?
How many variables are in this dataset? What type of variable is each? Display each variable using an appropriate visualization.
What would you expect the relationship between smoking status and health outcome to be?
Create a visualization depicting the relationship between smoking status and health outcome. Briefly describe the relationship, and evaluate whether this meets your expectations. You can also create a contingency table and use the values on the table to calculate conditional probabilities to help your narrative:
Whickham %>%
  count(smoker, outcome)
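One possible way to turn the counts in this table into conditional probabilities is sketched below. It assumes the dplyr package is installed and loaded (the assignment's own code uses `%>%` and `count()`, so this seems likely), and the names `smoker_outcome_props` and `prop` are made up for illustration. Treat it as a starting point rather than the expected answer.

```r
library(dplyr)
library(mosaicData)

data(Whickham)

# Count each smoker/outcome combination, then divide each count by the
# total for its smoking group. The result is P(outcome | smoker).
smoker_outcome_props <- Whickham %>%
  count(smoker, outcome) %>%
  group_by(smoker) %>%
  mutate(prop = n / sum(n)) %>%
  ungroup()

smoker_outcome_props
```

Grouping by `smoker` before dividing means the proportions sum to 1 within each smoking group, which is what makes the two groups directly comparable even though they differ in size.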
Create a new variable called age_cat using the following scheme:
age <= 44 ~ "18-44"
age > 44 & age <= 64 ~ "45-64"
age > 64 ~ "65+"
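The scheme above is written in the `condition ~ value` form used by dplyr's `case_when()`, so one way to apply it is inside `mutate()`. This is a minimal sketch that assumes dplyr is loaded and that overwriting the `Whickham` data frame in your session is acceptable.

```r
library(dplyr)
library(mosaicData)

data(Whickham)

# Add age_cat to the data frame; each row gets the label of the first
# condition it satisfies.
Whickham <- Whickham %>%
  mutate(
    age_cat = case_when(
      age <= 44            ~ "18-44",
      age > 44 & age <= 64 ~ "45-64",
      age > 64             ~ "65+"
    )
  )

# Quick check that every row landed in one of the three categories.
Whickham %>%
  count(age_cat)
```

Because the three conditions cover every possible age, no row should end up with a missing `age_cat`; the final `count()` is a cheap way to confirm that.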
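Re-create the visualization depicting the relationship between smoking status and health outcome, this time faceted by age_cat. What changed? What might explain this change? You can extend the contingency table from earlier by breaking it down by age category and use it to help your narrative:
Whickham %>%
  count(smoker, age_cat, outcome)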
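If you want to look at the same breakdown graphically, one option is a filled bar chart with one panel per age category. The sketch below assumes ggplot2 and dplyr are installed; the choice of a filled bar chart, the object name `p`, and the axis label are illustrative assumptions, not a required format.

```r
library(dplyr)
library(ggplot2)
library(mosaicData)

data(Whickham)

# Re-create age_cat here so this block runs on its own.
Whickham <- Whickham %>%
  mutate(
    age_cat = case_when(
      age <= 44            ~ "18-44",
      age > 44 & age <= 64 ~ "45-64",
      age > 64             ~ "65+"
    )
  )

# position = "fill" rescales each bar to height 1, so the bars show
# proportions of each outcome within a smoking group, not raw counts.
p <- ggplot(Whickham, aes(x = smoker, fill = outcome)) +
  geom_bar(position = "fill") +
  facet_wrap(~ age_cat) +
  labs(y = "Proportion")

p
```

Showing proportions rather than counts keeps the panels comparable even when the age categories contain very different numbers of participants.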
Total | 15 pts |
---|---|
Questions 1-3 | 1 pt / question - 3 pts |
Questions 4-7 | 2 pts / question - 8 pts |
Code style | 1 pt |
Informatively named code chunks | 1 pt |
Commit frequency and informative messages | 1 pt |