In this case study, and in the subsequent mini homework, you will work with the 2016 General Social Survey (GSS). The data can be found in the data/ folder of your assignment repository. This is an excerpt from the 2016 GSS containing only the variables that will be used for these two assignments. We're not distributing the entire dataset, in order to keep the file size reasonable.
You can load the data with
library(tidyverse)  # provides read_csv()
gss16 <- read_csv("data/gss16_excerpt.csv")
Some of the questions ask about conditions for inference. Note that the GSS employs random sampling.
Go to the #assignment-links channel on Slack, click on the link, and accept the assignment.
As a follow-up to the case study you worked on in class, you will evaluate whether Americans who identify as Republican and Democrat feel differently about evolution. In addition to the EVOLUTION variable, you will also use the PARTYID variable. This variable stores answers to the following question:
Generally speaking, do you usually think of yourself as a Republican, Democrat, Independent, or what?
PARTYID variable. Also in this data frame, combine the levels of STRONG DEMOCRAT and NOT STR DEMOCRAT into DEMOCRAT, and the levels of STRONG REPUBLICAN and NOT STR REPUBLICAN into REPUBLICAN. How many observations are in this data frame?
Using CLT-based methods (only if the sample size condition is satisfied). You will need to use the prop.test function to conduct the test.
prop.test(x = [number of successes], n = [sample size], alternative = "two.sided")
Note: The number of successes and the sample size are both vectors of length 2, with one entry for Democrats and one for Republicans. For example, if 10 out of 40 Republicans and 30 out of 50 Democrats believe in evolution: x = c(10, 30) and n = c(40, 50).
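The following is a minimal sketch of the two mechanical steps in this part: collapsing party levels and passing two-group counts to prop.test. It uses made-up PARTYID values and the illustrative counts from the note, not the GSS data. The object names partyid, party, x, n, and res are hypothetical, and the "at least 10 successes and 10 failures per group" rule is assumed as a common rule of thumb for the sample size condition; check it against the definition used in class.

```r
# Toy PARTYID values (made up for illustration; not GSS data).
partyid <- c("STRONG DEMOCRAT", "NOT STR DEMOCRAT", "INDEPENDENT",
             "STRONG REPUBLICAN", "NOT STR REPUBLICAN")

# Map the four partisan levels onto two; other levels are left unchanged.
party <- partyid
party[partyid %in% c("STRONG DEMOCRAT", "NOT STR DEMOCRAT")] <- "DEMOCRAT"
party[partyid %in% c("STRONG REPUBLICAN", "NOT STR REPUBLICAN")] <- "REPUBLICAN"
table(party)

# Counts from the note: 10 of 40 Republicans and 30 of 50 Democrats.
x <- c(10, 30)  # number of successes, Republicans first, then Democrats
n <- c(40, 50)  # sample sizes, in the same order as x

# Assumed rule of thumb for the sample size condition:
# at least 10 successes and 10 failures in each group.
all(x >= 10, n - x >= 10)

# Two-sided test for a difference between the two proportions.
res <- prop.test(x = x, n = n, alternative = "two.sided")
res$estimate  # sample proportions for the two groups
res$p.value
```

The order of the entries in x and n must match, since prop.test pairs them position by position.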
Next, we take a look at how much time Americans spend on email. The GSS asks:
About how many minutes or hours per week do you spend sending and answering electronic mail or e-mail?
The answers to this question are stored in EMAILHR and EMAILMIN. The sum of these variables defines the total amount of time respondents spend on email.
EMAILHR and EMAILMIN, and store this information in a new variable called EMAILTIME.
EMAILTIME. What is the sample mean?
You will need to use the t.test function to conduct the test.
t.test(gss16$EMAILTIME, mu = 420, alternative = "less")
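As a rough illustration of combining the two email variables and running a one-sided t-test, here is a sketch on made-up responses (not GSS data). It assumes the total is measured in minutes, which would make mu = 420 correspond to 7 hours per week; the toy data frame and the res object are hypothetical names.

```r
# Made-up responses (not GSS data): hours and minutes spent on email per week.
toy <- data.frame(
  EMAILHR  = c(2, 0, 10, 5, 1, 7),
  EMAILMIN = c(30, 45, 0, 15, 0, 20)
)

# Assumption: total time is expressed in minutes, so hours are converted first.
toy$EMAILTIME <- toy$EMAILHR * 60 + toy$EMAILMIN

# Sample mean of the combined variable.
mean(toy$EMAILTIME)

# One-sided test: is the mean weekly email time less than 420 minutes?
res <- t.test(toy$EMAILTIME, mu = 420, alternative = "less")
res$statistic
res$p.value
```

If the real data contain missing values in either variable, the sum will be NA for those respondents; t.test drops NA values on its own, while mean needs na.rm = TRUE.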
Total | 100 pts |
---|---|
Part 1 | 55 pts |
Part 2 | 35 pts |
Overall organization, code quality, clarity, commits, etc. | 10 pts |