Packages and data

library(tidyverse)
library(infer)

manhattan <- read_csv("data/manhattan.csv")
mb_yawn <- read_csv("data/mb-yawn.csv")
set.seed(45618)

Exercises

  1. Analyze the Manhattan data. Is there enough evidence to suggest that the mean price of a one-bedroom apartment is greater than 2400? Why or why not?

  2. Analyze the Manhattan data. Is there enough evidence to suggest that the median price of a one-bedroom apartment is greater than 2600? Why or why not?

  3. Reproduce the analysis with the yawning data. Is there enough evidence to suggest that yawning and observing someone yawn are not independent? Why or why not?

Exercise 1

Consider the hypothesis test:

\[H_0: \mu = 2400\] \[H_A: \mu > 2400\] Let \(\alpha = 0.05\)

xbar_rent <- manhattan %>% 
   specify(response = rent) %>% 
   calculate(stat = "mean") %>% 
   pull(stat)

xbar_rent
## [1] 2625.8
null_dist_xbar <- manhattan %>% 
   specify(response = rent) %>% 
   hypothesize(null = "point", mu = 2400) %>% 
   generate(reps = 10000, type = "bootstrap") %>% 
   calculate(stat = "mean")
null_dist_xbar %>% 
   visualise(alpha = .5) +
   geom_vline(xintercept = 2400, color = "purple", lty = 2, linewidth = 1) +
   theme_minimal(base_size = 16) +
   shade_p_value(obs_stat = xbar_rent, direction = "greater")

null_dist_xbar %>% 
   get_p_value(obs_stat = xbar_rent, direction = "greater")

Since the p-value is greater than \(\alpha\), we fail to reject the null hypothesis at the 0.05 significance level. Hence, we do not have enough evidence to suggest that the mean rent exceeds $2,400 per month.
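The null distribution above is generated by bootstrapping under a point null. A minimal base-R sketch of that idea follows, assuming the resampling works by recentring the sample at the hypothesized mean before drawing with replacement. The `rent_sample` vector and the `boot_null_mean()` helper are hypothetical and exist only for illustration; they are not the Manhattan data or part of infer.

```r
# Hypothetical rents for illustration only (not the Manhattan data)
rent_sample <- c(2100, 2250, 2300, 2400, 2550, 2600, 2750, 2900, 3100, 3300)
mu_0 <- 2400

# Bootstrap null distribution of the sample mean (assumed approach):
# shift the sample so its mean equals mu_0, then resample with replacement
boot_null_mean <- function(x, mu_0, reps = 10000) {
  x_null <- x - mean(x) + mu_0
  replicate(reps, mean(sample(x_null, size = length(x_null), replace = TRUE)))
}

set.seed(45618)
null_means <- boot_null_mean(rent_sample, mu_0)

# One-sided p-value: share of simulated means at least as large as the observed mean
p_value <- mean(null_means >= mean(rent_sample))
p_value
```

The p-value is the proportion of simulated means that are at least as extreme as the observed one, which is the quantity `get_p_value()` reports for `direction = "greater"`.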

Confidence interval comparison

manhattan %>% 
   specify(response = rent) %>% 
   generate(reps = 10000, type = "bootstrap") %>% 
   calculate(stat = "mean") %>% 
   get_confidence_interval(level = 0.90)

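The one-sided test above at \(\alpha = 0.05\) corresponds to a 90% confidence interval, since a 90% interval leaves 5% in each tail. If the value of the parameter specified by the null hypothesis, 2400, is contained in the 90% interval, then the null hypothesis cannot be rejected at the 0.05 level.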
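The comparison between the interval and the null value can be sketched in base R as follows, assuming a percentile bootstrap interval. The `rent_sample` vector is hypothetical and is not the Manhattan data, and the agreement between a bootstrap test and a percentile interval is only approximate.

```r
# Hypothetical rents for illustration only (not the Manhattan data)
rent_sample <- c(2100, 2250, 2300, 2400, 2550, 2600, 2750, 2900, 3100, 3300)
mu_0 <- 2400

set.seed(45618)

# Bootstrap distribution of the sample mean (no recentring: this is for estimation)
boot_means <- replicate(10000, mean(sample(rent_sample, replace = TRUE)))

# 90% percentile interval: cut 5% from each tail
ci_90 <- quantile(boot_means, probs = c(0.05, 0.95))
ci_90

# Is the null value inside the interval?
null_in_ci <- mu_0 >= ci_90[[1]] && mu_0 <= ci_90[[2]]
null_in_ci
```

For the one-sided alternative \(\mu > 2400\), the endpoint that matters is the lower one: the test rejects only when the null value falls below it.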

Exercise 2

Consider the hypothesis test:

\[H_0: M = 2600\] \[H_A: M > 2600\] Let \(\alpha = 0.05\)

med_rent <- manhattan %>% 
   specify(response = rent) %>% 
   calculate(stat = "median") %>% 
   pull(stat)

med_rent
## [1] 2350
null_dist_med <- manhattan %>% 
   specify(response = rent) %>% 
   hypothesize(null = "point", med = 2600) %>% 
   generate(reps = 10000, type = "bootstrap") %>% 
   calculate(stat = "median")
null_dist_med %>% 
   visualise(alpha = .5) +
   geom_vline(xintercept = 2600, color = "purple", lty = 2, linewidth = 1) +
   theme_minimal(base_size = 16) +
   shade_p_value(obs_stat = med_rent, direction = "greater")

null_dist_med %>% 
   get_p_value(obs_stat = med_rent, direction = "greater")

Since the p-value is greater than \(\alpha\), we fail to reject the null hypothesis at the 0.05 significance level. Hence, we do not have enough evidence to suggest that the median rent exceeds $2,600 per month.

Exercise 3

Consider the hypothesis test:

\[H_0: p_{treatment} = p_{control}\] \[H_A: p_{treatment} > p_{control}\] Let \(\alpha = 0.05\)

diff_props <- mb_yawn %>% 
   specify(outcome ~ group, success = "yawn") %>% 
   calculate(stat = "diff in props", order = c("treatment", "control"))
null_dist_prop <- mb_yawn %>% 
   specify(outcome ~ group, success = "yawn") %>%
   hypothesize(null = "independence") %>% 
   generate(reps = 5000, type = "permute") %>% 
   calculate(stat = "diff in props", order = c("treatment", "control"))
null_dist_prop %>% 
   get_p_value(obs_stat = diff_props, direction = "greater")

Since the p-value is greater than \(\alpha\), we fail to reject the null hypothesis at the 0.05 significance level. Hence, we do not have enough evidence to suggest that seeing someone yawn makes a person more likely to yawn, so we cannot conclude that yawning and observing someone yawn are not independent.
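The permutation step behind `generate(type = "permute")` can be sketched in base R as follows. This is a minimal illustration under the independence null, where group labels are shuffled and the difference in proportions is recomputed each time. The `yawn_sample` data frame and the `diff_in_props()` helper are hypothetical; the counts are made up and are not the `mb_yawn` data.

```r
# Hypothetical data for illustration only (not the mb_yawn counts)
yawn_sample <- data.frame(
  group   = rep(c("treatment", "control"), times = c(12, 8)),
  outcome = c(rep(c("yawn", "not yawn"), times = c(4, 8)),
              rep(c("yawn", "not yawn"), times = c(2, 6)))
)

# Difference in proportions of "yawn", in the order treatment - control
diff_in_props <- function(outcome, group) {
  mean(outcome[group == "treatment"] == "yawn") -
    mean(outcome[group == "control"] == "yawn")
}

obs_diff <- diff_in_props(yawn_sample$outcome, yawn_sample$group)

set.seed(45618)

# Under independence the labels are exchangeable, so shuffle them and recompute
null_diffs <- replicate(
  5000,
  diff_in_props(yawn_sample$outcome, sample(yawn_sample$group))
)

# One-sided p-value: share of permuted differences at least as large as observed
p_value <- mean(null_diffs >= obs_diff)
p_value
```

Shuffling the group labels while keeping the outcomes fixed breaks any link between the two variables, which is what the independence null asserts.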

Stage, commit and push

  1. Stage your modified files.
  2. Commit your changes with an informative message.
  3. Push your changes to your GitHub repo.
  4. Verify your files were updated on GitHub.