A brief outline of the steps you need to get started is shown below. See the Lab 01 Instructions for more details about the steps.
use_git_config() function.
Here are some tips as you complete HW 01:
We will use the following packages in this assignment:
library(tidyverse)
library(modelr)
library(readr)
library(broom)
We use functions from the readr package to read in the bikeshare data from a .csv file. The broom package is used to display model output in a tidy format.
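As a minimal sketch of how these two packages fit together, the example below reads a small data set with read_csv() and prints a fitted model with tidy(). The data are made up for illustration and are not from bikeshare.csv; the names toy and toy_model are hypothetical. It assumes readr 2.0 or later, where I() marks a string as literal data and show_col_types silences the column message.

```r
library(readr)
library(broom)

# Hypothetical toy data given as literal text (not from bikeshare.csv).
# I() tells read_csv() to treat the string as file contents, not a path.
toy <- read_csv(I("x,y\n1,2.1\n2,3.9\n3,6.2\n4,7.8\n"), show_col_types = FALSE)

# Fit a simple linear regression and show one row per coefficient
toy_model <- lm(y ~ x, data = toy)
tidy(toy_model)
```

The tidy() output is a tibble, so it can be piped into other tidyverse functions or printed directly in a knitted document.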
For this assignment, you will continue analyzing the bikeshare dataset from Lab 01. The data come from Capital Bikeshare in Washington, D.C. Capital Bikeshare is a system in which customers can rent a bike at low cost, ride it around the city, and return it to a station near their destination. You can get more information about the bikeshare on their website, https://www.capitalbikeshare.com/. We will read in the data from the file bikeshare.csv located in the data folder.
bikeshare <- read_csv("data/bikeshare.csv")
This dataset contains information about the number of bike rentals, environmental conditions, and other details for each day in 2011 and 2012. This analysis will focus on the following variables:
Variable | Description |
---|---|
season | 1: Winter, 2: Spring, 3: Summer, 4: Fall |
temp | Temperature (in \(^{\circ}C\)) ÷ 41 |
count | Total number of bike rentals |
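Because temp stores the temperature divided by 41, multiplying by 41 recovers degrees Celsius. The short sketch below shows the arithmetic on a few made-up values; temp_normalized and temp_celsius are hypothetical names, and the numbers are not taken from bikeshare.csv.

```r
# Hypothetical normalized temperatures (not from bikeshare.csv)
temp_normalized <- c(0.25, 0.50, 1.00)

# Undo the division by 41 to get degrees Celsius
temp_celsius <- temp_normalized * 41
temp_celsius
```

Under this scaling, a temp value of 0.50 corresponds to 20.5 degrees Celsius.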
The Computations & Concepts section of the homework contains short answer questions about the concepts discussed in class. Some of these questions may also require short chunks of code to produce the output needed to answer the question. Answers should be written in complete sentences.
temp_c that is calculated as temp * 41. Fill in the code below to create a new data frame with the modifications stated above and assign the data frame to winter_data.
_____ <- bikeshare %>%
  filter(_____________) %>%
  mutate(_____________)
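For reference, the same filter-then-mutate pattern is sketched below on mtcars, a dataset that ships with R, rather than on the bikeshare data. The names four_cyl and wt_lb are hypothetical and chosen only for this illustration.

```r
library(dplyr)

# mtcars records wt in units of 1000 lb
four_cyl <- mtcars %>%
  filter(cyl == 4) %>%        # keep only the rows for four-cylinder cars
  mutate(wt_lb = wt * 1000)   # add a new column with the weight in pounds

head(four_cyl)
```

filter() keeps the rows that satisfy a condition, and mutate() adds a column computed from existing ones while leaving the original columns in place.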
temp_c as the predictor variable and count as the response and assign the results to winter_model. Use the code below to display the model coefficients along with the test statistics and confidence intervals for the coefficients.
winter_model %>%
  tidy(conf.int = TRUE)
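To show what this output looks like, here is a small sketch on mtcars (not the bikeshare data). The names car_model and car_output are hypothetical. With conf.int = TRUE, tidy() adds conf.low and conf.high columns, and the conf.level argument, which defaults to 0.95, sets the confidence level.

```r
library(broom)

# Illustrative model on a built-in dataset, not the homework model
car_model <- lm(mpg ~ wt, data = mtcars)

# One row per coefficient: estimate, std.error, statistic, p.value,
# plus the lower and upper bounds of the confidence interval
car_output <- tidy(car_model, conf.int = TRUE)
car_output
```

The interval bounds should match those returned by base R's confint(car_model).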
Interpret the 95% confidence interval for \(\beta_1\) in the context of the data.
Suppose we now calculate a 90% confidence interval for \(\beta_1\). Would the width of the interval be larger, smaller, or the same as the width of the 95% confidence interval calculated in the previous question? Briefly explain.
Based on the confidence interval, is there a statistically significant linear relationship between the temperature and the number of bike rentals in the winter? Briefly explain your reasoning.
What is the value of the test statistic associated with the null hypothesis \(H_0: \beta_1 = 0\)? Interpret this value in the context of the problem.
Suppose your roommate reads the regression output and says, “the probability that the slope is not 0 is 7.28e-25.” Is your roommate correct? Briefly explain.
Use the code below to calculate \(R^2\), and interpret this value in the context of the problem.
rsquare(winter_model, winter_data)
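As a quick check on what rsquare() returns, the sketch below fits an illustrative model on mtcars (not the bikeshare data) and compares the result with the value reported by summary(). The name car_model is hypothetical.

```r
library(modelr)

# Illustrative model on a built-in dataset, not the homework model
car_model <- lm(mpg ~ wt, data = mtcars)

# rsquare() takes the fitted model and the data it is evaluated on
rsquare(car_model, mtcars)

# For a linear model evaluated on its own training data, this should
# agree with the R-squared reported by summary()
summary(car_model)$r.squared
```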
The Data Analysis section of the homework contains open-ended data analysis questions. Your response should be neatly organized and read as a complete narrative. This means that, in addition to addressing the question, your response should include data exploration and an analysis of the model assumptions. In short, these questions should be treated as “mini-projects”.
Use simple linear regression to describe the relationship between the temperature (temp_c) and the number of bike rentals (count) for the spring season. The description should include discussion about the significance of the relationship and interpretations of any relevant model coefficients. Your response should also include data exploration and a discussion of the assumptions.
Total | 60 |
---|---|
Questions 1 - 8 | 30 |
Question 9 | 20 |
Documents neatly organized (Markdown and knitted documents) | 5 |
Grammar and writing quality | 3 |
Regular and informative commit messages | 2 |