San Francisco Rent

This week we will be predicting rent prices in San Francisco. The dataset comes from Craigslist postings from 2001 through 2018 in the San Francisco Bay Area, originally collected by Dr. Kate Pennington and available on her website here. We are using just a subset of the original data. The dataset is found in your GitHub repository for Lab 4.

The relevant variables are as follows (though note that there are quite a few other variables as well):

Exercises

  1. We are interested in predicting the price of a Craigslist apartment listing using the county, the number of beds, and the year of the posting. Create a linear model using these three predictors, treating county and year as categorical variables and beds as a continuous numeric variable. Evaluate the linearity and constant variance assumptions in this model using a well-labeled plot.
  2. Apply any relevant transformation(s) to your model such that linearity and constant variance are satisfied. Provide the form of your final model, as well as any well-labeled plots that support your case.
  3. Interpret the intercept estimate, slope parameter corresponding to beds, and the slope parameter corresponding to the year 2018 in your model.
  4. Now create an inflation-adjusted model. Go to the U.S. Bureau of Labor Statistics' Inflation Calculator here. Use "January" as the relevant month for the adjustment (that is, assume all observations in the dataset were scraped in January of the relevant year), and use appropriate multipliers to standardize all prices to 2020 dollars. Fit the same model as you did in Exercise 2, and compare the slope parameters corresponding to year in the two models. What do you notice?
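
The price adjustment in Exercise 4 amounts to multiplying each listing's price by a multiplier specific to its year. The following is a minimal sketch of that step in Python, not a required approach. The multiplier values, the function name adjust_to_2020, and the toy listings are all hypothetical placeholders; the real multipliers must be looked up in the BLS Inflation Calculator.

```python
# Hypothetical multipliers: dollars in January 2020 per one dollar in January
# of the listing year. These are PLACEHOLDERS, not real CPI figures; look up
# the actual values with the BLS Inflation Calculator.
MULTIPLIER_TO_2020 = {
    2001: 1.50,  # placeholder value
    2010: 1.20,  # placeholder value
    2018: 1.05,  # placeholder value
}


def adjust_to_2020(price, year, multipliers=MULTIPLIER_TO_2020):
    """Convert a nominal price posted in `year` into 2020 dollars."""
    if year not in multipliers:
        raise KeyError(f"no multiplier recorded for {year}")
    return price * multipliers[year]


# Toy listings (made-up values), each a (year, nominal price) pair.
listings = [(2001, 1000.0), (2010, 2000.0), (2018, 3000.0)]

# Apply the year-specific multiplier to every listing.
adjusted = [(year, adjust_to_2020(price, year)) for year, price in listings]

for year, price_2020 in adjusted:
    print(year, round(price_2020, 2))
```

A lookup keyed by year keeps the adjustment explicit: every listing from the same year is scaled by the same constant, and a year with no recorded multiplier raises an error instead of being silently left unadjusted.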

There should only be one submission per team on Gradescope. All team members must make at least one meaningful commit to the repository for this week's lab.