Task

Price vs. a binary variable:

  • Fit a model predicting log(price) from one of the binary (0/1) variables in the dataset.

  • Write out the linear model in the form \(\widehat{y} = b_0 + b_1 x\), replacing \(y\) and \(x\) with the actual variable names and \(b_0\) and \(b_1\) with the estimated coefficients.

  • Interpret the slope and the intercept.

  • Calculate and interpret \(R^2\).
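As a rough illustration of the steps above, here is a minimal sketch on made-up data. The data frame `toy` and the 0/1 column `signed` are hypothetical stand-ins, not variables from the class dataset; substitute the real data and a binary variable from the codebook.

```r
# Toy data with made-up values; `signed` is a hypothetical 0/1 indicator.
toy <- data.frame(
  price  = c(100, 150, 80, 400, 600, 350),
  signed = c(0, 0, 0, 1, 1, 1)
)

# Regress log(price) on the binary predictor.
fit_bin <- lm(log(price) ~ signed, data = toy)
b <- coef(fit_bin)

# With a single 0/1 predictor, the intercept equals the mean of log(price)
# in the 0 group, and the slope equals the difference in group means.
mean_log_0 <- mean(log(toy$price[toy$signed == 0]))
mean_log_1 <- mean(log(toy$price[toy$signed == 1]))

# Because the response is logged, exp(slope) is the ratio of predicted
# prices for the 1 group relative to the 0 group.
price_ratio <- exp(unname(b["signed"]))

# Proportion of the variability in log(price) explained by the model.
r2_bin <- summary(fit_bin)$r.squared
```

Since the response is log(price), the slope is easiest to describe after exponentiating: `price_ratio` says how many times higher (or lower) the predicted price is when the indicator equals 1.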

Price vs. a numerical variable:

  • Fit a model predicting log(price) from one of the numerical variables in the dataset.

  • Write out the linear model, as above.

  • Interpret the slope and the intercept.

  • Calculate and interpret \(R^2\).
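The numerical case follows the same pattern. The sketch below again uses made-up data, with `width_in` as a hypothetical numerical predictor, and computes \(R^2\) by hand to show what the quantity measures.

```r
# Toy data with made-up values; `width_in` is a hypothetical numerical predictor.
toy <- data.frame(
  price    = c(60, 90, 120, 200, 260, 420),
  width_in = c(10, 15, 20, 25, 30, 40)
)

fit_num <- lm(log(price) ~ width_in, data = toy)
b <- coef(fit_num)

# Intercept: predicted log(price) when the predictor is 0, which may be an
# extrapolation with no meaningful interpretation in context.
# Slope: each additional unit of the predictor multiplies the predicted
# price by exp(slope); expressed here as a percent change.
pct_change <- (exp(unname(b["width_in"])) - 1) * 100

# R^2 by hand: 1 - (residual sum of squares) / (total sum of squares),
# both measured on the log(price) scale.
y   <- log(toy$price)
sse <- sum(residuals(fit_num)^2)
sst <- sum((y - mean(y))^2)
r2_manual <- 1 - sse / sst
```

The hand-computed `r2_manual` should agree with `summary(fit_num)$r.squared`; either way it describes variability in log(price), not in price itself.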

Price vs. material:

  • Recreate the recoding of the material variable, mat, from class.

  • Fit a model predicting log(price) from the recoded mat.

  • Write out the linear model, as above.

  • Interpret the slopes and the intercept.

  • Calculate and interpret \(R^2\).

  • Paintings with which material type are predicted to be the most expensive?
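For a predictor with more than two levels, the workflow might look like the sketch below. The raw codes (`m1` to `m4`) and the collapsed labels are hypothetical placeholders, not the recoding done in class; use the class recoding in your own work.

```r
# Toy data with made-up values and hypothetical material codes.
toy <- data.frame(
  price    = c(100, 140, 300, 360, 50, 70, 55, 65),
  material = c("m1", "m1", "m2", "m2", "m3", "m3", "m4", "m4")
)

# Hypothetical lookup table that collapses rare codes into "other".
lookup  <- c(m1 = "group_a", m2 = "group_b", m3 = "other", m4 = "other")
toy$mat <- factor(unname(lookup[toy$material]))

# lm() creates one indicator per non-baseline level. The baseline is the
# first factor level (here "group_a"): the intercept is its mean log(price),
# and each slope is a difference from that baseline.
fit_mat <- lm(log(price) ~ mat, data = toy)

# Predict for each level to see which has the highest predicted price.
new_levels <- data.frame(mat = factor(levels(toy$mat), levels = levels(toy$mat)))
new_levels$pred_log_price <- predict(fit_mat, newdata = new_levels)
new_levels$pred_price     <- exp(new_levels$pred_log_price)

top_level <- as.character(new_levels$mat[which.max(new_levels$pred_log_price)])
r2_mat    <- summary(fit_mat)$r.squared
```

Predicting once per level avoids adding coefficients by hand, and the level with the largest predicted log(price) is also the one with the largest predicted price.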

Synthesis:

At the end, write one synthesis paragraph that compares your models and determines which one does the best job of explaining the variability in painting prices. Your interpretations should be in the context of the data, which means you need to understand that context. Thankfully, your data expert will be available to answer questions on Piazza! (But don’t leave them until the last minute.)
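One way to line the models up for this comparison is sketched below, again on made-up data with hypothetical predictors `x_bin`, `x_num`, and `x_cat`. All three models share the same response, log(price), so their \(R^2\) values can be compared directly.

```r
# Toy data with made-up values; all predictor names are hypothetical.
toy <- data.frame(
  price = c(100, 150, 80, 400, 600, 350, 220, 90),
  x_bin = c(0, 0, 0, 1, 1, 1, 1, 0),
  x_num = c(12, 18, 10, 30, 42, 28, 20, 14),
  x_cat = factor(c("a", "b", "a", "c", "c", "b", "c", "a"))
)

# Fit the three single-predictor models on the same response.
models <- list(
  binary      = lm(log(price) ~ x_bin, data = toy),
  numerical   = lm(log(price) ~ x_num, data = toy),
  categorical = lm(log(price) ~ x_cat, data = toy)
)

# Collect R^2 for each model into one table, sorted from highest to lowest.
comparison <- data.frame(
  model     = names(models),
  r_squared = sapply(models, function(m) summary(m)$r.squared),
  row.names = NULL
)
comparison <- comparison[order(-comparison$r_squared), ]
```

The table only ranks the models; the synthesis paragraph still needs to say what the winning predictor means in the context of the paintings.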

Submission instructions

There should be an AppEx_09_29_2016.Rmd file in your team’s AppEx repo. Add your answers and write-up to that file, then commit and push your changes to GitHub.

Due date

Thursday, Oct 4, before class

Codebook