- Clone your assignment repo in RStudio Cloud (ae-10-model-selection-pp-TEAMNAME)
- In your R Markdown file, load the data the way we have been doing it in the course slides, paying attention to which character strings will be labelled as NAs upon reading the data.
- See here for the full data dictionary.
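As a reminder of how NA strings are handled on import, here is a minimal sketch. It does not use the course data: the toy file, its column names, and the "n/a" marker are assumptions for illustration, and base R's read.csv() is used only to keep the example dependency-free.

```r
# Toy CSV standing in for the real data; the columns and the "n/a"
# marker are hypothetical and only illustrate the mechanism.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "name,price,surface",
  "P1,360,n/a",
  "P2,n/a,886.5",
  "P3,12,"
), csv_path)

# na.strings lists every character string that should be read in as NA.
paintings <- read.csv(csv_path, na.strings = c("n/a", "", "NA"))

# Count the NAs per column to confirm the strings were converted.
colSums(is.na(paintings))
```

If you read the data with readr instead, the corresponding argument of read_csv() is na. Without the right NA strings, a numeric column containing "n/a" would be read in as character.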
Planning
Decide on a subset of variables to consider for your analysis.
- Think about it as focusing on certain aspects of the price determination, as opposed to all aspects.
- You’re allowed a maximum of 10 total variables to consider, including interactions.
- The more variables you consider, the longer model selection will take, so keep that in mind.
Decide which of these variables might make sense to interact. Remember, we consider interactions when the effect of one variable might differ across the levels of another variable. Ideally, you would get expert guidance for choosing interactions. Below is a list of potentially interesting interactions compiled by experts:
- School of painting & landscape variables: school_pntg & landsALL / lands_figs / lands_ment
- Landscapes & paired paintings: landsALL / lands_figs / lands_ment & paired
- Other artists & paired paintings: othartist & paired
- Size & paired paintings: surface & paired
- Size & figures: surface & figures / nfigures
- Dealer & previous owner: dealer & prevcoll
- Winning bidder & prevcoll: endbuyer & prevcoll
- Winning bidder & artist living: winningbiddertype & artistliving
This is not an exhaustive list, so you might come up with others.
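To make the idea of an interaction concrete, the sketch below fits a model with and without one. The data are simulated, not the course data: surface and paired borrow their names from the list above, while log_price and all the numbers are assumptions made up for illustration.

```r
set.seed(1)
n <- 200

# Simulated stand-in data; values are invented for illustration.
toy <- data.frame(
  surface = runif(n, 100, 2000),
  paired  = factor(sample(c(0, 1), n, replace = TRUE))
)

# The slope of surface is deliberately different for paired paintings.
toy$log_price <- 2 + 0.001 * toy$surface + 0.5 * (toy$paired == 1) +
  0.002 * toy$surface * (toy$paired == 1) + rnorm(n, sd = 0.5)

# Main effects only: one common slope for surface.
m_main <- lm(log_price ~ surface + paired, data = toy)

# surface * paired expands to surface + paired + surface:paired,
# so the slope of surface may differ between the levels of paired.
m_int <- lm(log_price ~ surface * paired, data = toy)

coef(m_int)
c(main = summary(m_main)$adj.r.squared,
  interaction = summary(m_int)$adj.r.squared)
```

The surface:paired1 coefficient is the difference in the surface slope between paired and unpaired paintings. Note that each interaction term counts toward your 10-variable limit.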
Model fitting, selection, and interpretation
- Use backward elimination to do model selection. Make sure to show each step of the decision process (though you don’t have to interpret the models at each stage).
- Yes, this is tedious. And yes, there are ways of automating it. But for now, go through the process “manually”, to get a good sense of how the model selection algorithm works.
- Provide interpretations for the slopes for the final model you arrive at and create at least one visualization that supports your narrative.
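A single step of backward elimination might look like the sketch below. It assumes adjusted R-squared as the selection criterion, which this page does not specify, so use whichever criterion the course slides prescribe. The data and the variable names (x1, x2, x3, y) are simulated placeholders, not the course data.

```r
set.seed(42)
n <- 300

# Simulated placeholder data: x3 is pure noise by construction.
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
toy$y <- 1 + 2 * toy$x1 - 1.5 * toy$x2 + rnorm(n)

# Helper (hypothetical name) to pull adjusted R-squared out of a fit.
adj_r2 <- function(model) summary(model)$adj.r.squared

# Step 0: fit the full model.
full <- lm(y ~ x1 + x2 + x3, data = toy)
terms_in_model <- attr(terms(full), "term.labels")

# Step 1: refit the model with each term removed in turn and
# record the adjusted R-squared of every reduced model.
candidates <- sapply(terms_in_model, function(term) {
  reduced <- lm(reformulate(setdiff(terms_in_model, term), response = "y"),
                data = toy)
  adj_r2(reduced)
})

adj_r2(full)  # baseline to beat
candidates    # adjusted R-squared after dropping each term
```

If the largest value in candidates exceeds the adjusted R-squared of the current model, drop that term and repeat the step on the reduced model; if none does, stop and keep the current model. Writing each step out this way, rather than looping until convergence, keeps every decision visible.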