From last time

Application Exercise 15

Explore the characteristics of the various schools of painting (school_pntg). You can conduct statistical inference if you like, but the main goal of this application exercise is to familiarize yourself with the data and to find similarities and differences between the schools. Make sure your exploration includes some visualization and some summary.
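
One possible starting point for the exercise, sketched on a tiny made-up data frame so that it runs on its own. The column names school_pntg, price and height_in are the ones used in these slides; the values and the school labels A, B and C are invented for illustration. Replacing toy with pp would apply the same steps to the real data.

```r
# Toy stand-in for pp: values and school labels are invented for illustration.
toy <- data.frame(
  school_pntg = c("A", "A", "B", "B", "B", "C"),
  price       = c(100, 300, 50, 70, 90, 1000),
  height_in   = c(20, 30, 15, NA, 25, 40)
)

# Summary: number of paintings and median price per school.
n_by_school      <- table(toy$school_pntg)
median_by_school <- tapply(toy$price, toy$school_pntg, median)

school_summary <- data.frame(
  school_pntg  = names(median_by_school),
  n            = as.vector(n_by_school),
  median_price = as.vector(median_by_school)
)
print(school_summary)

# Visualization: side-by-side boxplots of log(price) by school.
boxplot(log(price) ~ school_pntg, data = toy,
        xlab = "school_pntg", ylab = "log(price)")
```

The median is used here rather than the mean because it is less sensitive to a few very expensive paintings; that choice is a suggestion, not part of the exercise.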

Modeling price

Distribution of price

library(ggplot2)  # provides qplot()
qplot(price, data = pp)  # histogram of price

Price vs. …

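# Scatterplots of price against height and against width, side by side
plot1 = qplot(height_in, price, data = pp)
plot2 = qplot(width_in, price, data = pp)
# multiplot() is not part of ggplot2; it must be defined or loaded separately (e.g. from the Rmisc package)
multiplot(plot1, plot2, cols = 2)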
## Warning: Removed 252 rows containing missing values (geom_point).
## Warning: Removed 256 rows containing missing values (geom_point).
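
Each warning reports the rows dropped from one scatterplot because a value needed to draw the point was missing. A minimal sketch of how one might count such rows before plotting, using an invented toy data frame in place of pp (the values are assumptions for illustration only):

```r
# Toy stand-in for pp; the values are invented for illustration.
toy <- data.frame(
  price     = c(100, 300, NA, 70, 90),
  height_in = c(20, NA, 15, NA, 25),
  width_in  = c(10, 12, 14, NA, 18)
)

# A point is dropped when either of its two coordinates is missing.
n_dropped_height <- sum(!complete.cases(toy[, c("height_in", "price")]))
n_dropped_width  <- sum(!complete.cases(toy[, c("width_in", "price")]))

n_dropped_height
n_dropped_width
```

Run on pp, the same two counts would be expected to match the 252 and 256 in the warnings, assuming missing values are the only reason rows were removed.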

Regression diagnostics

pr_h = lm(price ~ height_in, data = pp)
plot1 = qplot(pr_h$fitted.values, pr_h$residuals)  # residuals vs. fitted values
plot2 = qplot(pr_h$residuals)                      # histogram of residuals
plot3 = qplot(sample = pr_h$residuals)             # normal Q-Q plot of residuals
multiplot(plot1, plot2, plot3, cols = 3)
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.
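
The message above says that the histogram of residuals fell back on a default bin width of range/30. A sketch of what that default amounts to and of setting the width explicitly instead; the simulated residuals and the width of 0.5 are assumptions for illustration, and the plotting step is skipped if ggplot2 is not installed.

```r
set.seed(1)
res <- rnorm(200)  # simulated stand-in for pr_h$residuals

# The default described in the message: the range of the data divided by 30.
default_binwidth <- diff(range(res)) / 30

# An explicit choice instead (0.5 is an arbitrary illustrative value).
chosen_binwidth <- 0.5

if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(data.frame(res = res), ggplot2::aes(x = res)) +
    ggplot2::geom_histogram(binwidth = chosen_binwidth)
  print(p)
}
```

For the model in these slides, a sensible width depends on the scale of price, so it is worth trying a few values and comparing the resulting shapes.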

Log transformation

# Histogram of log(price), then log(price) against height and against width
plot1 = qplot(log(price), data = pp)
plot2 = qplot(height_in, log(price), data = pp)
plot3 = qplot(width_in, log(price), data = pp)
multiplot(plot1, plot2, plot3, cols = 3)
## Warning: Removed 252 rows containing missing values (geom_point).
## Warning: Removed 256 rows containing missing values (geom_point).

Regression diagnostics after log transformation

log_pr_h = lm(log(price) ~ height_in, data = pp)
plot1 = qplot(log_pr_h$fitted.values, log_pr_h$residuals)  # residuals vs. fitted values
plot2 = qplot(log_pr_h$residuals)                          # histogram of residuals
plot3 = qplot(sample = log_pr_h$residuals)                 # normal Q-Q plot of residuals
multiplot(plot1, plot2, plot3, cols = 3)
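
Once price is modeled on the log scale, the slope describes a multiplicative change in price rather than an additive one. A sketch with simulated data showing how one might back-transform the coefficient; the true intercept of 4, slope of 0.02 and noise level are invented for illustration and say nothing about the real pp data.

```r
set.seed(42)
n <- 500
height_in <- runif(n, 5, 60)
# Simulated prices: log(price) is linear in height_in plus normal noise.
price <- exp(4 + 0.02 * height_in + rnorm(n, sd = 0.5))
sim <- data.frame(price, height_in)

fit <- lm(log(price) ~ height_in, data = sim)
slope <- coef(fit)[["height_in"]]

# exp(slope) is the factor by which the predicted median price is
# multiplied for each additional inch of height.
exp(slope)
```

Applied to the fitted model above, the analogous quantity would be exp(coef(log_pr_h)[["height_in"]]).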