Lab 06 - Modeling Course Evaluations

Due: Friday, Mar 6 at 11:59pm


Many college courses conclude by giving students the opportunity to evaluate the course and the instructor anonymously. However, the use of these student evaluations as an indicator of course quality and teaching effectiveness is often criticized because these measures may reflect the influence of non-teaching-related characteristics. The article “Beauty in the classroom: instructors’ pulchritude and putative pedagogical productivity” (Hamermesh and Parker, 2005) found that instructors who are viewed as better looking receive higher instructional ratings. (Daniel S. Hamermesh, Amy Parker, Beauty in the classroom: instructors’ pulchritude and putative pedagogical productivity, Economics of Education Review, Volume 24, Issue 4, August 2005, Pages 369-376, ISSN 0272-7757, doi:10.1016/j.econedurev.2004.07.013.)

For this assignment you will analyze data from this study.

Getting started

Clone your assignment repo into RStudio Cloud and open the R Markdown file. Don’t forget to load in the necessary packages and configure git:

If you would like your git password cached for a week for this project, type the following in the Terminal:
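The command takes roughly the following form. This is a sketch that assumes Git's built-in credential cache helper and a `--global` scope; the `timeout_seconds` variable is introduced here only to make the 7-day arithmetic explicit.

```shell
# 7 days expressed in seconds: 7 * 24 * 60 * 60 = 604800
timeout_seconds=$((7 * 24 * 60 * 60))

# Tell Git to keep credentials in memory for that long
# (assumes the built-in "cache" credential helper is available)
git config --global credential.helper "cache --timeout=${timeout_seconds}"
```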

You will need to enter your GitHub username and password one more time after caching the password. After that, you won’t need to enter your credentials again for 7 days (604800 seconds). Note that this applies only to this single RStudio Cloud project; you will need to cache your credentials for each project you create.


The data were gathered from end of semester student evaluations for a large sample of professors from the University of Texas at Austin. In addition, six students rated the professors’ physical appearance. (This is a slightly modified version of the original data set that was released as part of the replication data for Data Analysis Using Regression and Multilevel/Hierarchical Models (Gelman and Hill, 2007).) The result is a data frame where each row contains a different course and columns represent variables about the courses and professors.

To get started, read in the data and save it as an object named evals.


Variable name    Description
score            Average professor evaluation score: (1) very unsatisfactory - (5) excellent
rank             Rank of professor: teaching, tenure track, tenured
ethnicity        Ethnicity of professor: not minority, minority
gender           Gender of professor: female, male
language         Language of school where professor received education: English or non-English
age              Age of professor
cls_perc_eval    Percent of students in class who completed evaluation
cls_did_eval     Number of students in class who completed evaluation
cls_students     Total number of students in class
cls_level        Class level: lower, upper
cls_profs        Number of professors teaching sections in course in sample: single, multiple
cls_credits      Number of credits of class: one credit (lab, PE, etc.), multi credit
bty_f1lower      Beauty rating of professor from lower level female: (1) lowest - (10) highest
bty_f1upper      Beauty rating of professor from upper level female: (1) lowest - (10) highest
bty_f2upper      Beauty rating of professor from second upper level female: (1) lowest - (10) highest
bty_m1lower      Beauty rating of professor from lower level male: (1) lowest - (10) highest
bty_m1upper      Beauty rating of professor from upper level male: (1) lowest - (10) highest
bty_m2upper      Beauty rating of professor from second upper level male: (1) lowest - (10) highest


Write all R code according to the style guidelines discussed in class. Be especially careful to keep each line of code within the 80-character limit.

All team members must commit and push to receive full credit.

In addition to lm(), factor(), and c(), your code should only contain functions from the loaded R packages, unless an exercise states otherwise.

Data wrangling

  1. Create a new variable called bty_avg that is the average attractiveness score given by the six students for each professor (bty_f1lower through bty_m2upper). Add this new variable to the evals data frame. Do this in one pipe, using the rowwise() function. Template code is given below to guide you in the right direction; however, you will need to fill in the blanks.

Some model building

  2. Fit a linear model with the goal of predicting average professor evaluation score based on average beauty rating (bty_avg) only. Write out the linear model, and note \(R^2\) and adjusted \(R^2\).

  3. Fit a linear model with the goal of predicting average professor evaluation score based on average beauty rating (bty_avg) and gender. Write out the linear model, and note \(R^2\) and adjusted \(R^2\).

  4. Interpret the slope and intercept of the model in Exercise 3 in the context of the data.

  5. What is the equation of the line corresponding to male professors for the model in Exercise 3?

  6. For two professors who received the same beauty rating, which gender tends to have the higher course evaluation score?

  7. How does the relationship between beauty and evaluation score vary between male and female professors?

  8. How do the adjusted \(R^2\) values of the models from Exercises 2 and 3 compare? What does this tell us about how useful gender is in explaining the variability in evaluation scores when we already have information on the beauty score of the professor?

  9. Compare the slopes of bty_avg under the two models. Has the addition of gender to the model changed the parameter estimate (slope) for bty_avg?

  10. Create a new model called m_bty_rank using rank and bty_avg to predict score. Write the equation of the linear model and interpret the slopes and intercept in the context of the data.

Model selection

Going forward, only consider the following variables as potential predictors: rank, ethnicity, language, age, cls_perc_eval, cls_did_eval, cls_students, cls_level, cls_profs, cls_credits, bty_avg.

  11. Which variable on its own would you expect to be the worst predictor of evaluation scores? Why?

  12. Check your suspicion from the previous exercise by fitting a linear model with that variable as the single predictor. Explain whether your suspicion is warranted based on some result from the model.

  13. Suppose you wanted to fit a full model with the variables listed above. If you are already going to include cls_perc_eval and cls_students, which variable should you not include as an additional predictor? Why?

  14. Fit a full model with all predictors listed above (except for the one you decided to exclude in Exercise 13).

Use the step() function. You’ll need to look at its help page to see how to set it up for backward elimination. In your code chunk, set results = "hide" to suppress the traced steps of the backward elimination process.

  15. Use backward elimination with AIC as the criterion to determine the best model. You do not need to show all steps in your answer, just the output for the final model. What are the \(R^2\) and adjusted \(R^2\) values?

  16. Interpret the slopes of one continuous and one categorical predictor based on your final model.

  17. Explain how you would assess whether the linearity assumption is satisfied for this final model. Would it still make sense to use this model if the linearity assumption were severely violated?

  18. Would you be comfortable making predictions with this model for professors at any university? Why or why not?


Knit to PDF to create a PDF document. Stage and commit all remaining changes, and push your work to GitHub. Make sure all files are updated on your GitHub repo.

Please upload only your PDF document to Gradescope. Associate the “Overall” graded section with the first page of your PDF, and mark the page where the answer to each exercise appears. If an answer spans multiple pages, mark all of those pages.

Only one team member needs to submit for the group. After you hit submit, go to View or edit group and select all your team members from the drop-down menu.