Hertzsprung–Russell diagrams

Hertzsprung–Russell diagrams are visualizations that show the relationship between the brightness of stars and their temperatures. Before starting the lab, read about H-R diagrams here.

A repository has already been created for your team and is available in the course GitHub organization. The dataset for this assignment is a CSV file in the data folder of your repository. It contains measurements for over six thousand stars, taken from the General Catalogue of Trigonometric Stellar Parallaxes. The provided dataset has only three variables: visual band magnitude, B-V color, and parallax.

Exercises

  1. Create a linear model that predicts visual band magnitude with B-V color and parallax. Assess the assumptions of this linear model (you may assume independence is satisfied). Show any relevant plots.
  2. Create a scatterplot that has visual band magnitude on the y-axis and B-V color on the x-axis (there is no need to visualize parallax in this scatterplot, and no need to provide a title). Compare this to an actual H-R diagram, such as the one depicted in the brief article referenced above. What important variable is missing from the dataset that might help you best predict visual magnitude using B-V color? Hint: there is a single correct answer that we’re looking for in particular. It’s not luminosity; we already have that (it’s technically calculable from parallax and apparent visual magnitude).
  3. While adjusting for parallax, do you think the missing variable from Ex. 2 should enter your model as a main effect only, or additionally with an interaction term? Explain.
  4. Write out your chosen model from Ex. 3 in correct mathematical notation. Interpret the parameter corresponding to the intercept in the context of your model. Finally, provide the expected change in visual band magnitude for a single unit increase in B-V color (while holding parallax constant) using notation from your written model. Hint: there shouldn’t be actual numbers here.
  5. (Optional, if you’re bored.) Extend your visualization from Ex. 2 to create an actual H-R diagram that additionally uses a realistic color scheme for the stars. Some colors and their corresponding B-V values are available here. You may use RGB hex color #ffc66d for the largest B-V value and #9bb0ff for the smallest B-V value. Please cite any sources you use in the preparation of this plot! (This question is completely optional; you can’t lose points for attempting it.)
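
For Ex. 1, a minimal sketch of an ordinary least squares fit may help fix ideas. It is written in Python with NumPy, which is an assumption, since the lab does not prescribe a language. The data are synthetic, and the names `vmag`, `bv`, `parallax`, and `fit_linear_model` are hypothetical stand-ins rather than the columns of the provided dataset.

```python
import numpy as np

# Synthetic stand-in data (assumption): the variable meanings mirror the lab,
# but the values and coefficients below are invented for illustration only.
rng = np.random.default_rng(0)
n = 200
bv = rng.uniform(-0.3, 2.0, size=n)          # hypothetical B-V color
parallax = rng.uniform(5.0, 100.0, size=n)   # hypothetical parallax
vmag = 6.0 + 2.5 * bv - 0.03 * parallax + rng.normal(0.0, 0.5, size=n)


def fit_linear_model(y, *predictors):
    """Fit y on the predictors by ordinary least squares, with an intercept.

    Returns the coefficients (intercept first), fitted values, and residuals.
    """
    X = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ coef
    return coef, fitted, y - fitted


coef, fitted, resid = fit_linear_model(vmag, bv, parallax)
print("intercept, B-V slope, parallax slope:", coef)
```

The `fitted` and `resid` arrays are the usual inputs for assessing assumptions: a plot of residuals against fitted values speaks to linearity and constant variance, and a normal Q-Q plot of the residuals speaks to normality. On the real data, the patterns in those plots are what Ex. 1 asks you to comment on.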
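
For the optional Ex. 5, one simple way to turn B-V values into colors is to blend linearly between the two hex colors given above. The sketch below, in Python, assumes a two-stop gradient, which is a simplification: a realistic scheme would use the intermediate colors from the referenced table. The function names `hex_to_rgb`, `rgb_to_hex`, and `bv_to_hex` are hypothetical helpers, not part of any library.

```python
def hex_to_rgb(hex_color):
    """Convert '#rrggbb' to an (r, g, b) tuple of integers in 0..255."""
    h = hex_color.lstrip("#")
    return tuple(int(h[i:i + 2], 16) for i in (0, 2, 4))


def rgb_to_hex(rgb):
    """Convert an (r, g, b) tuple of integers in 0..255 to '#rrggbb'."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)


def bv_to_hex(bv, bv_min, bv_max, smallest="#9bb0ff", largest="#ffc66d"):
    """Blend linearly from `smallest` (at bv_min) to `largest` (at bv_max).

    Two-stop interpolation is an assumption made for illustration;
    values outside [bv_min, bv_max] are clamped to the end colors.
    """
    if bv_max <= bv_min:
        raise ValueError("bv_max must be greater than bv_min")
    t = (bv - bv_min) / (bv_max - bv_min)
    t = min(1.0, max(0.0, t))
    lo_rgb, hi_rgb = hex_to_rgb(smallest), hex_to_rgb(largest)
    mixed = tuple(round(a + t * (b - a)) for a, b in zip(lo_rgb, hi_rgb))
    return rgb_to_hex(mixed)


# Example: colors for a few hypothetical B-V values on a 0..2 range.
colors = [bv_to_hex(value, 0.0, 2.0) for value in (0.0, 1.0, 2.0)]
print(colors)
```

The resulting list of hex strings can be passed as per-point colors to most plotting libraries. When drawing the diagram, remember that H-R diagrams conventionally place brighter stars at the top, so a magnitude axis is usually inverted (smaller magnitudes are brighter).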

There should only be one submission per team on Gradescope.