Hertzsprung–Russell diagrams
Hertzsprung–Russell diagrams are visualizations that show the
relationship between the brightness of stars and their temperatures.
Before starting the lab, read about H-R diagrams here.
A repository has already been created for your team and will be
available in the course GitHub organization. The dataset for this
assignment is a CSV file in the data folder of your repository. It
contains data on over six thousand stars, taken from the General
Catalogue of Trigonometric Stellar Parallaxes. There are only three
variables in the provided dataset:
- Vmag: Apparent visual band magnitude, a measure of brightness.
Don’t worry about the units for the purposes of this lab.
- Bvcol: The B-V color of the star, which usually corresponds to the
temperature of the star (generally, negative values correspond to hot
bluish stars, values around 0.5 are white, and values above 1 or so
are cooler orange to red stars). Don’t worry about the units for the
purposes of this lab.
- parallax: Parallax in arcseconds, a measure of the distance of the
star from Earth.
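To make the role of parallax more concrete, here is a minimal Python sketch (the lab itself does not prescribe a language, so Python is an assumption). It reads rows with the three columns above and applies two standard relations: distance in parsecs is the reciprocal of parallax in arcseconds, and absolute magnitude is M = m + 5 + 5·log10(p). The inline rows are synthetic placeholders, not values from the lab dataset, and the helper names are hypothetical. In practice you would open the CSV file from your repository's data folder instead.

```python
import csv
import io
import math

# Synthetic rows for illustration only; these are NOT values from the lab dataset.
SAMPLE_CSV = """Vmag,Bvcol,parallax
5.0,0.65,0.1
8.0,1.20,0.05
"""


def load_stars(handle):
    """Read rows with Vmag, Bvcol, and parallax columns into dicts of floats."""
    reader = csv.DictReader(handle)
    return [{key: float(value) for key, value in row.items()} for row in reader]


def distance_parsecs(parallax_arcsec):
    """Distance in parsecs is the reciprocal of the parallax in arcseconds.

    Assumes a strictly positive parallax.
    """
    return 1.0 / parallax_arcsec


def absolute_magnitude(vmag, parallax_arcsec):
    """Absolute magnitude from apparent magnitude m and parallax p (arcsec).

    Uses M = m + 5 + 5 * log10(p); assumes a strictly positive parallax.
    """
    return vmag + 5.0 + 5.0 * math.log10(parallax_arcsec)


stars = load_stars(io.StringIO(SAMPLE_CSV))
for star in stars:
    d = distance_parsecs(star["parallax"])
    m_abs = absolute_magnitude(star["Vmag"], star["parallax"])
    print(f"Vmag={star['Vmag']:.2f}  distance={d:.1f} pc  absolute magnitude={m_abs:.2f}")
```

A real catalogue may contain missing, zero, or negative parallax values, which this sketch does not handle. Check for those before applying either formula.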
Exercises
- Create a linear model that predicts visual band magnitude with B-V
color and parallax. Assess the assumptions of this linear model (you may
assume independence is satisfied). Show any relevant plots.
- Create a scatterplot that has visual band magnitude on the y-axis
and B-V color on the x-axis (no need to visualize parallax in this
scatterplot; there is also no need to provide a title). Compare this to
an actual H-R diagram, for instance as depicted in the brief article
referenced above. What is an important variable that is missing from the
dataset that might help you best predict visual magnitude using B-V
color? Hint: there is a single correct answer here that
we’re looking for in particular. It’s not luminosity - we already have
this (it’s technically calculable from parallax and apparent visual
magnitude).
- While adjusting for parallax, do you think the missing variable from
Ex. 2 should enter your model as a main effect only, or additionally
with an interaction term? Explain.
- Write out your chosen model from Ex. 3 in correct mathematical
notation. Interpret the parameter corresponding to the intercept in
context of your model. Finally, provide the expected change in visual
band magnitude for a single unit increase in B-V color (while holding
parallax constant) using notation from your written model.
Hint: there shouldn’t be actual numbers here.
- (Optional, if you’re bored) Extend your visualization from Ex. 2 to
create an actual H-R diagram that additionally uses a realistic color
scheme for the stars. Some colors and their corresponding B-V values
are available here. You may use the RGB hex color #ffc66d for the
largest B-V value and #9bb0ff for the smallest B-V value. Please cite
any sources you use in the preparation of this plot! (This is a
completely optional question; you can’t lose points for attempting
it.)
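For the optional exercise, one simple way to turn B-V values into plot colors is to blend linearly between the two hex endpoints given above. The Python sketch below does that. Note that a two-color linear ramp in RGB space is a simplifying assumption rather than a physically accurate star color scale, the function names are hypothetical, and the B-V range in the usage line is an illustrative placeholder that should be replaced by the minimum and maximum in your data.

```python
HOT_END = "#9bb0ff"   # color suggested for the smallest B-V value
COOL_END = "#ffc66d"  # color suggested for the largest B-V value


def hex_to_rgb(hex_color):
    """Convert a '#rrggbb' string to a tuple of three integers in 0..255."""
    hex_color = hex_color.lstrip("#")
    return tuple(int(hex_color[i:i + 2], 16) for i in (0, 2, 4))


def rgb_to_hex(rgb):
    """Convert a tuple of three integers in 0..255 to a '#rrggbb' string."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)


def bv_to_hex(bv, bv_min, bv_max):
    """Linearly blend between HOT_END and COOL_END according to a B-V value.

    Values outside [bv_min, bv_max] are clamped to the nearest endpoint.
    """
    if bv_max <= bv_min:
        raise ValueError("bv_max must be greater than bv_min")
    t = (bv - bv_min) / (bv_max - bv_min)
    t = min(1.0, max(0.0, t))
    start = hex_to_rgb(HOT_END)
    end = hex_to_rgb(COOL_END)
    blended = tuple(round(s + t * (e - s)) for s, e in zip(start, end))
    return rgb_to_hex(blended)


# Illustrative range only; use the observed minimum and maximum B-V instead.
print(bv_to_hex(0.8, bv_min=-0.4, bv_max=2.0))
```

The resulting hex strings can be passed as per-point colors to most plotting libraries. A more realistic scheme would interpolate through several intermediate colors from a cited source rather than just two endpoints.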
There should only be one submission per team
on Gradescope.