Lab 05: Fixing merge conflicts and computing probabilities

Due: Thu, Mar 04 at 11:59pm ET

Goals

Getting started

Every team member should go to the course GitHub organization and locate their lab_05 repository, which should be named lab_05-<team name>. Copy the URL of the repository and clone the remote repo in RStudio.

Merge conflicts

You may have already encountered a merge conflict during your collaboration last week in Lab 04. When two collaborators make changes to a file and push them to the shared repository, git merges the two versions of the file.

If the two versions have conflicting content on the same line, git will produce a merge conflict. Merge conflicts must be resolved manually because git cannot decide which version to keep.

To resolve the merge conflict, decide if you want to keep only your text/code, the text/code on GitHub, or incorporate changes from both sets. Delete the conflict markers <<<<<<<, =======, >>>>>>> and make the changes you want in the final merge.
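For illustration, a conflicted region of a file typically looks like the sketch below after a pull. The author values are made up, and `<commit id>` is a placeholder for the label git prints to identify the incoming commit. The lines between `<<<<<<< HEAD` and `=======` are your local version; the lines between `=======` and `>>>>>>>` are the version from GitHub.

```text
<<<<<<< HEAD
author: "Something different"
=======
author: "Team name"
>>>>>>> <commit id>
```

To resolve this hypothetical conflict in favor of the GitHub version, you would delete the three marker lines and the first author line, leaving only the author line you want to keep.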

Assign each team member a number from 1 to 4 (1 to 3 if your team has only three members). Go through the following steps in detail; they simulate a merge conflict. Completing this exercise will be part of the lab grade.

Resolving a merge conflict

Step 1: Everyone should clone the team's lab repository and open the Rmd file.

Members 3 & 4 should look at the group’s repo on GitHub to ensure that the other members’ files are pushed to GitHub after every step.

Step 2: Member 1 should change the team name to your team name. Knit, commit, and push.

Step 3: Member 2 should change the team name to something different (i.e., not your team name). Knit, commit, and push.

Member 2 should get an error on the attempted push.

Pull and review the document with the merge conflict. Member 2 should display and read the error to the entire team. A merge conflict occurred because Member 2 edited the same part of the document as Member 1. Resolve the conflict with whichever name you want to keep (please keep your real team name), then knit, commit and push again.

Step 4: Member 3 or 4 should write some narrative below the last code chunk in the lab_05.Rmd file. Knit to PDF, then stage, commit, and push your .Rmd and PDF to GitHub.

You should get an error. Read this error to your teammates and show them the error by sharing your screen.

Pull and share your screen with your team. You should notice that the author line in the header has been updated. Knit to PDF, then stage, commit and push your .Rmd and PDF to GitHub.

Step 5: Everyone pull and delete the narrative. All team members should have the same content in the .Rmd file before proceeding to the exercises.

Exercises

Packages

library(tidyverse)

Data

The data comes from a cohort study of collegiate athletes using the National Collegiate Athletic Association (NCAA) Injury Surveillance System; certified athletic trainers recorded data during the 1997–2000 academic years.

The objective of the study was to examine sex differences in the incidence of concussions among collegiate athletes across three seasons in various sports.

More about the study can be found in the second reference of the References section.

concussion <- read_table("http://users.stat.ufl.edu/~winner/data/concussion.dat",
                         col_names = FALSE)

Write all R code according to the style guidelines discussed in class. Be especially careful to stay within the 80-character line limit. Display the resulting tibble for each exercise. Use dplyr functions where applicable.
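As a rough illustration of the expected style (one pipeline step per line, short lines, result displayed as a tibble), here is a minimal sketch on made-up data. The names toy, group, outcome, n, and toy_probs are hypothetical and are not part of the concussion data; the sketch only shows one possible way to turn grouped counts into proportions with dplyr.

```r
# dplyr is loaded as part of the tidyverse
library(dplyr)

# Made-up counts for illustration only; not the concussion data
toy <- tibble(
  group   = c("a", "a", "b", "b"),
  outcome = c("yes", "no", "yes", "no"),
  n       = c(2, 8, 1, 19)
)

# Within each group, divide the count for outcome "yes" by the group total
toy_probs <- toy %>%
  group_by(group) %>%
  summarise(probability = sum(n[outcome == "yes"]) / sum(n))

# Display the resulting tibble
toy_probs
```

Because each row of toy represents several individuals, the proportions are computed from summed counts rather than from the number of rows.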

For each probability exercise, assume we are randomly selecting individuals from this cohort study.
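As a reminder, and assuming that count records the number of athletes in each row's category, a conditional probability can be written as a ratio of summed counts. The notation n(E), meaning the sum of count over all rows where event E holds, is introduced here only for illustration.

```latex
\[
P(A \mid B) \;=\; \frac{P(A \cap B)}{P(B)} \;=\; \frac{n(A \cap B)}{n(B)}
\]
```

The second equality holds because both probabilities share the same denominator, the total count for the whole cohort, which cancels.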

  1. Take a look at the variable names in concussion. Rename them with rename() so they are gender, sport, year, concussed, and count. Overwrite concussion.

  2. Convert year and concussed to factors. Overwrite concussion.

  3. Compute the probability that a male from this cohort study had an incidence of a concussion. Do the same for females. Your output should have two variables (gender and probability) and two rows.

  4. Given that an athlete played soccer, what is the probability they had an incidence of a concussion? Your output should have one variable (probability) and one row. How does this number compare with your results in Exercise 3?

  5. Display in a bar plot the conditional probability of a concussion incident given sport and gender. It should look similar to what you see below, but you may implement your own theme and design. You don’t have to have the style features match to earn full credit here.

  6. Given that an athlete had a concussion incident, what is the probability it was a female soccer player? Your output should have one variable (probability) and one row.

References

“Datasets”. Users.Stat.Ufl.Edu, 2021, http://users.stat.ufl.edu/~winner/datasets.html. Accessed 20 Feb 2021.

T. Covassin, C.B. Swanik, M.L. Sachs (2003). “Sex Differences and the Incidence of Concussions Among Collegiate Athletes”, Journal of Athletic Training, Vol. 38(3), pp. 238-244.