STA 210: Final Project Instructions
Fall 2018

Introduction

The goal of the final project is to apply what you’ve learned in this course to conduct a statistical analysis. It should be an in-depth regression analysis of a question that interests you. This question may come from one of your other courses, your research interests, your future career interests, etc.


The project will consist of two components:

  • Project proposal: due Tuesday, November 13
  • Poster: Saturday, December 15, 2p - 5p (final exam period)


You may work on the project in groups of two or three. I encourage you to work in teams; however, you may work on the project individually if you strongly prefer.

Finding Data

It is best to start with the question of interest and find the data second. As you’re looking for data, keep in mind that your regression analysis must be done in RStudio. Once you find a data set, make sure you are able to load it into RStudio, especially if it is in a format we haven’t used in class before. If you’re having trouble loading your data set into RStudio, ask for help as soon as possible, so you can make any necessary adjustments before the project proposal is due.
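As a rough illustration of the "make sure it loads" check, here is a minimal sketch using only base R. The file, the data frame name project_data, and the variables are all hypothetical stand-ins created for the example; swap in the path to your own file, and use a different reader if your data is not a CSV.

```r
# Hypothetical stand-in for a downloaded file (an assumption for this sketch);
# replace data_path with the path to your own data.
data_path <- tempfile(fileext = ".csv")
write.csv(
  data.frame(id = 1:3,
             score = c(88.5, 92.0, 79.5),
             group = c("a", "b", "a")),
  data_path,
  row.names = FALSE
)

# Read the file into a data frame.
project_data <- read.csv(data_path, stringsAsFactors = FALSE)

# Quick checks that the data loaded as expected.
dim(project_data)             # number of rows and columns
str(project_data)             # variable names and types
colSums(is.na(project_data))  # count of missing values per variable
```

If the dimensions or variable types look wrong at this stage, that is a good sign to ask for help before the proposal is due.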

The Data and Visualization Services team (located in Bostock Library) has written a guide for finding data for a regression analysis. Please visit the R Data Resources for Regression Analysis for guidelines to consider as you search for data, along with suggestions for potential data sources.

Note: You may not use data sets we’ve analyzed for assignments or examples in this class.

Project Components

The final project is worth 15% of the final course grade. The project grade will be broken down as follows:

Project Proposal - Due November 13 at 11:59p

There are two main purposes of the project proposal:

  1. To help you think about the project early, so you can get a head start on finding data, reading relevant literature, thinking about the questions you wish to answer, etc.
  2. To ensure that the data you wish to analyze, the methods you plan to use, and the scope of your analysis are feasible and will help you succeed on this project.

The proposal should be typed using R Markdown and submitted as a PDF document on Sakai. The proposal should include the following:

  • Introduction: This should be a brief description (about 1 to 2 paragraphs) of the problem/question you’re interested in analyzing.
  • Data: This should include the data source and a description of the data: the variables in the data set, the number of observations, and the groups represented (if applicable). In addition to the description, use the glimpse() function to show an overview of the data.
  • Analysis: This should include a brief description of how you plan to analyze the data. The analysis plan should have the following components:
    • Description of the response variable and its variable type
    • Description of the explanatory variables and the population coefficients you wish to understand using statistical inference
    • Hypothesis/hypotheses regarding your question of interest
    • Proposed methods to use for the analysis (this may be updated later on as you conduct the analysis)
  • Reference: This should include the source of your data along with literature you plan to use as a reference.
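For the Data component above, the following is a minimal sketch of what the glimpse() call might look like in your R Markdown document. It assumes the dplyr package is installed; the data frame project_data and its variables are hypothetical placeholders for your own data.

```r
library(dplyr)

# Hypothetical data frame standing in for your project data (an assumption
# for this sketch); in your proposal this would be the data you loaded.
project_data <- data.frame(
  response  = c(10.2, 11.5, 9.8, 12.1),
  predictor = c(1.0, 2.0, 1.5, 2.5),
  group     = factor(c("a", "b", "a", "b"))
)

# glimpse() prints the number of rows and columns, then each variable
# with its type and its first few values.
glimpse(project_data)
```

The output gives the number of observations and the type of each variable, which you can refer to when describing the response and explanatory variables in the Analysis component.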

Note: Your proposal must be approved before you begin any analysis. Failure to turn in the proposal by the due date will result in a grade of 0 for the final project.

You are welcome to turn in the proposal early if you wish to get approval and thus start working on your analysis early.

Poster Session - December 15 2p - 5p

You will display your project on a poster and present it at the poster fair during the final exam period, December 15 2p - 5p. You must complete the final project and present your work at the poster fair in order to pass the course. (Please see the Course Policies page.)

Your poster will be graded based on the following criteria:

  • Consistency: Did you clearly answer the question of interest?
  • Clarity: Can the reader easily understand your analysis process and any sort of conclusions/arguments you make?
  • Relevancy: Did you use the appropriate statistical techniques to address your question? Was your analysis thorough (e.g. did you consider interactions in addition to main effects)?
  • Interest: Did you attempt to answer a challenging and interesting question rather than just calculating a lot of descriptive statistics and simple linear regression models?
  • Presentation: Is your poster designed and organized in a way that is neat and easy to follow?

The grading rubric for the poster session may be found here. In addition to the poster, you will submit the .Rmd file of your code and a PDF containing an analysis of assumptions for your final model on Sakai.

Peer/Self Evaluation

If you work with a group, you will fill out a survey about each team member’s contribution. In addition to assessing your team members’ contributions, you will be asked to describe in detail how you contributed to the project. You must complete a peer/self evaluation in order to receive credit for the peer/self evaluation portion of the grade. If a team member gets an average rating indicating they have done less than 20% of the work, the other portions of their project grade may be negatively impacted.

Extra Credit

One goal of this project is for you to gain experience in creating an academic poster. We will use one of the lab dates for a session in which you will learn tips on how to effectively present your work using a poster. For this session, we need a few project posters (about 6 in each lab) to use as examples.

This is not only a good opportunity to get constructive feedback about your poster; you can also earn up to 5 points of extra credit on your final project grade. It is also a great incentive to complete the vast majority of your STA 210 project before exams!

If you would like to earn extra credit:

  1. Fill out this form by Tuesday, November 13 at 11:59p.
  2. Electronically submit a draft of the poster before the lab session. More details about the draft poster submission will be sent to those who sign up for the extra credit. In short: your analysis should be at or very close to completion (the main data analysis, models, and conclusions complete), and your poster components should be completed to the point that they would be ready to present to an audience.

We will have a guest from Data and Visualization Services leading this special lab session, so please do not sign up unless you are able to complete your analysis and provide a poster draft for the lab session. The session will take place on December 3.

Resources for Data Visualization and Poster Design