STA 199 | Intro to Data Science

Intro to data science and statistical thinking. Learn to explore, visualize, and analyze data to understand natural phenomena, investigate patterns, model outcomes, and make predictions, and do so in a reproducible and shareable manner. Gain experience in data wrangling and munging, exploratory data analysis, predictive modeling, data visualization, and effectively communicating results. Work on problems and case studies inspired by and based on real-world questions and data. The course will focus on the R statistical computing language.

Effective 03-23-20: All in-person scheduled meetings and office hours will be switched to Zoom meetings via Sakai. All times listed are in Eastern Time.

Course info


Lecture     Social Sciences 136      Tue and Thu 10:05am - 11:20am


Lab 01      Old Chemistry 003       Fri 10:05am - 11:20am

Lab 02      Old Chemistry 003       Fri 11:45am - 1:00pm

Lab 03      Old Chemistry 003       Fri 1:25pm - 2:40pm

Teaching team and office hours

Instructor Shawn Santo   Mon 8:45am - 9:45am, Wed 4:30pm - 5:30pm Old Chemistry 207A
TAs Salvador Arellano   Wed 11:30am - 1:30pm Old Chemistry 203B
Max Bartlett   Sun 3:00pm - 5:00pm Old Chemistry 203B
Kate Chen   Tue and Thu 1:30pm - 2:30pm Old Chemistry 203B
Bin Han   Fri 3:00pm - 5:00pm Old Chemistry 025
Frances Hung   Tue 11:30am - 1:30pm Old Chemistry 025
Becky Tang   Mon 4:30pm - 5:30pm, Wed 3:30pm - 4:30pm Old Chemistry 203B


All books are freely available online. Hardcopies are also available for purchase.

R for Data Science Grolemund, Wickham O'Reilly, 1st edition, 2016
OpenIntro Statistics Diez, Barr, Çetinkaya-Rundel CreateSpace, 4th Edition, 2019
Introductory Statistics with Randomization and Simulation Diez, Barr, Çetinkaya-Rundel CreateSpace, 1st Edition, 2014


You should bring a fully-charged laptop, tablet with keyboard, or comparable device to every lecture and lab session.

This course has achieved Duke’s Green Classroom Certification. The certification indicates that the faculty member teaching this course has taken significant steps to green the delivery of this course. Your faculty member has completed a checklist indicating their common practices in areas of this course that have an environmental impact, such as paper and energy consumption. Some common practices implemented by faculty to reduce the environmental impact of their course include allowing electronic submission of assignments, providing online readings and turning off lights and electronics in the classroom when they are not in use. The eco-friendly aspects of course delivery may vary by faculty, by course and throughout the semester. Learn more at