Workshops

The following workshops will be held at Duke; participants from all schools are welcome. See here for visitor parking information.

Dinner will be served at all workshops.

Introduction to R

by John Little & John Herndon

  Sign up
  03/27/2018
  6 - 8pm
  Perkins 217

A gentle introduction to the basics of R using RStudio. Learn about managing your R projects, wrangling data, and writing clear code using the Tidyverse collection of R packages. If time allows, students will also get brief introductions to visualization with ggvis and mapping with leaflet. Attendees will have the opportunity to supplement the materials covered in this workshop with free academic access to the interactive training at DataCamp.com. Please note that laptops with RStudio pre-installed are required for this workshop.

Working with large data

by Michael Akande

  Sign up
  03/29/2018
  6 - 8pm
  Bostock 127 (The Edge Workshop Room)

Tips and tricks for working with large datasets in R. Laptops with R and RStudio required.

Visualization in R using ggplot2

by Angela Zoss

  Sign up
  04/02/2018
  6 - 8pm
  Bostock 127 (The Edge Workshop Room)

In this workshop we will focus on ggplot2, a library for R that creates clear and well-designed visualizations and that plays well with other tidyverse packages. While prior experience with ggplot2 and with other tidyverse packages is not required, some basic familiarity with R is expected. Please consider attending (or viewing a recording of) our Introduction to R workshop before attending this workshop on ggplot2. We will use RStudio and R Markdown files for all exercises. Laptops are required. Please make sure you come with RStudio and the tidyverse package installed. You may also want to install the knitr package to be able to compile the entire R Markdown file.

Introduction to Amazon Web Services (AWS)

by Brian Beach

  Sign up
  04/03/2018
  6 - 8pm
  Perkins 217

Amazon Web Services (AWS) offers reliable, scalable, and inexpensive cloud computing services, with a comprehensive set of tools to handle every step of the analytics process chain: data warehousing, business intelligence, batch processing, stream processing, machine learning, and data workflow orchestration. This workshop will introduce you to the tools available to help you during DataFest. Note that AWS is offering credits to DataFest participants who want to complete their analysis in the cloud.

Easy Interactive Charts and Maps with Tableau

by Eric Monson

  Sign up
  04/04/2018
  6 - 8pm
  Bostock 023

Tableau Public (available for both Windows and Mac) is free software that allows individuals to quickly and easily explore their data with a wide variety of visual representations, as well as create interactive web-based visualization dashboards. This workshop will focus on using Tableau Public to create data visualizations, starting with an overview of how the program thinks about data, common data manipulation and loading, and the terminology used. Activities will include a sample data visualization and mapping project, which will give people hands-on experience using Tableau’s basic chart types and dashboard creation tools. We will also discuss publishing to the Tableau Public web server and related services and tools, like the full Tableau Desktop application (free for full-time students). Laptops with Tableau Public pre-installed are required for this workshop.

Machine Learning and Data Mining

by Liz Lorenzi & Isaac Levine

  Sign up
  04/05/2018
  7 - 9pm
  Perkins 217

Introduction to machine learning and data mining algorithms. Laptops with R and RStudio required.

How Cloud Computing Empowers a Data Scientist

by David Giard (Microsoft)

  No sign up needed
  04/08/2018
  8 - 9pm
  Broadhead 068

Over the last few years, we have seen an explosion in data science, artificial intelligence, and machine learning. The rise of cloud computing has been a major factor in this explosion. Cloud platforms, such as Microsoft Azure, enable users to quickly spin up clusters of computers to perform the high-performance calculations required by data science problems. Azure offers a number of tools for building data science solutions, from a drag-and-drop interface to Jupyter Notebooks as a service to a virtual machine pre-configured with powerful machine learning tools. In this session, you will learn how the cloud has impacted data science, which Azure tools are most useful for a data scientist, and when it is appropriate to use each tool.