ASA DataFest 2021 was held online April 9 - 11.
What is DataFest?
ASA DataFest™ is a data analysis competition where teams of up to five students attack a large, complex, and surprise data set over a weekend. Your job is to represent your school by finding and communicating insights into these data. The teams that impress the judges will win prizes as well as glory for their school. Everyone will have a great experience, lots of food, and fun!
ASA DataFest™ is also a great opportunity to gain experience that employers are looking for. Having worked on a data analysis problem at this scale will help make you a good candidate for any position that involves analysis and critical thinking, and it will provide a concrete example to demonstrate your experience during interviews.
ASA DataFest™ at Duke is organized by the Department of Statistical Science at Duke University, and co-hosted by the Departments of Statistics and Operations Research at UNC and Statistics at NCSU.
While ASA DataFest™ is a competition, the main goal of the event is to promote collaboration. Here are some testimonials from past participants:
It was a great experience, with a fun and interesting challenge. One of my favorite parts is how varied the presentations and projects from each team are. I love learning about ways in which others looked at and analyzed the same problem/data.
DataFest was an awesome experience. To me, the best part was working in a team of friends that I usually hung out with, but had not had a chance to work together intensively on a project. We enjoyed analyzing the situations and solving problems together for our client. At the end of the day, we just got to know each other better. It was also fun to interact with other teams to explore other approaches while keeping in mind that we were in competition. The fact that we were given a huge amount of data really challenges us to come up with creative and practical approaches. Another important part was the presentation. Every team had to explain well to the judges their objectives and solutions. Our team won the Best Visualization award which is really awesome. Lastly, the food was fantastic.
Past DataFests at Duke
DataFest 2020 - COVID-19 Virtual Data Challenge
Goal: Explore data to understand a societal impact of the COVID-19 pandemic other than its direct health outcomes. What have been the effects on pollution levels, transportation levels, or working from home? Has there been a change in the number of posts on TikTok? What is the impact on online education? The focus is up to you!
DataFest 2019 - Data source: Canadian National Women's Rugby Team
Goal: How do we quantify the role of fatigue and workload in a team’s performance in Rugby 7s? How reliable are the subjective wellness data? Should the quality of the opponent or the outcome of the game be considered when examining fatigue during a game? Can widely used measurements of training load and fatigue be improved? How reliable are GPS data in quantifying fatigue?
DataFest 2018 - Data source: Indeed
Goal: What advice would you give a new high school student about what major to choose in college? How does Indeed's data compare to official government data on the labor market? Can it be used to provide good economic indicators?
DataFest 2017 - Data source: Expedia
Goal: How do visitors' searches relate to the choices of hotels booked or not booked? What role do external factors play in hotel choice?
Expedia provided DataFesters with data from search results from millions of visitors around the world who were interested in traveling to destinations all over the world. The data were in two files, one of which included data collected on search results from visitors' sessions, and another which contained detailed information about the destinations that visitors searched for.
DataFest 2016 - Data source: Ticketmaster
Goal: How can site visits be converted to ticket sales, and how can Ticketmaster identify "true fans" of an artist or band?
Data consisted of three sets. One included events from the last 12 months that tracked customer travel through the website. Another provided information about advertising campaigns on Google, and the third included data on the events themselves.
DataFest 2015 - Data source: Edmunds.com
Goal: Detect insights into the process of car shopping that can help make the process easier for customers.
Data consisted of visitor 'pathways' through a website that helps customers configure car features and shop for cars. Five data files were linked by a customer key and included data about the customer, about his or her visits to the webpage, and, when applicable, about the car purchased and the dealership where the car was purchased.
DataFest 2014 - Data source: GridPoint
Goal: Help understand how customers can best save money and energy.
Data consisted of a random sample of customers, with five-minute aggregates over a year of energy consumption that was then aggregated across important features of the commercial properties, as well as supporting climate and location data.
DataFest 2013 - Data source: eHarmony
Goal: Help understand what qualities people look for in prospective dates.
The DataFest students worked with a large sample of prospective matches. For each customer, data were provided on his or her preferences, as well as four matches, their preferences, and information about whether parties contacted one another.
DataFest 2012 - Data source: Kiva.com
Goal: Help understand what motivates people to lend money to developing-nation entrepreneurs and what factors are associated with repaying these loans.
Several data sets were provided, including characteristics of lenders and borrowers and loan pay-back data.