What can data science tell us about past, current, and future patterns of infection?

University of Toronto - May 2017

What is the Big Data Challenge?


The Big Data Challenge (BDC) has STEM students undertake independent research projects that tackle real-world problems with data science tools. For the month-long duration of the competition, teams of 3 or 4 students are provided with datasets, data science tools, and sample challenges. Teams can then choose either to undertake exploratory data analysis or to test a hypothesis about their data, with guidance from mentors in industry and academia.

Towards the end of the competition, teams produce scholarly reports about their research, which a panel of judges examines to select finalists. The finalists then present their work at the final project presentations.

Underneath its 'competition' exterior, the BDC is a collaborative learning experience. It is geared towards STEM students who plan to get involved in research in the long term and who want to learn vital computational skills for working with large amounts of data.

Competition Details

The BDC, held in association with IBM Big Data University, will take place in the month of May, with the morning and afternoon of May 1st marking the BDC's Orientation Day. For the remainder of the month, teams will work independently on their projects at their own pace. Participants will also be invited to ongoing workshops on data science tools, as well as talks from leaders in industry and academia.

Students can register either online prior to May 1st or at Orientation Day itself; the fee per student is $25. However, please note that the event is only open to U of T students enrolled in an undergraduate program. Please also make sure to sign up on this form to complete your registration.

In the meantime, keep an eye on this page - we'll be updating it as more information is confirmed!


This year's BDC revolves around the theme of epidemiology, with a particular focus on the epidemiology of viruses. Below are related challenge sets that teams can build on to develop their own projects.

What can open epidemiological data from the Government of Canada and the CDC tell us about patterns of infection?
What can weather data in conjunction with transmission/prevalence data tell us about the geographical/meteorological aspects of disease spread?
What can text mining on scientific communications and research within virus epidemiology tell us about both the current state and potential future of research?
What can mining Altmetric impact data tell us about patterns of research in virus epidemiology?

Judging Criteria

Idea proposed and approach to achieve it
Techniques for data and statistical analysis and methods used
Results, discussion and compelling arguments
Quality of the final report produced: language, figures, plots


Orientation Day

Bahen Centre, Room 1190 - May 1st, 2017
09:00 Breakfast + Coffee
09:15 Intro + Welcome + Breakfast
09:45 Keynote by Dr. Richard Summerbell, Dalla Lana School of Public Health
10:00 Introduction to Challenge Format, Tools, and Datasets
10:45 Coffee break
11:00 Resources/support available through U of T Libraries - Leanne Trimble, U of T Data and Statistics Librarian
11:25 Introduction to IBM Big Data University
11:40 IBM Data Science Experience Demo

Competition Schedule

May 1st: Orientation Day
May 2nd - 26th: Teams work independently on projects + Ongoing Workshops/Talks
May 2nd: Clustering in R (IBM BDU Workshop)
May 4th: Classification in R (IBM BDU Workshop)
May 15th: Editing 101: Learn To Be Your Own Editor (STEM Fellowship Scholarly Writing Workshop)
May 26th: Project Report Submission Deadline
May 31st: Final Project Presentations + Awards Ceremony


When and where is the Orientation Day happening?

Orientation Day will take place in Room 1190 at the Bahen Centre for Information Technology on the morning of May 1st.

The competition is a month long. Do I need to be on campus for all of it?

Nope - aside from Orientation Day and our ongoing workshops/talks, any and all work towards your projects can take place whenever and wherever your team wants during the month of May. Your team may choose to have in-person meetings or collaborate remotely.

I don't have a team yet - what do I do?

We'll do our best to help place you in a team at Orientation Day! We also encourage the formation of teams on the Orientation Day Facebook event page.

Where can I purchase a ticket?

Tickets cost $25 and can be purchased either on Eventbrite or in person at Orientation Day (if any are still available by then). Please also make sure to register on this form for team-building/documentation purposes.

My team's project doesn't follow the suggested challenges. Is that okay?

Absolutely! Those are there to help guide your brainstorming, but we allow and even actively encourage any and all ideas as long as the project still relates to the theme of epidemiology.

How many students are allowed per team?

Teams can have 3 or 4 people; we highly recommend 4. Please note that all participants must be U of T undergraduates!

Do I need to know how to code?

Previous coding experience is not required! The very point of the competition is to learn programming skills.

What are the prizes for the winners?

Coming soon!

Are there additional fees for the ongoing workshops?

All workshops over the course of the competition are offered at no extra cost beyond the $25 ticket price, and are open only to BDC participants.

