The Big Data Challenge (BDC) invites STEM students to undertake independent research projects that tackle real-world problems with data science tools. For the month-long duration of the competition, teams of 3 or 4 students are provided with datasets, data science tools, and sample challenges; teams can then choose either to undertake exploratory data analysis or to test a hypothesis about their data, with guidance from mentors in industry and academia.
Teams then produce scholarly reports about their research, which a panel of judges examines to select finalists; the finalists present their work at the end of the competition.
Underneath its 'competition' exterior, the BDC is a collaborative learning experience. It is geared towards STEM students who plan to get involved in research over the long term and who want to learn vital computational skills for working with large amounts of data.
The BDC, in association with IBM Big Data University, will take place in the month of May, with the morning and afternoon of May 1st marking the BDC's Orientation Day. For the remainder of the month, teams will work independently on their projects at their own pace; participants will also be invited to ongoing workshops on data science tools, as well as talks from leaders in industry and academia.
Students can register either online prior to May 1st or at Orientation Day itself; the fee per student is $25. However, please note that the event is only open to U of T students enrolled in an undergraduate program. Please also make sure to sign up on this form to complete your registration.
In the meantime, keep an eye on this page - we'll be updating it as more information is confirmed!
This year's BDC revolves around the theme of epidemiology, with a particular focus on the epidemiology of viruses. Below are related challenge sets that teams can use as starting points for developing their own projects.
What can open epidemiological data from the Government of Canada and the CDC tell us about patterns of infection?
What can weather data in conjunction with transmission/prevalence data tell us about the geographical/meteorological aspects of disease spread?
What can text mining on scientific communications and research within virus epidemiology tell us about both the current state and potential future of research?
What can mining Altmetric impact data tell us about patterns of research in virus epidemiology?
Idea proposed and approach to achieve it
Techniques for data and statistical analysis and methods used
Results, discussion and compelling arguments
Quality of the final report produced: language, figures, plots
| Time | Event |
|---|---|
| 09:00 | Breakfast + Coffee |
| 09:15 | Intro + Welcome + Breakfast |
| 09:45 | Keynote by Dr. Richard Summerbell, Dalla Lana School of Public Health |
| 10:00 | Introduction to Challenge Format, Tools, and Datasets |
| 11:00 | Resources/support available through U of T Libraries - Leanne Trimble, U of T Data and Statistics Librarian |
| 11:25 | Introduction to IBM Big Data University |
| 11:40 | IBM Data Science Experience Demo |

| Date | Event |
|---|---|
| May 1st | Orientation Day |
| May 2nd - 26th | Teams work independently on projects + Ongoing Workshops/Talks |
| May 2nd | Clustering in R (IBM BDU Workshop) |
| May 4th | Classification in R (IBM BDU Workshop) |
| May 15th | Editing 101: Learn To Be Your Own Editor (STEM Fellowship Scholarly Writing Workshop) |
| May 26th | Project Report Submission Deadline |
| May 31st | Final Project Presentations + Awards Ceremony |
Orientation Day will take place in Room 1190 at the Bahen Centre for Information Technology on the morning of May 1st.
Nope - aside from Orientation Day and our ongoing workshops/talks, any and all work towards your projects can take place whenever and wherever your team wants during the month of May. Your team may choose to have in-person meetings or collaborate remotely.
We'll do our best to help place you in a team at Orientation Day! We also encourage the formation of teams on the Orientation Day Facebook event page.
Tickets cost $25 and can be purchased either on Eventbrite or in person at Orientation Day (if any are still available by then). Please also make sure to register on this form for team-building/documentation purposes.
Absolutely! Those are there to help guide your brainstorming, but we allow and even actively encourage any and all ideas as long as the project still relates to the theme of epidemiology.
Teams can be 3 or 4 people: we highly recommend 4. Please note that participants must all be U of T undergraduates!
Previous coding experience is not required! The very point of the competition is to learn programming skills.
All workshops over the course of the competition are offered at no extra cost beyond the $25 ticket price, and are open only to BDC participants.