At the start of the competition, we will provide 5 data sets: 3 development data sets and 2 challenge data sets. Each data set will be divided into a training portion and a test portion, as specified on the Data Format page. Student performance labels will be withheld for the test portion of each data set. The competition task is to develop a learning model based on the challenge and/or development data sets, train this model on the training portion of the challenge data sets, and then accurately predict student performance in the test portions. At the end of the competition, the winner will be determined by each model's performance on an unseen portion of the challenge test sets. We will evaluate only each team's last submission for the challenge data sets.
What do these data look like? Visit the Data Format page to learn more, then download the data.
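As a rough illustration of the train-then-predict workflow described above, here is a minimal baseline sketch in Python. It is not the organizers' method or the official data format: the `(student, label)` row layout, the function names, and the toy values are all assumptions made for this example. The real columns are defined on the Data Format page.

```python
from collections import defaultdict


def fit_student_means(train_rows):
    """Learn each student's mean label and a global mean from labeled rows.

    train_rows: iterable of (student_id, label) pairs. This layout is a
    hypothetical simplification, not the competition's actual format.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for student, label in train_rows:
        totals[student] += label
        counts[student] += 1
    global_mean = sum(totals.values()) / sum(counts.values())
    student_means = {s: totals[s] / counts[s] for s in totals}
    return student_means, global_mean


def predict(test_students, student_means, global_mean):
    """Predict a score per test row; unseen students get the global mean."""
    return [student_means.get(s, global_mean) for s in test_students]


# Toy, made-up data: labels are available only for the training portion.
train = [("s1", 1), ("s1", 0), ("s2", 1), ("s2", 1)]
test = ["s1", "s2", "s3"]  # labels withheld; "s3" never appears in training

student_means, global_mean = fit_student_means(train)
predictions = predict(test, student_means, global_mean)
print(predictions)
```

A baseline like this is only a starting point for checking a submission pipeline end to end; a competitive model would use far more of the information in each row.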
Call for participants
Registration opens at 2pm EDT, development data sets available
Competition starts at 2pm EDT, challenge data sets available
Competition ends at 11:59pm EDT
Fact sheet and team composition info due by 11:59pm EDT
Winners announced
KDD Cup Workshop
KDD Cup is the annual Data Mining and Knowledge Discovery competition organized by the ACM Special Interest Group on Knowledge Discovery and Data Mining (KDD), the leading professional organization of data miners.
This year's competition is hosted by PSLC DataShop. Learn more about the organizers and sponsors.
Contact us via email.