Download the development and challenge data sets.
Development data sets are provided for familiarizing yourself with the format and developing your learning model. Using them is optional, and your predictions on these data sets will not count toward determining the winner of the competition. Development data sets differ from challenge sets in that the actual student performance values for the prediction column, "Correct First Attempt", are provided for all steps—see the file ending in "_master.txt".
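Because the development sets include the true "Correct First Attempt" values, you can score your own predictions locally before working on the challenge sets. The following is a minimal sketch, not an official scoring script. It assumes the files are tab-delimited, that each row carries a unique identifier in a column named "Row", and that root mean squared error is a reasonable measure to check yourself against. All three are assumptions for illustration; confirm the actual layout on the Data Format page.

```python
import csv
import io
import math

TARGET = "Correct First Attempt"  # prediction column named in the text above


def read_target(handle, key="Row", delimiter="\t"):
    """Map each row's key to its target value, skipping rows with a blank target.

    The key column name and the tab delimiter are assumptions, not confirmed
    by this page.
    """
    reader = csv.DictReader(handle, delimiter=delimiter)
    values = {}
    for row in reader:
        raw = (row.get(TARGET) or "").strip()
        if raw:
            values[row[key]] = float(raw)
    return values


def rmse(predicted, actual):
    """Root mean squared error over the rows present in both mappings."""
    shared = predicted.keys() & actual.keys()
    if not shared:
        raise ValueError("no rows in common between predictions and master")
    total = sum((predicted[k] - actual[k]) ** 2 for k in shared)
    return math.sqrt(total / len(shared))


# Tiny in-memory stand-ins for a "_master.txt" file and a predictions file.
master = io.StringIO("Row\tCorrect First Attempt\n1\t1\n2\t0\n")
mine = io.StringIO("Row\tCorrect First Attempt\n1\t0.8\n2\t0.4\n")

score = rmse(read_target(mine), read_target(master))
print(f"RMSE on the sample rows: {score:.4f}")
```

With real files, replace the `io.StringIO` objects with handles from `open(path, newline="")`.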
Predictions on challenge data sets will count toward determining the winner of the competition. In each of the two challenge data sets, you'll be asked to provide predictions in the column "Correct First Attempt" for a subset of the steps. For more information on which steps these will be, see the bottom of our Data Format page.
To unpack the challenge archive, run: tar xvzf kddcup_challenge.tar.gz
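If you would rather see what the archive contains before extracting it, Python's standard tarfile module can list the members. This is only a sketch; the helper name list_archive is hypothetical, and the file names inside the archive are not stated on this page.

```python
import tarfile


def list_archive(path):
    """Return the member names of a gzip-compressed tar archive without extracting it."""
    with tarfile.open(path, "r:gz") as archive:
        return archive.getnames()
```

For example, `list_archive("kddcup_challenge.tar.gz")` returns the paths stored in the downloaded archive, assuming it sits in the current directory.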
For a description of the format of the data, see the Data Format page.
Contact us via email.