KDD 2011 Workshop: Knowledge Discovery in Educational Data

August 21, 2011

Held as part of the 17th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2011) in San Diego, CA, August 21-24.

Following up on the success of the 2010 KDD Cup competition, this workshop seeks to bring the cutting-edge data mining community together with the education community. We solicit papers addressing problems such as predicting future student performance and learning the underlying structure of student knowledge from large educational datasets. The 2010 KDD Cup competition showed that many traditional data mining techniques can be applied successfully to educational data to improve prediction. This workshop will be a venue to continue that research and to explore further the nature of educational data and the factors that matter in determining student knowledge.


The first objective of this workshop is to explore the opportunities for knowledge discovery in educational data. Educational data is becoming increasingly rich as more educational systems go online and collect large amounts of data. Repositories such as the Pittsburgh Science of Learning Center DataShop (http://pslcdatashop.org) contain a large number of available datasets that present tremendous research opportunities for the larger SIGKDD community. These datasets come primarily from tutors focused on STEM (Science, Technology, Engineering, and Mathematics) topics such as Algebra, Geometry, Physics, and Chemistry.

The second objective is to provide a bridge to connect the relatively new educational data mining community to the SIGKDD community. As seen in the 2010 KDD Cup competition, there are a number of interesting educational data mining problems that could benefit from the methods discussed and presented at SIGKDD.

Workshop Schedule

Time (duration): session

1:00pm (15 min): Welcome and Introduction

1:15pm (15 min): A Learning Design Recommendation System Based on Markov Decision Processes
Guillaume Durand, Francois LaPlante, and Rita Kop

1:30pm (15 min): Anticipating Teachers' Performance
Joana Barracosa and Claudia Antunes

1:45pm (15 min): Improving Pedagogy by Analyzing Relevance and Dependency of Course Learning Outcomes
Thomas Devine, Mahmood Hossain, Erica Harvey, and Andreas Baur

2:00pm (20 min): The Sum is Greater than the Parts: Ensembling Student Knowledge Models in ASSISTments
Sujith Gowda, Ryan S.J.D. Baker, Zachary Pardos and Neil Heffernan

2:20pm (20 min): Multi-relational Matrix Factorization Models for Predicting Student Performance
Nguyen Thai-Nghe, Lucas Drumond, Tomas Horvath and Lars Schmidt-Thieme

2:40pm (20 min): Response Tabling - A simple and practical complement to Knowledge Tracing
Qing Yang Wang, Paul Kehrer, Zachary Pardos and Neil Heffernan

3:00pm (15 min): Break

3:15pm (15 min): Towards Identifying Teacher Topic Interests and Expertise within an Online Social Networking Site
Sen Cai, Siddharth Jain, Yu-Han Chang and Jihie Kim

3:30pm (15 min): An Analysis of Response Time Data for Improving Student Performance Prediction
Xiaolu Xiong, Zachary Pardos and Neil Heffernan

3:45pm (15 min): From Data to Actionable Knowledge: A Collaborative Effort with Educators
Sharath Srinivas, Eric Hamby, Robert Lofthus, Edward Caruthers, Jan Barett and Erin Ells

4:00pm (20 min): Analyzing the language evolution of a science classroom via a topic model
Mohammad Khoshneshin, Mohammad Ahmadi Basir, Padmini Srinivasan, Nick Street and Brian Hand

4:20pm (20 min): Using Graphical Models to classify Dialogue Transition in Online Q&A Discussion
Soo Won Seo, Jeon-Hyung Kang, Joanna Drummond and Jihie Kim

4:40pm (20 min): Comparative Action Sequence Analysis with Hidden Markov Models and Sequence Mining
John Kinnebrew and Gautam Biswas

5:00pm (15 min): Towards Automatic Hint Generation for a Data-Driven Novice Programming Tutor
Wei Jin, Lorrie Lehmann, Matthew Johnson, Michael Eagle, Behrooz Mostafavi, Tiffany Barnes and John Stamper

5:15pm (15 min)

Topics of Interest

We welcome papers describing original work. Areas of interest include but are not limited to:

Large Datasets

The following large educational datasets have been made available for easy download:

Important Dates

Submission Types

There are two types of submission:

All submissions should follow the ACM single-column formatting guidelines (MS Word, LaTeX).

Submission Instructions

Submission is managed through EasyChair. You will need to register first, which is free and quick. To reach the submission page, go to: http://www.easychair.org/conferences/?conf=kddined2011.

Authors of accepted papers will be invited to submit extended versions of their papers to an upcoming special issue of the Journal of Educational Data Mining devoted to the KDD 2011 Knowledge Discovery in Educational Data workshop.

Workshop Committee
John Stamper Carnegie Mellon University
Kenneth R. Koedinger Carnegie Mellon University
Geoff Gordon Carnegie Mellon University
Ryan Baker Worcester Polytechnic Institute
Alexandru Niculescu-Mizil NEC Laboratories America
Chih-Jen Lin National Taiwan University
Philip Pavlik Carnegie Mellon University
Ted Carmichael University of North Carolina at Charlotte
Neil Heffernan Worcester Polytechnic Institute
Zach Pardos Worcester Polytechnic Institute
Steve Ritter Carnegie Learning, Inc.
Luo Si Purdue University
Gautam Biswas Vanderbilt University


Contact: John Stamper, john AT stamper DOT org