Datasets and Data Analytics
A primary mission of SPLICE is to support infrastructure for collecting educational datasets, making them available to researchers, and providing appropriate analysis tools.
The SPLICE Dataset Catalog provides a collection of datasets available from various sources. We welcome contributions of datasets!
The dataset catalog currently includes the following sources:
- Datashop@CMU
- A compilation of open-source datasets in computing education, curated by the "Where is the data? Finding and reusing datasets in computing education" CompEd '23 working group. The working group aims to make research data more accessible and to encourage open data practices in the computing education research (CER) community. For more information, please refer to the working group's paper: Kiesler, Natalie, John Impagliazzo, Katarzyna Biernacka, Amanpreet Kapoor, Zain Kazmi, Sujeeth Goud Ramagoni, Aamod Sane, Keith Tran, Shubbhi Taneja, and Zihan Wu. "Where's the Data? Exploring Datasets in Computing Education." In Proceedings of the ACM Conference on Global Computing Education Vol 2, pp. 209-210. 2023.
- Demirtas, Fowler, & Cunningham (2024). Reexamining Learning Curve Analysis in Programming Education: The Value of Many Small Problems. In Proceedings of the 17th International Conference on Educational Data Mining.
- Huang et al. (2023). Supporting skill integration in an intelligent tutoring system for code tracing.
- Rivers et al. (2016). Learning Curve Analysis for Programming: Which Concepts do Students Struggle With? In Proceedings of ICER 2016.