Machine Learning Approaches for Early Student Performance Prediction in Programming Education
Intelligent recommender systems are essential for identifying at-risk students and personalizing learning through tailored resources. Accurate prediction of student performance enables these systems to deliver timely interventions and data-driven support. This paper applies machine learning models to predict final exam grades in a university-level programming course, leveraging multi-modal student data to improve prediction accuracy. A recent raw dataset of students enrolled in the course across 36 class sections in the Fall 2024 and Winter 2025 terms was first preprocessed. The data was collected up to one month before the final exam. From this data, a comprehensive set of features was engineered, including student background, assessment grades and completion times, digital learning interactions, and engagement metrics. Building on this feature set, six machine learning prediction models were first developed using data from the Fall 2024 term alone. Training and testing were both conducted on this dataset using cross-validation combined with hyperparameter tuning. The XGBoost model performed strongly, with an accuracy exceeding 91%. To assess generalizability, all models were then retrained on the complete Fall 2024 dataset and evaluated on an independent dataset from Winter 2025, where XGBoost achieved the highest accuracy, exceeding 84%. Feature importance analysis revealed that the midterm grade and the average completion duration of lab assessments are the most influential predictors. This data-driven approach empowers instructors to proactively identify and support at-risk students, enabling adaptive learning environments that deliver personalized learning and timely interventions. [ABSTRACT FROM AUTHOR]
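The two-stage evaluation described in the abstract (cross-validated hyperparameter tuning within one term, then retraining on that full term and testing on a later one) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the data is synthetic, a small hand-written k-nearest-neighbours classifier stands in for XGBoost and the other models, only two features are used (normalized midterm grade and normalized average lab completion duration), and every function name here is hypothetical.

```python
import random
from collections import Counter


def make_term(n, seed):
    """Generate synthetic students for one term (illustrative only, not real data)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        midterm = rng.random()    # assumed: midterm grade normalized to [0, 1]
        lab_time = rng.random()   # assumed: avg lab completion duration, normalized
        # Assumed relationship: higher midterm and faster labs make passing likelier.
        score = midterm - 0.5 * lab_time + rng.gauss(0, 0.15)
        rows.append(((midterm, lab_time), int(score >= 0.25)))
    return rows


def knn_predict(train, x, k):
    """Majority vote among the k training rows closest to x (squared Euclidean)."""
    nearest = sorted(
        train,
        key=lambda row: (row[0][0] - x[0]) ** 2 + (row[0][1] - x[1]) ** 2,
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train, test, k):
    """Fraction of test rows whose label is predicted correctly."""
    return sum(knn_predict(train, x, k) == y for x, y in test) / len(test)


def cross_val_accuracy(rows, k, n_folds=5):
    """Mean validation accuracy over n_folds interleaved folds."""
    scores = []
    for i in range(n_folds):
        val = rows[i::n_folds]
        train = [r for j, r in enumerate(rows) if j % n_folds != i]
        scores.append(accuracy(train, val, k))
    return sum(scores) / n_folds


# Stage 1: tune the hyperparameter (k) by cross-validation on the first term only.
fall = make_term(300, seed=0)
cv_scores = {k: cross_val_accuracy(fall, k) for k in (1, 5, 15)}
best_k = max(cv_scores, key=cv_scores.get)

# Stage 2: "retrain" on the full first term, evaluate on an independent later term.
winter = make_term(200, seed=1)
holdout_acc = accuracy(fall, winter, best_k)

print(f"cross-validated accuracy by k: {cv_scores}")
print(f"selected k: {best_k}, hold-out accuracy: {holdout_acc:.3f}")
```

Because both synthetic terms are drawn from the same distribution, this sketch does not reproduce the kind of gap the abstract reports between within-term and cross-term accuracy; in real data, a later cohort can differ from the one used for tuning, which is why the second-stage check is worth running separately.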