Explainable AI-Driven Predictive Analytics Framework for Student Performance and Dropout Detection
Abstract
Traditional methods of identifying academically at-risk students often rely on manual assessment and retrospective analysis, both of which are time-consuming and offer no predictive capability. To overcome these limitations, this paper proposes a machine learning-based predictive analytics framework for student performance and dropout detection that incorporates Explainable Artificial Intelligence (XAI). The proposed method combines data preprocessing, feature selection, predictive modeling with XGBoost, and explainability with SHAP to produce accurate and interpretable predictions of students' academic performance. The study uses the “Predict Students' Dropout and Academic Success” dataset from the UCI Machine Learning Repository, which contains demographic, academic, financial, and institutional data for 4,424 students. Several machine learning algorithms, including Decision Tree, Support Vector Machine (SVM), Random Forest, and XGBoost, were applied and compared on accuracy, precision, recall, F1 score, and ROC-AUC. In the experiments, the proposed XGBoost model outperformed the other models, achieving an accuracy of 94.18%, precision of 93.74%, recall of 93.21%, F1 score of 93.47%, and ROC-AUC of 95.62%. In addition, SHAP-based explainability analysis identified curricular unit performance, admission grades, and tuition fee payment status as the most important factors influencing student academic outcomes and dropout. The proposed framework facilitates the identification of academically at-risk learners at an early stage and supports informed decision-making through transparent and interpretable analytics. The results suggest that the proposed system can help educational institutions improve student retention, strengthen academic support mechanisms, and reduce dropout through proactive intervention.
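For readers unfamiliar with the five evaluation metrics named above, the following minimal sketch shows how they could be computed for a binary dropout label. It is an illustration only and not the evaluation code used in this study: the binary encoding (1 = dropout, 0 = not dropout), the 0.5 decision threshold, the function names, and the toy labels and scores are all assumptions introduced here, and the toy values are not results from the dataset.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels, assuming 1 = dropout."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def evaluate(y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5) -> Dict[str, float]:
    """Compute accuracy, precision, recall, F1 score, and ROC-AUC from predicted scores."""
    # Hypothetical decision rule: predict dropout when the score reaches the threshold.
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "roc_auc": roc_auc(y_true, scores),
    }


if __name__ == "__main__":
    # Made-up labels and model scores for eight students (illustration only).
    toy_labels = [1, 1, 1, 0, 0, 0, 0, 1]
    toy_scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]
    for name, value in evaluate(toy_labels, toy_scores).items():
        print(f"{name}: {value:.4f}")
```

Accuracy, precision, recall, and F1 score depend on the chosen threshold, whereas ROC-AUC depends only on how the scores rank positive cases against negative ones, which is one reason the two kinds of metric are usually reported together.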