Machine Learning Model Optimization and Interpretability Analysis for Classifying Student Stress Levels

Samuel Wijayadi Sugiharto, Yessica Nataliani

Abstract


This study compares the performance of three classification algorithms for predicting student stress levels: Logistic Regression, K-Nearest Neighbors (KNN), and Support Vector Machine (SVM). The data come from the Student Lifestyle Dataset in the Kaggle repository, which contains 2,000 records with eight student lifestyle features. Each algorithm was evaluated in four experimental scenarios: a baseline model, class balancing with the Synthetic Minority Oversampling Technique (SMOTE), feature selection with Recursive Feature Elimination (RFE), and hyperparameter tuning. Performance was measured by accuracy, precision, recall, and F1-score, and the best-performing model was interpreted using SHAP (SHapley Additive exPlanations). Combining data balancing, feature selection, and hyperparameter optimization with the SVM algorithm substantially improved performance, achieving an accuracy of 0.98, precision of 0.96, recall of 0.98, F1-score of 0.97, and a computation time of 0.011 seconds. The interpretability analysis showed that lifestyle factors such as study duration and sleep duration had the most dominant influence on stress levels. These results indicate that the integrated optimization strategy supports fast and accurate detection of student stress levels.
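The class-balancing step named in the abstract can be illustrated with a minimal sketch of the core SMOTE idea: a synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest same-class neighbours. This is not the authors' implementation. The function name `smote_oversample`, the toy feature values (hypothetical study hours and sleep hours), and the labels `"high"` and `"low"` are assumptions made only for illustration.

```python
import math
import random
from collections import Counter


def smote_oversample(X, y, k=5, seed=0):
    """Oversample every smaller class up to the size of the largest class
    by interpolating between same-class nearest neighbours (SMOTE idea)."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = [list(row) for row in X], list(y)
    for label, count in counts.items():
        members = [row for row, lab in zip(X, y) if lab == label]
        # Skip the majority class and classes too small to interpolate.
        if count == target or len(members) < 2:
            continue
        for _ in range(target - count):
            base = rng.choice(members)
            # Up to k nearest same-class neighbours, excluding the sample itself.
            neighbours = sorted(
                (m for m in members if m is not base),
                key=lambda m: math.dist(base, m),
            )[:k]
            other = rng.choice(neighbours)
            u = rng.random()  # interpolation weight in [0, 1)
            X_out.append([b + u * (o - b) for b, o in zip(base, other)])
            y_out.append(label)
    return X_out, y_out


# Hypothetical toy data: [study_hours, sleep_hours] per student.
X = [[7.5, 6.0], [8.0, 5.5], [6.9, 6.2], [7.8, 5.0], [3.0, 8.0], [2.5, 7.5]]
y = ["high", "high", "high", "high", "low", "low"]

X_bal, y_bal = smote_oversample(X, y, k=1)
print(Counter(y_bal))
```

In a full experiment, oversampling of this kind is normally applied to the training split only, so that the test set keeps its original class distribution. Established libraries such as imbalanced-learn provide a maintained SMOTE implementation.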

Keywords


classification; machine learning; model optimization; student stress; SHAP




DOI: https://doi.org/10.32520/stmsi.v15i5.6239



Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.