Predicting Students Performance Using Supervised Machine Learning Based on Imbalanced Dataset and Wrapper Feature Selection

Sadri Alija, Edmond Beqiri, Alaa Sahl Gaafar, Alaa Khalaf Hamoud


In learning environments such as schools and colleges, predicting student performance is a crucial task because it supports practical systems that, among other things, promote academic performance and prevent dropouts. Decision-makers and stakeholders in educational institutions seek tools that predict the number of courses a student will fail, since such tools can help identify and investigate the factors that lead to failure. In this paper, several supervised machine learning algorithms are investigated to find the optimal algorithm for predicting the number of failed courses per student. The imbalanced dataset is handled with the Synthetic Minority Oversampling Technique (SMOTE) to balance the representation of the final class. Two feature selection approaches are implemented to find the one that produces the most accurate prediction: a wrapper with Particle Swarm Optimization (PSO), which searches for the optimal subset of features, and Info Gain with a ranker, which ranks individual features by their correlation with the final class. The supervised algorithms implemented are Naïve Bayes, Random Forest, Random Tree, C4.5, LMT, Logistic, and the Sequential Minimal Optimization (SMO) algorithm. The findings show that the PSO-based wrapper with SMOTE outperforms the Info Gain filter with SMOTE and improves the performance of the algorithms. Random Forest outperforms the other supervised machine learning algorithms, with 85.6% in average TP rate and recall and 96.7% in area under the ROC curve.
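The oversampling step can be illustrated with a short sketch. This is not the authors' implementation, which is not shown here; it is a minimal pure-Python version of the SMOTE idea, assuming numeric features and Euclidean distance. The function names `smote` and `balance` and the toy data are hypothetical. As a simplification, base points are drawn at random rather than by iterating over every minority sample as the original algorithm does.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))  # simplification: random base point
        base = minority[i]
        # Sort the other minority samples by distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(X, y, k=5, seed=0):
    """Oversample every class up to the size of the largest class."""
    by_class = {}
    for row, label in zip(X, y):
        by_class.setdefault(label, []).append(row)
    target = max(len(rows) for rows in by_class.values())
    X_out, y_out = list(X), list(y)
    for label, rows in by_class.items():
        if len(rows) < target:
            extra = smote(rows, target - len(rows), k=k, seed=seed)
            X_out.extend(extra)
            y_out.extend([label] * len(extra))
    return X_out, y_out


# Hypothetical toy data: five samples of class 0, two of class 1.
X = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 2], [5, 5], [6, 5]]
y = [0, 0, 0, 0, 0, 1, 1]
X_bal, y_bal = balance(X, y)
print(y_bal.count(0), y_bal.count(1))
```

Each synthetic point lies on the line segment between two existing minority samples, so the minority class is enlarged without duplicating rows exactly. In an evaluation setting, oversampling would normally be applied to the training folds only, so that synthetic points do not leak into the test data.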



Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.