Predicting Stages of Liver Cirrhosis Using Data Mining and Machine Learning Techniques

Duaa S. Ali, Maalim Aljabery

Abstract


Liver cirrhosis often results from the prolonged, persistent progression of chronic liver disorders and is a major cause of death worldwide. Early diagnosis and identification of cirrhosis are essential for halting the disease's progression and preventing the complete destruction of liver tissue. This paper aims to build an intelligent automated system that predicts the stages of cirrhosis using Machine Learning (ML) algorithms, including Random Forest (RF), Extra Trees (ET), and Support Vector Machine (SVM). The dataset used in this research is sourced from the Zenodo website, which is linked to GitHub; it is publicly accessible, and this was our first use of it. Data mining techniques were also applied to analyze the data before prediction. Because the dataset's classes are considerably imbalanced, we applied the Synthetic Minority Oversampling Technique (SMOTE) to mitigate bias in the machine learning models. The newly proposed models apply the Chi-Square and Recursive Feature Elimination with Cross-Validation (RFECV) feature selection techniques together with the RF and SVM classifiers (RF-RFECV, SVM-RFECV). The experimental findings demonstrate that the Extra Trees model using the Chi-Square feature selection method (ET-Chi-Square) achieved the highest accuracy, 93.87%. It also obtained recall, F1-score, and precision values of 94% each, and an Area Under the Curve (AUC) of 99%. Our method outperformed previous related research.
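
The oversampling step mentioned in the abstract can be illustrated with a minimal sketch of the core SMOTE idea: each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its nearest minority-class neighbours. This is not the authors' implementation; the function name smote_oversample, its parameters, and the toy data are assumptions made for illustration, and a real pipeline would more likely rely on an established library.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic samples from one minority class.

    minority: list of equal-length numeric feature vectors.
    Hypothetical helper for illustration, not the paper's code.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than samples
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        # Pick one of the k nearest neighbours at random.
        neighbour = minority[rng.choice(others[:k])]
        # Interpolate: gap in [0, 1) places the point between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy data (assumed): 4 minority samples to be balanced against 10 majority samples.
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
majority_count = 10
new_points = smote_oversample(minority, majority_count - len(minority), k=2)
balanced_minority = minority + new_points
print(len(balanced_minority))
```

Because every synthetic point is an interpolation between two real minority samples, it stays within the region the minority class already occupies. In an evaluation like the one described, oversampling is normally applied to the training split only, so that synthetic samples do not leak into the test data.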



DOI: https://doi.org/10.31449/inf.v48i21.6752

This work is licensed under a Creative Commons Attribution 3.0 License.