Comparison of Machine Learning Algorithms for Predicting Thyroid Disorders in Diabetic Patients
Abstract
Machine learning (ML), a subfield of artificial intelligence (AI), has been applied successfully to disease diagnosis in healthcare. Thyroid disorders and diabetes are two of the most prevalent chronic diseases, and they are closely interconnected because both affect the regulation of many physiological processes in the body. This study aims to predict thyroid disorders in diabetic patients using six machine learning algorithms: Random Forest (RF), Decision Tree (DT), K-Nearest Neighbors (KNN), Logistic Regression (LR), Naïve Bayes (NB), and Support Vector Machine (SVM). A locally sourced dataset of 44,534 instances from diabetic patients was preprocessed through data cleaning, encoding, and class balancing. Two balancing techniques were employed: manual balancing and RandomUnderSampler. The models were evaluated with stratified 10-fold cross-validation, which partitions the data into training and testing sets while preserving the class proportions in each fold. Each algorithm's performance was assessed using metrics including accuracy and F1-score. RF outperformed the other models, achieving the highest accuracy: 95% on the manually balanced dataset and 84% with RandomUnderSampler. The corresponding F1-scores for RF were 95% and 82%, indicating its robustness in handling imbalanced datasets. This study highlights the importance of selecting appropriate preprocessing techniques and machine learning methods for healthcare datasets. The findings can help healthcare providers diagnose thyroid disorders in diabetic patients earlier and intervene sooner, potentially improving patients' quality of life and overall healthcare outcomes.
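The evaluation protocol described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code. It assumes scikit-learn and NumPy, uses a synthetic imbalanced dataset from `make_classification` in place of the study's data, and leaves all hyperparameters at library defaults apart from `max_iter` and the random seeds. The helper `random_undersample` is a hypothetical plain-NumPy stand-in for the RandomUnderSampler technique named above. The scores it prints say nothing about the results reported in the study.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def random_undersample(X, y, seed=0):
    """Hypothetical helper: randomly drop rows from the larger classes
    until every class has as many rows as the smallest one."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    n_min = counts.min()
    keep = np.concatenate([
        rng.choice(np.flatnonzero(y == c), size=n_min, replace=False)
        for c in classes
    ])
    keep.sort()
    return X[keep], y[keep]


# Synthetic, imbalanced stand-in data (assumption: NOT the study's dataset).
X, y = make_classification(
    n_samples=1000, n_features=10, n_informative=5,
    weights=[0.85, 0.15], random_state=0,
)

# Balance first, then cross-validate, following the order in the abstract.
X_bal, y_bal = random_undersample(X, y)

# The six algorithm families named in the abstract, with default settings.
models = {
    "RF": RandomForestClassifier(random_state=0),
    "DT": DecisionTreeClassifier(random_state=0),
    "KNN": KNeighborsClassifier(),
    "LR": LogisticRegression(max_iter=1000),
    "NB": GaussianNB(),
    "SVM": SVC(),
}

# Stratified 10-fold cross-validation keeps class proportions in each fold.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

results = {}
for name, model in models.items():
    scores = cross_validate(model, X_bal, y_bal, cv=cv,
                            scoring=["accuracy", "f1"])
    results[name] = (scores["test_accuracy"].mean(),
                     scores["test_f1"].mean())

for name, (acc, f1) in results.items():
    print(f"{name}: accuracy={acc:.3f}, F1={f1:.3f}")
```

The sketch balances the whole dataset before splitting because that is the order the abstract describes. A common alternative is to resample only the training portion of each fold, so that every test fold keeps the original class distribution.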
DOI: https://doi.org/10.31449/inf.v49i12.6927

This work is licensed under a Creative Commons Attribution 3.0 License.