Using the Machine Learning Algorithms for Accurate Prediction of Diabetes
DOI:
https://doi.org/10.33022/ijcs.v13i6.4488
Keywords:
Diabetes Prediction, Machine learning, Support Vector Machine, AdaBoost, Neural Networks, K-Nearest Neighbors, Random Forest, Logit Boost
Abstract
Diabetes has proven to be one of the most threatening illnesses affecting the body. It affects millions of people across the world, has contributed to increased mortality, and is associated with many complications, including blindness, kidney failure, amputations, heart failure, and microvascular and macrovascular complications. Studies show that early detection and effective management of diabetes remain crucial for preventing its complications and improving patient outcomes. Using available data, we apply machine learning algorithms, including the Support Vector Machine (SVM), AdaBoost (ADA), Neural Networks (NNET), K-Nearest Neighbors (KNN), Random Forest (RF), and Logit Boost (LOGIT), to predict diabetes among patients. We find that Logit Boost and AdaBoost stand out as the top performers for predicting diabetic patients: they exhibit high accuracy, strong AUC scores, and balanced, reliable performance across the evaluation metrics, making them suitable for this classification task. Neural Networks show excellent precision and low log loss, indicating strong probabilistic predictions, but their lower specificity suggests a higher false-positive rate. Random Forest demonstrates good recall but lower accuracy on the test set, indicating potential overfitting to the training data. SVM and KNN perform the weakest across most metrics, suggesting they may not be the best choices for this prediction task.
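The abstract compares models on accuracy, precision, recall, specificity, AUC, and log loss. As a reading aid, the sketch below shows one common way to compute these metrics for a binary classifier from true labels and predicted probabilities. It is not the authors' code: the function names, the 0.5 decision threshold, and the eight example labels and probabilities are all illustrative assumptions, not data from the study.

```python
import math

def confusion_counts(y_true, y_prob, threshold=0.5):
    """Count TP, FP, TN, FN after thresholding predicted probabilities."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-likelihood; lower means better-calibrated probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(y_true)

def auc(y_true, y_prob):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(
        1.0 if pp > pn else 0.5 if pp == pn else 0.0
        for pp in pos
        for pn in neg
    )
    return wins / (len(pos) * len(neg))

def evaluate(y_true, y_prob, threshold=0.5):
    """Collect the metrics named in the abstract into one dictionary."""
    tp, fp, tn, fn = confusion_counts(y_true, y_prob, threshold)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        # Low specificity means many non-diabetic cases flagged as diabetic.
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
        "auc": auc(y_true, y_prob),
        "log_loss": log_loss(y_true, y_prob),
    }

# Made-up example: 1 = diabetic, 0 = non-diabetic (NOT data from the study).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_prob = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]

metrics = evaluate(y_true, y_prob)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Specificity falls as false positives rise, which is why a model can pair high precision or low log loss with a comparatively high false-positive rate, and a gap between training and test accuracy is the usual sign of overfitting.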
License
Copyright (c) 2024 Abiola Olaide Ayodele, Adedeji Daniel Gbadebo
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.