Document Type

Article

Publication Date

2025

DOI

10.5120/ijca2025925995

Publication Title

International Journal of Computer Applications

Volume

187

Issue

57

Pages

9-16

Abstract

Diabetes remains a critical global health challenge, and early detection is crucial for effective management. This study presents a comprehensive benchmarking analysis of 14 diverse machine learning and Bayesian models for early-stage diabetes risk prediction using clinical data [2] from Sylhet, Bangladesh. The study evaluated traditional methods (Logistic Regression, Decision Trees), ensemble techniques (Random Forest, XGBoost, LightGBM), Bayesian approaches (BART, Bayesian Logistic Regression), and advanced neural architectures (Deep Belief Networks) using both 70-30 train-test splits and 10-fold cross-validation. The results demonstrate that ensemble methods consistently outperformed other approaches, with Random Forest (RF) achieving the highest cross-validated AUC (0.9951) and accuracy (0.9699). The study provides valuable insights into model selection for clinical decision support systems and highlights the robustness of tree-based ensemble methods for medical diagnosis tasks.

Rights

© 2025 The Foundation of Computer Science. All rights reserved.

Included with the kind written permission of the copyright holder.

Comments

Published by the Foundation of Computer Science.

Original Publication Citation

Hossain, M. I., & Porno, N. (2025). Comprehensive benchmarking of several machine learning and Bayesian models for early-stage diabetes risk prediction: A large-scale comparative study. International Journal of Computer Applications, 187(57), 9-16. https://doi.org/10.5120/ijca2025925995
