Fusion-Based AI Model for Early Risk Assessment of Metabolic Disorders
Abstract
Diabetes mellitus remains one of the most critical and widespread chronic diseases globally, posing severe health threats and contributing to millions of deaths annually. It is a metabolic disorder that, if not detected and managed early, can result in life-threatening complications such as cardiovascular disease, kidney failure, and nerve damage. Early risk assessment is vital for timely intervention and better long-term health outcomes. To address this challenge, a fusion-based artificial intelligence model has been developed for accurate, early prediction of diabetes risk. The approach combines heterogeneous data sources with ensemble machine learning algorithms to improve prediction accuracy. Three well-known datasets were used to check that performance is consistent across different demographic and clinical profiles: the Pima Indian Diabetes dataset (PIMA-ID-I), the Diabetes Dataset from Frankfurt Hospital in Germany (DDFH-G), and the Iraqi Diabetes Patient Dataset (IDPD-I). The Extra Tree-based Feature Selection (ExtraTree FS) technique was employed to identify and retain the most informative attributes while reducing redundancy. This technique ranks features efficiently by estimating their importance across an ensemble of randomized decision trees. Following feature selection, a voting-based ensemble classification strategy was adopted to improve model robustness and generalization. The ensemble integrates three classifiers (Boosted Decision Tree, Random Forest, and Bagged Extra Trees), each contributing distinct strengths in capturing complex data patterns and minimizing classification errors. Fusing these classifiers through a soft voting mechanism, in which their predicted class probabilities are averaged, enhances the stability and reliability of the predictions. The proposed AI-driven framework demonstrates strong potential for accurately identifying high-risk individuals across diverse populations. 
With high accuracy observed consistently across all three datasets, the model highlights the advantages of combining multiple techniques (feature selection, ensemble learning, and multi-source data fusion) to support early medical diagnostics. This approach not only improves predictive capability but also scales toward deployment in clinical decision support systems for metabolic disorder risk screening.
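The pipeline described above (Extra Trees feature selection followed by a soft-voting ensemble) can be sketched with scikit-learn. This is a minimal illustration under stated assumptions, not the authors' implementation: the data is a synthetic stand-in generated with make_classification rather than any of the three datasets, GradientBoostingClassifier is assumed as the "Boosted Decision Tree", the median importance threshold is an assumed selection rule, and all hyperparameters are placeholders.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    BaggingClassifier,
    ExtraTreesClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
    VotingClassifier,
)
from sklearn.feature_selection import SelectFromModel
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.tree import ExtraTreeClassifier

# Synthetic stand-in for a tabular diabetes dataset (assumption: 8 numeric
# features and a binary outcome; this is NOT PIMA-ID-I, DDFH-G, or IDPD-I).
X, y = make_classification(
    n_samples=600, n_features=8, n_informative=4, n_redundant=2, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Feature selection: keep features whose Extra Trees importance is at or
# above the median importance (assumed rule; the abstract does not give one).
selector = SelectFromModel(
    ExtraTreesClassifier(n_estimators=100, random_state=0), threshold="median"
)

# Soft voting: average the class probabilities of the three base models.
ensemble = VotingClassifier(
    estimators=[
        ("boosted_dt", GradientBoostingClassifier(random_state=0)),
        ("random_forest", RandomForestClassifier(n_estimators=100, random_state=0)),
        (
            "bagged_extra_trees",
            BaggingClassifier(
                ExtraTreeClassifier(random_state=0), n_estimators=50, random_state=0
            ),
        ),
    ],
    voting="soft",
)

model = Pipeline([("select", selector), ("vote", ensemble)])
model.fit(X_train, y_train)

proba = model.predict_proba(X_test)  # averaged class probabilities per sample
n_selected = int(model.named_steps["select"].get_support().sum())
print(f"features kept: {n_selected} of {X.shape[1]}")
print(f"held-out accuracy on synthetic data: {model.score(X_test, y_test):.3f}")
```

Placing the selector inside the Pipeline means feature importances are estimated only from the training split, which avoids leaking test information into the selection step. The accuracy printed here reflects the synthetic data only and says nothing about the results reported for the real datasets.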