1. Introduction
Diabetic retinopathy (DR) is a common microvascular complication of diabetes mellitus (DM) and a leading cause of vision loss in diabetic patients [
1]. DR is associated with multiple risk factors, including hyperglycemia, hyperlipidemia, hypertension, and genetic factors [
2]. DR can be classified into two distinct stages: non-proliferative DR (NPDR) and proliferative DR (PDR), based on the presence or absence of neovascularization [
3]. DR damages the retina and can cause blindness. High blood sugar levels alter the retinal blood vessels, which can lead to leakage, hemorrhage, and irregular vessel development. Because DR progresses through stages, early detection is essential for its effective management. DR can be avoided, or its progression slowed, by managing diabetes effectively through diet, exercise, and medication. In more severe cases, laser therapy or surgery may be necessary to safeguard eyesight. Preventing vision loss in patients with diabetes therefore requires early diagnosis and vigorous care [
4]. Epidemiological studies have identified several risk factors associated with the development and progression of DR, including a higher body mass index, a higher waist-to-hip ratio, smoking, congestive heart failure, chronic renal disease, hypertension, and poor glycemic control. The prevalence of DR varies depending on the population and type of diabetes, with rates ranging from 5.67% in prediabetes to 41.1% in diabetes patients at tertiary care centers. Risk factors implicated across various populations and diabetes types include obesity, hypertension, longer diabetes duration, insulin therapy, neuropathy, nephropathy, and dyslipidemia [
5,
6].
Current DR treatment strategies focus on preventing its progression and managing complications to preserve vision. Glycemic control is key, as maintaining optimal blood glucose levels can significantly reduce the risk of developing DR and slow its progression. Anti-VEGF (vascular endothelial growth factor) therapy with repeated doses of drugs such as ranibizumab and aflibercept is used to treat diabetic macular edema (DME) and PDR; it reduces macular edema by reducing inflammation and preventing abnormal blood vessel growth. Laser photocoagulation, which seals leaky blood vessels and reduces the risk of severe vision loss, remains the mainstay of therapy, especially for PDR and focal DME. Intravitreal steroids, such as triamcinolone acetonide, are also used to reduce inflammation, particularly in cases that respond poorly to anti-VEGF therapy. In advanced PDR, vitrectomy surgery may be required to treat vitreous hemorrhage or tractional retinal detachment and restore normal retinal anatomy and function. Several advanced methods are available for detecting and monitoring retinal changes in the diagnosis of DR. Direct and indirect ophthalmoscopy allows physicians to visualize the retina and detect signs such as microaneurysms, hemorrhages, and exudates. Fundus imaging provides detailed visualization of the retina for recording and tracking DR progression and is commonly used in screening programs [
7,
8].
Metabolomics profiling, which involves the comprehensive quantitative analysis of small-molecule metabolites in biological specimens like blood and urine, has advanced significantly in recent years [
9]. The metabolic phenotype reflects the intricate interplay between genetic and environmental factors and provides valuable insights into the pathophysiological conditions of various diseases, including DR [
10]. Several large-scale metabolomics profiling studies have been conducted to identify metabolites associated with disease progression, particularly in DR. However, there is still a need to identify additional metabolites that could serve as reliable biomarkers for DR progression and aid in the early treatment and prevention of diabetic complications [
11]. Additionally, targeted metabolomics has been used to analyze the metabolome data of DR patients, revealing significant differences in the concentrations of specific metabolites among non-DR (NDR), NPDR, and PDR patients with type 2 DM (T2DM). This approach provides valuable insights into the metabolic changes associated with different stages of DR in diabetic individuals [
12].
Hybrid artificial intelligence (AI) refers to the merging of several AI methodologies or models to boost performance, while explainable AI (XAI) refers to methods that make model predictions interpretable. Hybrid approaches have become more prominent in AI research because of the varied needs of the scientific, public, and commercial sectors. For example, a study focused on mask-wearing status employed a mixture of convolutional neural networks, including SqueezeNet, InceptionV3, VGG16, and VGG19, together with several machine learning (ML) models, to develop hybrid models for classification, achieving excellent accuracy [
13].
In the present study, we propose a methodology that integrates a hybrid AI model and XAI approaches for early diagnosis and determination of metabolomic biomarkers of DR in patients with T2DM. Unlike existing methods that usually rely solely on clinical parameters or traditional imaging techniques, our approach uses advanced machine learning algorithms to identify specific metabolites whose concentrations vary among individuals with NDR, NPDR, and PDR. This research attempts to uncover distinct biochemical signatures related to different subclasses of DR, providing important molecular-level insights into disease etiology and development. By combining several learning algorithms in a hybrid approach, we aim to increase prediction accuracy and sensitivity and to provide a more robust and reliable tool for clinical decision-making in DR. This methodology may enable the establishment of tailored treatment strategies and more successful screening procedures for DR.
3. Results
The study sample consisted of T2DM patients (n = 317), comprising NDR (n = 143) and DR (n = 174) individuals. DR patients were further divided into two groups according to disease severity: NPDR (n = 123) and PDR (n = 51). Significant age differences were observed across the groups (H = 23.34, p < 0.00001). The median age and interquartile range (IQR) for each group were as follows: NDR (median = 55, IQR = [50, 60]), NPDR (median = 58, IQR = [53, 63]), and PDR (median = 60, IQR = [55, 65]). Post-hoc analyses using the Bonferroni correction showed significant differences between NDR and NPDR (p = 0.0000126) and between NDR and PDR (p = 0.000551), but not between NPDR and PDR (p = 0.865). Significant differences were also found in HbA1c levels (H = 17.42, p < 0.0002). The median HbA1c values were as follows: NDR (6.0%, IQR = [5.7, 6.3]), NPDR (7.2%, IQR = [6.8, 7.6]), and PDR (7.5%, IQR = [7.1, 8.0]). Pairwise comparisons indicated significant differences between NDR and NPDR (p = 0.000122) and between NDR and PDR (p = 0.00418), but not between NPDR and PDR (p = 0.824). Glucose levels differed across the groups (H = 10.01, p = 0.0067). Median glucose levels were as follows: NDR (90 mg/dL, IQR = [85, 95]), NPDR (120 mg/dL, IQR = [110, 130]), and PDR (125 mg/dL, IQR = [115, 135]). The Bonferroni-adjusted pairwise tests revealed significant differences between NDR and NPDR (p = 0.00155), but not between NDR and PDR (p = 0.12) or between NPDR and PDR (p = 0.617). Creatinine levels also showed significant differences (H = 27.06, p < 0.000001). The median values were as follows: NDR (0.9 mg/dL, IQR = [0.8, 1.0]), NPDR (1.1 mg/dL, IQR = [1.0, 1.2]), and PDR (1.3 mg/dL, IQR = [1.2, 1.4]). Significant differences were found between all pairs of groups: NDR and NPDR (p = 0.00816), NDR and PDR (p = 0.000000692), and NPDR and PDR (p = 0.000874).
We examined the association between gender and the retinopathy groups using a Chi-square test, which indicated no significant association (χ² = 0.768, p = 0.681).
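The group comparisons above can be illustrated with a minimal Python sketch. It assumes that H denotes the Kruskal-Wallis statistic (the test is not named in this section) and that gender was recorded in two categories, so that both H and χ² are referred to a chi-square distribution with 2 degrees of freedom, whose upper-tail probability is exp(-x/2). The function names `kruskal_wallis_h`, `chi2_sf_df2`, and `bonferroni` are hypothetical, and the toy data are illustrative only.

```python
import math
from itertools import groupby


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic using average ranks and the tie correction."""
    pooled = sorted((value, g) for g, values in enumerate(groups) for value in values)
    n_total = len(pooled)
    rank_sums = [0.0] * len(groups)
    tie_term = 0.0
    position = 0  # number of observations already ranked
    for _, run in groupby(pooled, key=lambda pair: pair[0]):
        run = list(run)
        t = len(run)
        avg_rank = position + (t + 1) / 2  # mean of ranks position+1 .. position+t
        for _, g in run:
            rank_sums[g] += avg_rank
        tie_term += t**3 - t
        position += t
    weighted = sum(r * r / len(grp) for r, grp in zip(rank_sums, groups))
    h = 12 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)
    return h / (1 - tie_term / (n_total**3 - n_total))


def chi2_sf_df2(x):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    return math.exp(-x / 2)


def bonferroni(p, n_comparisons):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p * n_comparisons)


# Toy data for three groups (not study data).
toy_h = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
toy_p = chi2_sf_df2(toy_h)

# P-values implied by the statistics reported in the text, under the stated assumptions.
p_glucose = chi2_sf_df2(10.01)
p_gender = chi2_sf_df2(0.768)
```

Under these assumptions, H = 10.01 corresponds to p ≈ 0.0067 and χ² = 0.768 to p ≈ 0.681, which agrees with the values reported above. With three pairwise comparisons, `bonferroni(p, 3)` would give the adjusted post-hoc p-value.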
Table 1 presents the results of the non-hybrid models. In addition, two-stage ensemble models were trained in this study. As explained before, in the first stage of this approach, SVC was trained as a classifier to compute prediction probabilities. In the second stage, several other models were trained using these prediction probabilities as extra features. This process yielded four models: SVC + RF, SVC + DT, SVC + LR, and SVC + MLP.
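The two-stage idea described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: `CentroidClassifier` is a hypothetical toy model that stands in for both the first-stage SVC and the second-stage learner, `TwoStageEnsemble` is a hypothetical wrapper, and the data are invented. The text does not state whether the first-stage probabilities were computed in-sample or out-of-fold; the sketch uses in-sample probabilities for brevity.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class CentroidClassifier:
    """Toy probabilistic classifier (hypothetical stand-in for SVC, RF, MLP, ...)."""

    def fit(self, X, y):
        self.classes_ = sorted(set(y))
        self.centroids_ = []
        for c in self.classes_:
            rows = [x for x, label in zip(X, y) if label == c]
            self.centroids_.append([sum(col) / len(rows) for col in zip(*rows)])
        return self

    def predict_proba(self, X):
        # Softmax over negative squared distances to each class centroid.
        return [
            softmax([-sum((a - b) ** 2 for a, b in zip(x, c)) for c in self.centroids_])
            for x in X
        ]

    def predict(self, X):
        return [self.classes_[p.index(max(p))] for p in self.predict_proba(X)]


class TwoStageEnsemble:
    """Stage 1 produces class probabilities; stage 2 sees them as extra features."""

    def __init__(self, first_stage, second_stage):
        self.first_stage = first_stage
        self.second_stage = second_stage

    def _augment(self, X):
        probs = self.first_stage.predict_proba(X)
        return [list(x) + list(p) for x, p in zip(X, probs)]

    def fit(self, X, y):
        self.first_stage.fit(X, y)
        self.second_stage.fit(self._augment(X), y)
        return self

    def predict(self, X):
        return self.second_stage.predict(self._augment(X))


# Invented two-feature data with the three class labels used in the study.
X = [[0, 0], [0, 1], [5, 5], [5, 6], [10, 0], [10, 1]]
y = ["NDR", "NDR", "NPDR", "NPDR", "PDR", "PDR"]
model = TwoStageEnsemble(CentroidClassifier(), CentroidClassifier()).fit(X, y)
augmented = model._augment(X)
predictions = model.predict([[1, 0], [5, 4], [9, 1]])
```

Each augmented row holds the original features followed by one probability per class, so the second-stage model can weight the first-stage opinion alongside the raw measurements.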
Table 2 shows the performance metrics of the ensemble models trained using the T2DM dataset. According to these results, applying the proposed two-stage ensemble approach improved the performance measures of every ML model used in the study. In addition, the deep neural network (MLP) model achieved the best results among both the single and the hybrid models. The deep MLP model was therefore considered the most suitable basis for identifying biomarkers. To this end, the models were explained using the SHAP method, which assesses the impact of each feature on the model’s predictions. The feature impacts computed by SHAP for the MLP model, which outperformed the other models, are shown in the following figures.
Figure 2 illustrates the different patterns of feature importance across the DR classes, indicating that certain biochemical and physiological parameters are more relevant for certain conditions. Glucose and glycine show significant importance in all classes but are particularly impactful in the NPDR class. The analysis of SHAP values across different DR classes provides insights into the model’s behavior and decision-making process. The model leverages a complex interaction of features in which metabolic and age-related factors such as glucose, glycine, and age appear consistently across the classes, suggesting their universal importance in the pathology of DR. The prominence of specific metabolites and amino acids such as taurine, creatinine, and various phosphatidylcholines highlights the potential metabolic underpinnings of DR progression. This could suggest pathways for targeted therapeutic interventions or candidate biomarkers for clinical settings. In the PDR class, the marked influence of creatinine and the phosphatidylcholine molecules suggests a shift towards more systemic and nephrological influences as DR progresses to more severe forms. This shift is crucial for understanding how DR could be connected to broader systemic conditions.
Figure 3 demonstrates which biochemical and physiological features are most influential across different DR classes. Features such as HbA1c, Tyr (tyrosine), and various phosphatidylcholine molecules (e.g., PC.ae.C36.2) appear frequently across the plots, indicating their significant role in the model’s decision-making process. Certain features have more pronounced impacts in specific DR classes. HbA1c and Tyr have a more substantial influence in the PDR class than in the NDR and NPDR classes, which suggests that these features may be particularly relevant for identifying more severe stages of diabetic retinopathy. The comparison between classes highlights that different features carry different weights depending on the severity of the condition. Creatinine and citrate show significant importance in the NDR class, which might indicate their utility in distinguishing no retinopathy from some degree of retinopathy. Overall, the study findings indicate that the two-stage hybrid strategy delivers more favorable outcomes. Inspection of feature importance in the hybrid model shows that the prediction probabilities produced by the first-stage machine learning model exert the most significant effect on the final class prediction. The boost in the performance of the two-stage hybrid approach may therefore be attributable to the prediction probabilities provided by the first-stage method.
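The SHAP attributions discussed above are based on Shapley values. The following Python sketch computes exact Shapley values by enumerating feature subsets against a single baseline input. It is a simplified illustration of the underlying idea, not the estimator used by the SHAP library or by the authors; the function `shapley_values`, the toy scoring functions, and the all-zero baseline are assumptions made for the example. The enumeration is exponential in the number of features, so it is only practical for a handful of inputs.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to predict(baseline).

    Features outside a coalition are set to their baseline value.
    """
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = set(subset)
                phi[i] += weight * (value(coalition | {i}) - value(coalition))
    return phi


def linear_score(features):
    """Toy additive model: each feature contributes independently."""
    return 2 * features[0] + 3 * features[1] - features[2]


def interaction_score(features):
    """Toy model in which two features only matter together."""
    return features[0] * features[1]


phi_linear = shapley_values(linear_score, [1, 1, 1], [0, 0, 0])
phi_interaction = shapley_values(interaction_score, [1, 1], [0, 0])
```

For the additive toy model, each attribution equals the feature's own coefficient; for the interaction model, the joint effect is split equally between the two features. In both cases the attributions sum to the difference between the prediction for `x` and the prediction for the baseline.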
4. Discussion
The present study presents a comprehensive evaluation of solo and hybrid machine learning models for DR prediction and analysis using the T2DM dataset. A substantial boost in prediction performance was obtained with the two-stage ensemble models compared with the non-hybrid, solo models. This suggests a strategic benefit of combining various learning algorithms to increase prediction accuracy, precision, and other performance measures.
The solo models analyzed (SVC, RF, DT, LR, and MLP) displayed commendable individual performances, with the MLP model outperforming the others in terms of accuracy, precision, and F-scores. The MLP’s superior performance aligns with previous findings that deep learning models often outperform traditional machine learning models in medical image analysis due to their ability to learn complex patterns from large datasets [
22]. One of the main purposes of an ensemble method is to correct errors made by one method using another method. In this study, the prediction probabilities calculated by the first-stage classifier (SVC) were used as input for a second model to create an ensemble. Studies in the literature show that ensemble methods generally achieve better results than individual methods. Another factor that affects the performance of an ensemble is the individual success of each method it contains. Therefore, it was expected that the best score observed in our study would be obtained with an ensemble method.
When examining the ensemble models (SVC + RF, SVC + DT, SVC + LR, and SVC + MLP), the SVC + MLP configuration showed the highest improvement in all metrics. This enhancement is consistent with that reported in other research, which found that layering different types of models could lead to more robust predictions in biomedical applications by capturing diverse patterns that solo models might miss [
23].
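For reference, the performance measures compared above (accuracy, precision, and F-score) can be computed for a three-class problem as in the following Python sketch. Macro-averaging over the classes is an assumption made here, since the averaging scheme is not stated in this section; the function `macro_metrics` and the toy labels are illustrative only.

```python
def macro_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1 over all classes."""
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1_scores = [], [], []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1_scores.append(f1)
    n_classes = len(labels)
    return {
        "accuracy": sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true),
        "precision": sum(precisions) / n_classes,
        "recall": sum(recalls) / n_classes,
        "f1": sum(f1_scores) / n_classes,
    }


# Toy labels (not study data).
y_true = ["NDR", "NDR", "NPDR", "NPDR", "PDR", "PDR"]
y_pred = ["NDR", "NPDR", "NPDR", "NPDR", "PDR", "NDR"]
metrics = macro_metrics(y_true, y_pred)
```

Macro-averaging gives each class equal weight regardless of its size, which matters when the groups are unbalanced, as they are here (143 NDR, 123 NPDR, and 51 PDR patients).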
Furthermore, using the SHAP method to interpret model predictions provided insights into feature importance in disease progression, particularly in the different classes of DR. The significant roles of glucose, glycine, and age across all DR classes suggest their universal importance in DR pathology. This finding is corroborated by research that highlighted metabolic and age-related factors as critical in DR progression [
24]. Moreover, the analysis revealed varying impacts of specific biochemical markers across different DR severity levels. Creatinine and various phosphatidylcholine molecules exhibited higher importance in the more severe DR class (PDR), similar to previous observations suggesting a link between nephrological markers and severe DR conditions [
25].
The results of this study demonstrate considerable variation in age, HbA1c, glucose, and creatinine levels across the different stages of diabetic retinopathy, underscoring the relevance of these biomarkers in monitoring disease progression. Notably, the increase in median HbA1c, glucose, and creatinine values from the NDR group through the NPDR group to the PDR group suggests a link with the deterioration of the disease condition. These findings are consistent with recent research suggesting that prolonged exposure to high glucose levels could increase the severity of retinopathy, presumably due to increased oxidative stress and vascular damage inside the retina [
26]. Moreover, the significant age differences identified across the groups further support the hypothesis that the risk and development of diabetic retinopathy worsen with age. This is in line with the broader consensus in the field that older age is a significant risk factor for the development of more severe diabetes-related problems [
27]. Our study identified no significant gender differences across the retinopathy groups, suggesting that the physiologic consequences of diabetes on retinal health could be similar across genders. This accords with clinical research that has questioned established ideas about gender discrepancies in diabetes outcomes, claiming instead that lifestyle and medication adherence may play larger roles [
28]. These findings are relevant for clinicians and researchers alike, as they provide knowledge that can inform screening and monitoring protocols for diabetic retinopathy. By recognizing the importance of age, HbA1c, glucose, and creatinine as indicators, healthcare practitioners can optimize patient outcomes through earlier intervention and personalized treatment programs [
29]. In contrast, some studies have reported minimal improvements in model performance when combining classifiers in the same way. This discrepancy could be attributed to differences in datasets, feature sets, or model tuning, highlighting the context-dependent nature of machine learning applications in healthcare [
30,
31].
Glucose consistently emerges as the most important metabolite in both the solo and hybrid MLP models at all stages of DR. This finding is consistent with the well-established role of diabetes mellitus and high glucose levels in DR development. Elevated glucose levels are a hallmark of diabetes and are directly linked to nerve damage and subsequent retinal problems. Chronic hyperglycemia leads to the production of advanced glycation end products (AGEs), which accumulate and cause structural and functional abnormalities in retinal blood vessels. This process leads to increased vascular permeability, the growth of microaneurysms, and eventually the breakdown of the blood–retinal barrier. In addition, excess glucose can activate multiple signaling pathways that increase inflammation and oxidative stress, exacerbating retinal damage. The consistent role of glucose as a key metabolic factor across the DR models and stages emphasizes the critical importance of strict glycemic control in preventing and controlling this disorder. Targeting the associated metabolic disturbances to reduce the risk of DR progression and its consequences may ultimately improve patient outcomes and quality of life. This insight into the primary role of glucose also highlights the importance of continued research and innovation to develop effective ways to maintain adequate glycemic levels in patients with diabetes [
32]. Managing glucose levels is thus paramount in preventing DR progression. Glycine is another significant metabolite, particularly highlighted in the hybrid MLP models. Glycine’s role in neurotransmission and as a metabolic regulator suggests its involvement in the metabolic disturbances associated with diabetes [
33]. Elevated glycine levels may indicate an impaired glucose metabolism and increased oxidative stress, contributing to DR development. HbA1c, a measure of long-term glycemic control, is crucial in predicting DR stages. Its importance reflects the necessity for sustained glucose management to mitigate DR risks. Elevated HbA1c levels are strongly correlated with the severity of retinopathy [
34]. Phosphatidylcholines, including PC.aa.C34.2 and PC.aa.C38.6, are particularly prominent in the solo MLP models for the NPDR and PDR classes. These metabolites are essential components of cell membranes and play significant roles in lipid metabolism. Altered PC levels suggest disruptions in lipid homeostasis and cellular integrity, contributing to retinal damage [
35]. Several amino acids, such as glutamine (Gln), alanine (Ala), valine (Val), threonine (Thr), and arginine (Arg), are also highlighted based on the comprehensive analysis. These amino acids are vital for protein synthesis and energy metabolism, and their altered levels can indicate broader metabolic dysregulation in diabetes. Creatinine, a marker of renal function, is significant in both the NPDR and PDR classes. This is consistent with the high prevalence of diabetic nephropathy in advanced DR stages. Age also emerges as an important factor, reflecting the increased risk of DR with advancing age [
36,
37]. Other metabolites, such as ornithine and proline, which are involved in the urea cycle and amino acid metabolism, as well as Trp and Tyr, precursors to neurotransmitters, suggest potential links between metabolic and neurodegenerative processes in diabetes [
38].
The study’s findings emphasize the promise of advanced machine learning methodologies, particularly hybrid models, in enhancing the predictive accuracy and interpretability of health-related outcomes. Ensemble models and deep learning, along with interpretative techniques like SHAP, can greatly contribute to understanding complex disease mechanisms and improving diagnostic processes in clinical settings. Future studies might examine the integration of other models and the assessment of their interpretability to further enhance the predictions and insights produced by such analytical tools [
39].
The superior performance of hybrid models, particularly the SVC + MLP model, suggests that these models could significantly enhance diagnostic accuracy for DR. This precision is critical in distinguishing between the various stages of DR, allowing for earlier and more precise interventions and potentially reducing the progression to more severe stages that require invasive treatments. The integration of these models into clinical decision support systems (CDSS) can provide ophthalmologists with powerful tools to analyze patient data more efficiently. This integration can aid in making quicker and more accurate decisions, particularly in areas with limited access to specialized healthcare providers [
40]. The ability of the models to identify early signs of DR and to elucidate the metabolic and physiological markers associated with its progression offers a pathway to preventive healthcare strategies. By identifying at-risk individuals early, preventative measures can be taken sooner, which may include lifestyle and dietary changes, as well as closer monitoring of glucose and blood pressure levels. While the current study demonstrates significant potential, the scalability and adaptability of these models in different clinical settings remain to be tested. The models need to be validated not only across various demographics but also across different equipment and settings to ensure they maintain accuracy without high-grade, specialized equipment. Further research could explore combining the predictive power of machine learning models with other modalities like genetic testing or biomarker analysis to enhance predictive accuracy further. Additionally, longitudinal studies could assess how interventions based on model predictions affect patient outcomes over time. The hybrid SVC + MLP model outperformed the solo models in predicting and assessing DR, which may be attributed to various theoretical benefits inherent to ensemble learning. Ensemble learning integrates various learning algorithms to produce higher prediction performance than could be achieved from any of the component models alone. This strategy harnesses the strengths and mitigates the limitations of individual models, resulting in more robust and accurate forecasts.
The identified indicators, including glucose, glycine, HbA1c, and creatinine, could be incorporated into existing clinical practice to promote early identification and tailored treatment methods for DR. For practical implementation, these biomarkers need to be verified in larger, independent cohorts to ensure their reliability and generalizability. Such validation should include longitudinal studies that evaluate the evolution of DR in varied groups and contexts. Additionally, incorporating these biomarkers into clinical decision support systems (CDSS) may help ophthalmologists make more accurate and timely judgments, especially in resource-limited settings. The research implies that early identification of at-risk patients using these indicators might lead to preventive actions, such as lifestyle adjustments and tight glucose control, eventually improving patient outcomes. Therefore, future research should concentrate on confirming these biomarkers, understanding their significance in DR pathophysiology, and creating rigorous guidelines for their adoption in routine clinical practice. This will help ensure that the encouraging findings from machine learning models translate into actual gains in controlling and preventing DR in clinical settings.
Recent research has increasingly concentrated on the growing role of hypertension and environmental variables in DR development, as well as on the newest advances in AI applications in metabolomics. Studies addressing the relationship between metabolic diseases and hypertensive conditions provide a more comprehensive understanding of DR pathogenesis, and comparable AI applications in metabolomics place the present work within the larger context of contemporary technical advances in this quickly evolving field [
41,
42].
The performance of the machine learning models addressed in this work, particularly hybrid models such as SVC + MLP, points to a promising improvement in the early diagnosis and management of DR. The findings suggest that these models improve not only diagnostic accuracy but also precision, enabling earlier intervention, which is critical in limiting the progression of DR [
43]. This is particularly advantageous for places where access to expert healthcare practitioners is restricted. Furthermore, implementing these models in real-world clinical settings could significantly streamline the screening process, making it faster and more reliable. This would allow for a broader, more effective deployment of resources, potentially reducing the overall healthcare burden associated with late-stage DR treatments. By integrating these advanced predictive models into existing clinical workflows, there is an opportunity to transform current DR management practices, emphasizing preventative care and personalized treatment plans based on precise, data-driven insights. However, to realize their full potential, these models must undergo extensive clinical validation to ensure their efficacy and reliability across diverse patient demographics and varying clinical environments.