Informatics in Medicine Unlocked (Jan 2024)
An ML-based decision support system for reliable diagnosis of ovarian cancer by leveraging explainable AI
Abstract
Ovarian cancer (OC) is one of the most prevalent types of cancer in women. Early and accurate diagnosis is crucial for patient survival. However, most women are diagnosed at an advanced stage because effective biomarkers and accurate screening tools are lacking. While previous studies sought a common biomarker, our study suggests different biomarkers for the premenopausal and postmenopausal populations, which offers a new perspective in the search for novel predictors for the effective diagnosis of OC. A genetic algorithm is used to identify the most significant biomarkers. An XGBoost classifier is then trained on the selected features, achieving ROC-AUC scores of 0.864 and 0.911 for the premenopausal and postmenopausal populations, respectively. Lack of explainability is a major limitation of current AI systems. The stochastic nature of ML algorithms raises concerns about the reliability of such systems, because the reasons behind their decisions are difficult to interpret. To increase the trustworthiness and accountability of the diagnostic system, and to provide transparency and explanations for its predictions, explainable AI is incorporated into the ML framework. SHAP (SHapley Additive exPlanations) is employed to quantify the contributions of the selected biomarkers and determine the most discriminative features. Combining SHAP with the ML models enables clinicians to investigate individual decisions made by the model and gain insight into the factors leading to each prediction. The result is a hybrid decision support system that can eliminate the bottlenecks caused by the black-box nature of ML algorithms, providing a safe and trustworthy AI tool. The diagnostic accuracy of the proposed system exceeds that of existing methods, including the state-of-the-art Risk of Ovarian Malignancy Algorithm (ROMA), by a substantial margin, which signifies its potential as an effective tool in the differential diagnosis of OC.
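The per-prediction attribution described in the abstract can be illustrated with a small worked example. The sketch below is not the authors' pipeline and does not use the SHAP library or XGBoost. It computes exact Shapley values by enumerating feature coalitions, which is the quantity SHAP is designed to estimate efficiently. The model `toy_risk`, its weights, the marker names, the patient values, and the all-zero baseline are hypothetical and chosen only for illustration.

```python
import math
from itertools import combinations


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x, relative to a baseline input.

    Features outside a coalition are replaced by their baseline value.
    The cost grows as 2^n, so this is only practical for a few features.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size: |S|! (n-|S|-1)! / n!
            weight = (math.factorial(size) * math.factorial(n - size - 1)
                      / math.factorial(n))
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of feature i when added to coalition s
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical toy risk model over three made-up, standardized biomarker values.
# The weights are illustrative only and are not taken from the paper.
def toy_risk(z):
    score = 1.2 * z[0] + 0.8 * z[1] - 0.5 * z[2] + 0.6 * z[0] * z[1]
    return 1.0 / (1.0 + math.exp(-score))


patient = [1.5, 0.4, -0.3]    # hypothetical patient
reference = [0.0, 0.0, 0.0]   # hypothetical baseline (for example, a cohort mean)

phi = shapley_values(toy_risk, patient, reference)
for name, contribution in zip(["marker_a", "marker_b", "marker_c"], phi):
    print(f"{name}: {contribution:+.4f}")

# Efficiency property: the contributions sum to the prediction minus the
# baseline prediction, so each explanation fully accounts for the risk shift.
gap = sum(phi) - (toy_risk(patient) - toy_risk(reference))
print(abs(gap) < 1e-9)
```

Each value states how much one marker moves this patient's predicted risk away from the baseline prediction, which is the kind of case-level explanation a clinician would inspect. Exact enumeration is exponential in the number of features, so a real system would rely on an efficient estimator such as the tree-specific algorithm provided by the SHAP library.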