SAFE-MIL: a statistically interpretable framework for screening potential targeted therapy patients based on risk estimation

Yanfang Guan; Zhengfa Xue; Jiayin Wang; Xinghao Ai; Rongrong Chen; Xin Yi; Shun Lu; Yuqian Liu

doi:10.3389/fgene.2024.1381851

Frontiers in Genetics (Aug 2024)

Affiliations

Yanfang Guan: School of Computer Science and Technology, Xi’an Jiaotong University, Xi’an, China
Yanfang Guan: Shaanxi Engineering Research Center of Medical and Health Big Data, Xi’an Jiaotong University, Xi’an, China
Yanfang Guan: Geneplus Beijing Institute, Beijing, China
Zhengfa Xue: School of Computer Science and Technology, Xi’an Jiaotong University, Xi’an, China
Zhengfa Xue: Shaanxi Engineering Research Center of Medical and Health Big Data, Xi’an Jiaotong University, Xi’an, China
Jiayin Wang: School of Computer Science and Technology, Xi’an Jiaotong University, Xi’an, China
Jiayin Wang: Shaanxi Engineering Research Center of Medical and Health Big Data, Xi’an Jiaotong University, Xi’an, China
Xinghao Ai: Shanghai Chest Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
Rongrong Chen: Geneplus Beijing Institute, Beijing, China
Xin Yi: Geneplus Beijing Institute, Beijing, China
Shun Lu: Shanghai Chest Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
Yuqian Liu: School of Computer Science and Technology, Xi’an Jiaotong University, Xi’an, China
Yuqian Liu: Shaanxi Engineering Research Center of Medical and Health Big Data, Xi’an Jiaotong University, Xi’an, China

DOI: https://doi.org/10.3389/fgene.2024.1381851
Journal volume & issue: Vol. 15

Abstract

Patients with a target gene mutation frequently derive significant clinical benefit from targeted therapy. However, differences in mutation abundance among patients result in varying survival benefits, even among patients with the same target gene mutations. Currently, there is a lack of rational and interpretable models to assess the risk of treatment failure. In this study, we investigated the underlying coupled factors contributing to variations in medication sensitivity and established a statistically interpretable framework, named SAFE-MIL, for risk estimation. We first constructed an effectiveness label for each patient by exploring the optimal grouping of patients’ positive judgment values, and then sampled patients into 600 and 1,000 groups based on multi-instance learning (MIL). A novel and interpretable loss function was further designed for this framework based on the Hosmer-Lemeshow test. By integrating multi-instance learning with the Hosmer-Lemeshow test, SAFE-MIL can estimate the risk of drug treatment failure across diverse patient cohorts while simultaneously providing the optimal threshold for risk stratification. We conducted a comprehensive case study involving 457 non-small cell lung cancer patients with EGFR mutations treated with EGFR tyrosine kinase inhibitors. Results demonstrate that SAFE-MIL outperforms traditional regression methods in accuracy and can accurately assess patients’ risk stratification. This underscores its ability to capture inter-patient variability in risk while providing statistical interpretability. SAFE-MIL can help guide clinical decision-making regarding the use of drugs in targeted therapy and provides an interpretable computational framework for other patient stratification problems.
In this case study, the SAFE-MIL framework captured inter-patient variability in risk while providing statistical interpretability. It outperformed traditional regression methods and can help guide clinical decision-making on the use of drugs for targeted therapy. SAFE-MIL offers an interpretable computational framework that can be applied to other patient stratification problems, with the aim of improving the precision of risk assessment in personalized medicine. The source code for SAFE-MIL is available at https://github.com/Nevermore233/SAFE-MIL.
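
For readers unfamiliar with the Hosmer-Lemeshow test mentioned in the abstract, the following is a minimal sketch of the standard Hosmer-Lemeshow statistic, which compares observed and expected event counts within risk-sorted groups. It is not the SAFE-MIL loss function, whose exact form is not given here. The function name, the argument names, and the equal-size grouping rule are assumptions made for illustration only.

```python
def hosmer_lemeshow_statistic(y_true, y_prob, n_groups=10):
    """Standard Hosmer-Lemeshow statistic (illustrative helper, not from SAFE-MIL).

    y_true: observed binary outcomes (0 or 1), one per patient.
    y_prob: predicted event probabilities, one per patient.
    n_groups: number of risk-sorted groups (hypothetical default of 10).
    """
    # Sort patients by predicted risk only, so ties keep their input order.
    pairs = sorted(zip(y_prob, y_true), key=lambda t: t[0])
    n_total = len(pairs)
    stat = 0.0
    for g in range(n_groups):
        # Split the sorted patients into roughly equal-sized groups.
        start = g * n_total // n_groups
        end = (g + 1) * n_total // n_groups
        group = pairs[start:end]
        if not group:
            continue
        n = len(group)
        expected = sum(p for p, _ in group)  # expected number of events
        observed = sum(y for _, y in group)  # observed number of events
        mean_p = expected / n
        denom = n * mean_p * (1.0 - mean_p)
        # Skip groups whose mean predicted risk is exactly 0 or 1.
        if denom > 0:
            stat += (observed - expected) ** 2 / denom
    return stat


# Toy example: predictions that disagree with the outcomes give a large value.
calibrated = hosmer_lemeshow_statistic([0, 1, 0, 1], [0.5, 0.5, 0.5, 0.5], n_groups=2)
miscalibrated = hosmer_lemeshow_statistic([1, 1, 0, 0], [0.2, 0.2, 0.8, 0.8], n_groups=2)
```

A smaller value indicates closer agreement between predicted and observed risk within each group. In the classical test, the statistic is compared against a chi-square distribution with n_groups - 2 degrees of freedom.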

Published in Frontiers in Genetics

ISSN: 1664-8021 (Online)
Publisher: Frontiers Media S.A.
Country of publisher: Switzerland
LCC subjects: Science: Biology (General): Genetics
Website: http://journal.frontiersin.org/journal/genetics
