Communications Biology (Aug 2025)
Interpretable and integrative analysis of single-cell multiomics with scMKL
Abstract
The rapid advancement of single-cell technologies has led to the development of various analysis methods, each with trade-offs between predictive power and interpretability, particularly for multimodal data integration. Complex machine learning models achieve high accuracy but often lack transparency, while simpler models are more interpretable but less effective for prediction. In this manuscript, we introduce an innovative method for single-cell analysis using Multiple Kernel Learning (scMKL) that merges the predictive capabilities of complex models with the interpretability of linear approaches, aimed at providing actionable insights from single-cell multiomics data. scMKL excels at classifying healthy and cancerous cell populations across multiple cancer types, utilizing data from single-cell RNA sequencing, ATAC sequencing, and 10x Multiome. It outperforms existing methods while delivering interpretable results that those methods have failed to achieve, identifying key transcriptomic and epigenetic features, as well as multimodal pathways, in breast, lymphatic, prostate, and lung cancers. Leveraging insights from one dataset to inform analysis in a new dataset, scMKL uncovers biological pathways that distinguish treatment responses in breast cancer, low-grade from high-grade prostate tumors, and subtypes in lung cancer, thereby enhancing our understanding of cancer biology and tumor progression.