Integration of deep learning with Ramachandran plot molecular dynamics simulation for genetic variant classification
Benjamin Tam,
Zixin Qin,
Bojin Zhao,
San Ming Wang,
Chon Lok Lei
Affiliations
Benjamin Tam
Ministry of Education Frontiers Science Center for Precision Oncology, Faculty of Health Sciences, University of Macau, Macau SAR, China; Cancer Centre, Faculty of Health Sciences, University of Macau, Macau SAR, China; Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Macau SAR, China
Zixin Qin
Ministry of Education Frontiers Science Center for Precision Oncology, Faculty of Health Sciences, University of Macau, Macau SAR, China; Cancer Centre, Faculty of Health Sciences, University of Macau, Macau SAR, China; Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Macau SAR, China
Bojin Zhao
Ministry of Education Frontiers Science Center for Precision Oncology, Faculty of Health Sciences, University of Macau, Macau SAR, China; Cancer Centre, Faculty of Health Sciences, University of Macau, Macau SAR, China; Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Macau SAR, China
San Ming Wang
Ministry of Education Frontiers Science Center for Precision Oncology, Faculty of Health Sciences, University of Macau, Macau SAR, China; Cancer Centre, Faculty of Health Sciences, University of Macau, Macau SAR, China; Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Macau SAR, China; Corresponding author
Chon Lok Lei
Ministry of Education Frontiers Science Center for Precision Oncology, Faculty of Health Sciences, University of Macau, Macau SAR, China; Cancer Centre, Faculty of Health Sciences, University of Macau, Macau SAR, China; Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Macau SAR, China; Corresponding author
Summary: Functional classification of genetic variants is key to their clinical application in patient care. However, the abundance of variant data generated by next-generation DNA sequencing technologies limits the use of experimental methods for their classification. Here, we developed DL-RP-MDS, a protein structure and deep learning (DL)-based system for genetic variant classification, which comprises two principles: 1) extracting protein structural and thermodynamic information using the Ramachandran plot-molecular dynamics simulation (RP-MDS) method, and 2) combining those data with an unsupervised auto-encoder model and a neural network classifier to identify statistically significant patterns of structural change. We observed that DL-RP-MDS provided higher specificity than over 20 widely used in silico methods in classifying variants of three DNA damage repair genes: TP53, MLH1, and MSH2. DL-RP-MDS offers a powerful platform for high-throughput genetic variant classification. The software and online application are available at https://genemutation.fhs.um.edu.mo/DL-RP-MDS/.
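
To make the first principle in the summary more concrete, the following is a minimal sketch of how backbone dihedral angles sampled along a simulation trajectory could be binned into a Ramachandran density and flattened into a feature vector for a downstream auto-encoder. It is an illustration under stated assumptions, not the DL-RP-MDS implementation: the function name `ramachandran_density`, the 36 x 36 grid (10-degree bins), and the synthetic angles are all hypothetical choices that do not come from the paper.

```python
import numpy as np


def ramachandran_density(phi, psi, bins=36):
    """Bin phi/psi dihedral angles (in degrees) into a normalized 2D histogram.

    Hypothetical helper: the bin count and normalization are illustrative
    assumptions, not parameters taken from DL-RP-MDS.
    """
    # Wrap angles into [-180, 180) so out-of-range values are not dropped.
    phi = (np.asarray(phi, dtype=float) + 180.0) % 360.0 - 180.0
    psi = (np.asarray(psi, dtype=float) + 180.0) % 360.0 - 180.0
    hist, _, _ = np.histogram2d(
        phi, psi, bins=bins, range=[[-180.0, 180.0], [-180.0, 180.0]]
    )
    total = hist.sum()
    # Normalize to a probability density over the grid (leave empty input as zeros).
    return hist / total if total > 0 else hist


# Synthetic angles loosely centred on the alpha-helical region (assumed data).
rng = np.random.default_rng(0)
phi = rng.normal(-60.0, 15.0, size=1000)
psi = rng.normal(-45.0, 15.0, size=1000)

density = ramachandran_density(phi, psi)
features = density.ravel()  # flat vector that could be fed to an auto-encoder
```

In such a setup, comparing the density of a variant against that of the wild type would be one way to expose shifts in backbone conformation, though the actual features used by the authors may differ.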