A machine learning and directed network optimization approach to uncover TP53 regulatory patterns
Charalampos P. Triantafyllidis,
Alessandro Barberis,
Fiona Hartley,
Ana Miar Cuervo,
Enio Gjerga,
Philip Charlton,
Linda van Bijsterveldt,
Julio Saez Rodriguez,
Francesca M. Buffa
Affiliations
Charalampos P. Triantafyllidis
Department of Oncology, Medical Sciences Division, University of Oxford, Oxford, UK; Department of Epidemiology & Biostatistics, School of Public Health, Imperial College London, London, UK; Corresponding author
Alessandro Barberis
Department of Oncology, Medical Sciences Division, University of Oxford, Oxford, UK; Nuffield Department of Surgical Sciences, University of Oxford, Oxford, UK
Fiona Hartley
Department of Oncology, Medical Sciences Division, University of Oxford, Oxford, UK
Ana Miar Cuervo
Department of Oncology, Medical Sciences Division, University of Oxford, Oxford, UK
Enio Gjerga
Heidelberg University, Faculty of Medicine, Heidelberg University Hospital, Institute for Computational Biomedicine, Heidelberg, Germany
Philip Charlton
Department of Oncology, Medical Sciences Division, University of Oxford, Oxford, UK
Linda van Bijsterveldt
MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge, UK
Julio Saez Rodriguez
Heidelberg University, Faculty of Medicine, Heidelberg University Hospital, Institute for Computational Biomedicine, Heidelberg, Germany
Francesca M. Buffa
Department of Oncology, Medical Sciences Division, University of Oxford, Oxford, UK; Department of Computing Sciences, BIDSA, Bocconi University, Milan, Italy; Corresponding author
Summary: TP53, the Guardian of the Genome, is the most frequently mutated gene in human cancers, and the functional characterization of its regulation is fundamental. To address this, we employ two strategies: machine learning to predict the mutation status of TP53 from transcriptomic data, and directed regulatory networks to reconstruct the effect of mutations on the transcript levels of TP53 targets. Using data from established databases (Cancer Cell Line Encyclopedia, The Cancer Genome Atlas), machine learning could predict the mutation status but could not distinguish between different mutations. In contrast, directed network optimization allowed us to infer the TP53 regulatory profile across (1) mutations, (2) irradiation in lung cancer, and (3) hypoxia in breast cancer. We observed differential regulatory profiles dictated by (1) mutation type, (2) deleterious consequences of the mutation, (3) known hotspots, (4) protein changes, and (5) stress condition (irradiation/hypoxia). This is an important first step toward using regulatory networks to characterize the functional consequences of mutations, and the approach could be extended to other perturbations, with implications for drug design and precision medicine.