M&M: an RNA-seq based pan-cancer classifier for paediatric tumours
Fleur S.A. Wallis,
John L. Baker-Hernandez,
Marc van Tuil,
Claudia van Hamersveld,
Marco J. Koudijs,
Eugène T.P. Verwiel,
Alex Janse,
Laura S. Hiemcke-Jiwa,
Ronald R. de Krijger,
Mariëtte E.G. Kranendonk,
Marijn A. Vermeulen,
Pieter Wesseling,
Uta E. Flucke,
Valérie de Haas,
Maaike Luesink,
Eelco W. Hoving,
Josef H. Vormoor,
Max M. van Noesel,
Jayne Y. Hehir-Kwa,
Bastiaan B.J. Tops,
Patrick Kemmeren,
Lennart A. Kester
Affiliations
Fleur S.A. Wallis
Princess Máxima Center for Paediatric Oncology, Utrecht, the Netherlands
John L. Baker-Hernandez
Princess Máxima Center for Paediatric Oncology, Utrecht, the Netherlands
Marc van Tuil
Princess Máxima Center for Paediatric Oncology, Utrecht, the Netherlands
Claudia van Hamersveld
Princess Máxima Center for Paediatric Oncology, Utrecht, the Netherlands
Marco J. Koudijs
Princess Máxima Center for Paediatric Oncology, Utrecht, the Netherlands
Eugène T.P. Verwiel
Princess Máxima Center for Paediatric Oncology, Utrecht, the Netherlands
Alex Janse
Princess Máxima Center for Paediatric Oncology, Utrecht, the Netherlands
Laura S. Hiemcke-Jiwa
Princess Máxima Center for Paediatric Oncology, Utrecht, the Netherlands; Department of Pathology, UMC Utrecht, Utrecht, the Netherlands
Ronald R. de Krijger
Princess Máxima Center for Paediatric Oncology, Utrecht, the Netherlands; Department of Pathology, UMC Utrecht, Utrecht, the Netherlands
Mariëtte E.G. Kranendonk
Princess Máxima Center for Paediatric Oncology, Utrecht, the Netherlands
Marijn A. Vermeulen
Princess Máxima Center for Paediatric Oncology, Utrecht, the Netherlands
Pieter Wesseling
Princess Máxima Center for Paediatric Oncology, Utrecht, the Netherlands; Department of Pathology, Amsterdam University Medical Centres/VUmc, Amsterdam, the Netherlands
Uta E. Flucke
Princess Máxima Center for Paediatric Oncology, Utrecht, the Netherlands
Valérie de Haas
Princess Máxima Center for Paediatric Oncology, Utrecht, the Netherlands
Maaike Luesink
Princess Máxima Center for Paediatric Oncology, Utrecht, the Netherlands
Eelco W. Hoving
Princess Máxima Center for Paediatric Oncology, Utrecht, the Netherlands
Josef H. Vormoor
Princess Máxima Center for Paediatric Oncology, Utrecht, the Netherlands; Utrecht Cancer Center, UMC Utrecht, Utrecht, the Netherlands
Max M. van Noesel
Princess Máxima Center for Paediatric Oncology, Utrecht, the Netherlands; Division Imaging & Cancer, UMC Utrecht, Utrecht, the Netherlands
Jayne Y. Hehir-Kwa
Princess Máxima Center for Paediatric Oncology, Utrecht, the Netherlands
Bastiaan B.J. Tops
Princess Máxima Center for Paediatric Oncology, Utrecht, the Netherlands
Patrick Kemmeren
Princess Máxima Center for Paediatric Oncology, Utrecht, the Netherlands; Center for Molecular Medicine, UMC Utrecht & Utrecht University, Utrecht, the Netherlands; Corresponding author. Princess Máxima Center for Paediatric Oncology, Heidelberglaan 25, 3584CS Utrecht, the Netherlands.
Lennart A. Kester
Princess Máxima Center for Paediatric Oncology, Utrecht, the Netherlands; Corresponding author.
Summary
Background: With many rare tumour types, reaching the correct diagnosis is a challenging but crucial process in paediatric oncology. Historically, diagnosis has been based on the histology and morphology of the disease. However, advances in genome-wide profiling techniques such as RNA sequencing now allow the development of molecular classification tools.
Methods: Here, we present M&M, a pan-paediatric cancer ensemble-based machine learning algorithm tailored towards inclusion of rare tumour types.
Findings: The RNA-seq-based algorithm can classify 52 different tumour types (precision ∼99%, recall ∼80%), plus the underlying 96 tumour subtypes (precision ∼96%, recall ∼70%). For low-confidence classifications, a comparable precision is achieved when including the three highest-scoring labels. We then validated M&M on an internal dataset (precision 99%, recall 76%) and an external dataset from the KidsFirst initiative (precision 98%, recall 77%). Finally, we show that M&M performs similarly to existing disease- or domain-specific classification algorithms based on RNA sequencing or methylation data.
Interpretation: M&M's pan-cancer setup allows for easy clinical implementation, requiring only one classifier for all incoming diagnostic samples, including samples from different tumour stages and treatment statuses. At the same time, its performance is comparable to that of existing tumour- and tissue-specific classifiers. The introduction of an extensive pan-cancer classifier in diagnostics has the potential to increase diagnostic accuracy for many paediatric cancer cases, thereby contributing towards optimal patient survival and quality of life.
Funding: Financial support was provided by the Foundation Children Cancer Free (KiKa core funding) and Adessium Foundation.
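The precision and recall figures in the Findings, and the use of the three highest-scoring labels for low-confidence classifications, can be illustrated with a minimal sketch. It assumes, since the summary does not spell this out, that precision is computed over confidently classified samples only and recall over all samples. The function names, the confidence threshold of 0.8, and the toy scores are hypothetical and are not taken from M&M itself.

```python
from typing import Dict, List, Tuple

# One sample = (true label, per-label scores from some classifier).
Sample = Tuple[str, Dict[str, float]]


def top_k(scores: Dict[str, float], k: int) -> List[str]:
    """Return the k highest-scoring labels, best first."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [label for label, _ in ranked[:k]]


def precision_recall(samples: List[Sample], threshold: float) -> Tuple[float, float]:
    """Assumed definitions: precision over confident calls, recall over all samples."""
    confident = 0
    confident_correct = 0
    for true_label, scores in samples:
        best = top_k(scores, 1)[0]
        if scores[best] >= threshold:  # counts as a high-confidence classification
            confident += 1
            if best == true_label:
                confident_correct += 1
    precision = confident_correct / confident if confident else 0.0
    recall = confident_correct / len(samples) if samples else 0.0
    return precision, recall


def top3_hit_rate_low_confidence(samples: List[Sample], threshold: float) -> float:
    """Fraction of low-confidence samples whose true label is among the top three."""
    low = [(t, s) for t, s in samples if max(s.values()) < threshold]
    if not low:
        return 0.0
    hits = sum(1 for t, s in low if t in top_k(s, 3))
    return hits / len(low)


# Toy scores, invented purely for illustration.
SAMPLES: List[Sample] = [
    ("neuroblastoma", {"neuroblastoma": 0.95, "wilms": 0.03, "ewing": 0.01, "medulloblastoma": 0.01}),
    ("wilms", {"wilms": 0.90, "neuroblastoma": 0.05, "ewing": 0.03, "medulloblastoma": 0.02}),
    ("ewing", {"neuroblastoma": 0.40, "ewing": 0.30, "wilms": 0.20, "medulloblastoma": 0.10}),
    ("medulloblastoma", {"neuroblastoma": 0.40, "wilms": 0.30, "ewing": 0.20, "medulloblastoma": 0.10}),
]

THRESHOLD = 0.8  # hypothetical cut-off, not a value from the paper

precision, recall = precision_recall(SAMPLES, THRESHOLD)
top3_rate = top3_hit_rate_low_confidence(SAMPLES, THRESHOLD)
```

Under these assumed definitions, raising the threshold tends to trade recall for precision, which is one way to read the high precision and lower recall reported above.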