PLoS Computational Biology (Jun 2018)

CancerInSilico: An R/Bioconductor package for combining mathematical and statistical modeling to simulate time course bulk and single cell gene expression data in cancer.

  • Thomas D Sherman,
  • Luciane T Kagohara,
  • Raymon Cao,
  • Raymond Cheng,
  • Matthew Satriano,
  • Michael Considine,
  • Gabriel Krigsfeld,
  • Ruchira Ranaweera,
  • Yong Tang,
  • Sandra A Jablonski,
  • Genevieve Stein-O'Brien,
  • Daria A Gaykalova,
  • Louis M Weiner,
  • Christine H Chung,
  • Elana J Fertig

DOI
https://doi.org/10.1371/journal.pcbi.1006935
Journal volume & issue
Vol. 14, no. 4
p. e1006935

Abstract

Bioinformatics techniques to analyze time-course bulk and single-cell omics data are advancing. The absence of a known ground truth for the dynamics of molecular changes makes it difficult to benchmark the performance of these techniques on real data. Realistic simulated time-course datasets are therefore essential to assess the performance of time-course bioinformatics algorithms. We develop an R/Bioconductor package, CancerInSilico, to simulate bulk and single-cell transcriptional data from a known ground truth obtained from mathematical models of cellular systems. This package contains a general R infrastructure for running cell-based models and simulating gene expression data based on the model states. We show how to use this package to simulate a gene expression data set and then benchmark analysis methods on this data set against the known ground truth. The package is freely available via Bioconductor: http://bioconductor.org/packages/CancerInSilico/.
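
The workflow described in the abstract (run a mathematical model, derive expression data from its states, then benchmark an analysis against the known truth) can be illustrated with a small toy sketch. This is not the CancerInSilico API, which is an R package, and it is not the authors' model. It is a Python illustration under stated assumptions: logistic growth stands in for the cell-based model, and a single hypothetical "growth pathway" drives noisy expression of a few genes. All function names and parameter values below are invented for illustration.

```python
import math
import random


def simulate_cell_counts(n_steps, n0=10.0, capacity=1000.0, rate=0.3):
    """Toy ground truth: discrete logistic growth of a cell population.

    Assumption: this stands in for a cell-based model; it is not the
    model used by CancerInSilico.
    """
    counts = [n0]
    for _ in range(n_steps - 1):
        n = counts[-1]
        counts.append(n + rate * n * (1.0 - n / capacity))
    return counts


def pathway_activity(counts):
    """Hypothetical pathway activity: per-cell growth rate scaled to [0, 1]."""
    growth = [(b - a) / a for a, b in zip(counts, counts[1:])]
    growth.append(growth[-1])  # repeat the last value so lengths match
    top = max(growth)
    return [g / top for g in growth]


def simulate_expression(activity, n_genes=5, base=20.0, gain=80.0, seed=0):
    """Noisy expression for genes in the pathway, one row per time point.

    Assumption: mean expression is linear in activity, with Gaussian
    noise whose variance equals the mean, truncated at zero.
    """
    rng = random.Random(seed)
    rows = []
    for a in activity:
        mean = base + gain * a
        rows.append(
            [max(0.0, rng.gauss(mean, math.sqrt(mean))) for _ in range(n_genes)]
        )
    return rows


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Simulate, then benchmark a trivial "analysis method" (the mean over
# pathway genes) against the known ground-truth activity.
counts = simulate_cell_counts(n_steps=30)
truth = pathway_activity(counts)
expression = simulate_expression(truth)
estimate = [sum(row) / len(row) for row in expression]
score = pearson(estimate, truth)
print(f"correlation with ground truth: {score:.3f}")
```

Because the pathway activity is known at every time point, any method that tries to recover it from the simulated expression matrix can be scored directly, which is the benchmarking idea the abstract describes.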