Life (Apr 2022)

The R Language: An Engine for Bioinformatics and Data Science

  • Federico M. Giorgi
  • Carmine Ceraolo
  • Daniele Mercatelli

DOI
https://doi.org/10.3390/life12050648
Journal volume & issue
Vol. 12, no. 5
p. 648

Abstract

The R programming language is approaching its 30th birthday, and in the last three decades it has achieved a prominent role in statistics, bioinformatics, and data science in general. It currently ranks among the top 10 most popular programming languages worldwide, and its community has produced tens of thousands of extensions and packages, with scopes ranging from machine learning to transcriptome data analysis. In this review, we provide a historical chronicle of how R became what it is today and describe its current features and capabilities. We illustrate the major tools of R, such as the current R editors and integrated development environments (IDEs), the R Shiny web server, and the R methods for machine learning, as well as R's relationship with other programming languages. We also discuss the role of R in science as a driver for reproducibility. Overall, we hope to provide both a complete snapshot of R today and a practical compendium of the major features and applications of this programming language.

Keywords