Computational and Structural Biotechnology Journal (Jan 2021)

A primer on machine learning techniques for genomic applications

  • Alfonso Monaco,
  • Ester Pantaleo,
  • Nicola Amoroso,
  • Antonio Lacalamita,
  • Claudio Lo Giudice,
  • Adriano Fonzino,
  • Bruno Fosso,
  • Ernesto Picardi,
  • Sabina Tangaro,
  • Graziano Pesole,
  • Roberto Bellotti

Journal volume & issue
Vol. 19
pp. 4345 – 4359

Abstract

High-throughput sequencing technologies have enabled the study of complex biological aspects at single-nucleotide resolution, opening the big data era. The analysis of large volumes of heterogeneous “omic” data, however, requires novel and efficient computational algorithms based on the paradigm of Artificial Intelligence. In the present review, we introduce and describe the most common machine learning methodologies and, more recently, deep learning methodologies applied to a variety of genomics tasks, emphasizing their capabilities, strengths and limitations in simple and intuitive language. We highlight the power of the machine learning approach in handling big data by means of a real-life example, and underline how the described methods could be relevant wherever large amounts of multimodal genomic data are available.

Keywords