Mathematics (Mar 2021)

GASVeM: A New Machine Learning Methodology for Multi-SNP Analysis of GWAS Data Based on Genetic Algorithms and Support Vector Machines

  • Fidel Díez Díaz,
  • Fernando Sánchez Lasheras,
  • Víctor Moreno,
  • Ferran Moratalla-Navarro,
  • Antonio José Molina de la Torre,
  • Vicente Martín Sánchez

DOI: https://doi.org/10.3390/math9060654
Journal volume & issue: Vol. 9, no. 6, p. 654

Abstract

Genome-wide association studies (GWAS) are observational studies that examine a large set of genetic variants across a sample of individuals to determine whether any of these variants is linked to a particular trait. In the last two decades, GWAS have contributed to several new discoveries in the field of genetics. This research presents a novel methodology that can be applied to GWAS data. It is mainly based on two machine learning techniques: genetic algorithms and support vector machines. The database employed for the study contained information on 370,750 single-nucleotide polymorphisms (SNPs) from 1076 colorectal cancer cases and 973 controls. Ten pathways with different degrees of relationship with the trait under study were tested. The results show that the proposed methodology is able to detect pathways relevant to a given trait, in this case colorectal cancer.
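
The abstract does not describe how the genetic algorithm and the support vector machine are combined, so the following is only a minimal sketch of one common pattern and not the GASVeM procedure itself: a genetic algorithm searches over subsets of SNPs, and each candidate subset is scored by the hold-out accuracy of a linear SVM trained on it. Everything here is an assumption made for illustration: the function names (`make_toy_data`, `train_linear_svm`, `run_ga`), the Pegasos-style training loop, the GA settings, and the synthetic genotypes coded 0/1/2, which stand in for real case-control data.

```python
import random


def make_toy_data(n=120, n_feat=10, seed=1):
    """Synthetic genotypes (0/1/2); only features 0 and 1 influence the label."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        row = [float(rng.choice([0, 1, 2])) for _ in range(n_feat)]
        score = row[0] + row[1] - 2.0 + rng.gauss(0, 0.3)
        X.append(row)
        y.append(1 if score > 0 else -1)  # +1 = "case", -1 = "control"
    return X, y


def train_linear_svm(X, y, lam=0.01, epochs=20):
    """Pegasos-style sub-gradient descent on the hinge loss (linear SVM)."""
    rng = random.Random(0)  # fixed seed so the fitness is deterministic
    w, b, t = [0.0] * len(X[0]), 0.0, 0
    idx = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(idx)
        for i in idx:
            t += 1
            eta = 1.0 / (lam * t)
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            w = [wj * (1 - eta * lam) for wj in w]  # regularisation shrink
            if margin < 1:  # sample violates the margin: hinge-loss step
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
                b += eta * y[i]
    return w, b


def accuracy(w, b, X, y):
    hits = sum(1 for xi, yi in zip(X, y)
               if (sum(wj * xj for wj, xj in zip(w, xi)) + b >= 0) == (yi > 0))
    return hits / len(y)


def fitness(mask, X_tr, y_tr, X_te, y_te):
    """Hold-out accuracy of an SVM restricted to the features selected by mask."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty subset cannot classify anything
    sub_tr = [[row[j] for j in cols] for row in X_tr]
    sub_te = [[row[j] for j in cols] for row in X_te]
    w, b = train_linear_svm(sub_tr, y_tr)
    return accuracy(w, b, sub_te, y_te)


def run_ga(X_tr, y_tr, X_te, y_te, pop_size=12, generations=8, p_mut=0.05, seed=0):
    """Truncation selection, one-point crossover, bit-flip mutation."""
    rng = random.Random(seed)
    n_feat = len(X_tr[0])
    pop = [[rng.random() < 0.5 for _ in range(n_feat)] for _ in range(pop_size)]
    best_mask, best_fit = None, -1.0
    for _ in range(generations):
        scored = sorted(((fitness(m, X_tr, y_tr, X_te, y_te), m) for m in pop),
                        key=lambda s: s[0], reverse=True)
        if scored[0][0] > best_fit:
            best_fit, best_mask = scored[0]
        parents = [m for _, m in scored[:pop_size // 2]]  # keep the fitter half
        children = []
        while len(children) < pop_size - len(parents):
            a, c = rng.sample(parents, 2)
            cut = rng.randrange(1, n_feat)
            child = a[:cut] + c[cut:]
            children.append([(not g) if rng.random() < p_mut else g for g in child])
        pop = parents + children
    return best_mask, best_fit


X, y = make_toy_data()
X_tr, y_tr, X_te, y_te = X[:80], y[:80], X[80:], y[80:]
best_mask, best_fit = run_ga(X_tr, y_tr, X_te, y_te)
print("selected features:", [j for j, keep in enumerate(best_mask) if keep])
print("hold-out accuracy:", best_fit)
```

In a real analysis the single hold-out split would normally be replaced by cross-validation, and a dataset of the size reported above would call for an optimized SVM implementation rather than this pure-Python loop.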

Keywords