Automated quantitative trait locus analysis (AutoQTL)

Philip J. Freda; Attri Ghosh; Elizabeth Zhang; Tianhao Luo; Apurva S. Chitre; Oksana Polesskaya; Celine L. St. Pierre; Jianjun Gao; Connor D. Martin; Hao Chen; Angel G. Garcia-Martinez; Tengfei Wang; Wenyan Han; Keita Ishiwari; Paul Meyer; Alexander Lamparelli; Christopher P. King; Abraham A. Palmer; Ruowang Li; Jason H. Moore

doi:10.1186/s13040-023-00331-3

BioData Mining (Apr 2023)

Affiliations

Philip J. Freda: Department of Computational Biomedicine, Cedars-Sinai Medical Center
Attri Ghosh: Department of Computational Biomedicine, Cedars-Sinai Medical Center
Elizabeth Zhang: Department of Computational Biomedicine, Cedars-Sinai Medical Center
Tianhao Luo: Department of Computational Biomedicine, Cedars-Sinai Medical Center
Apurva S. Chitre: Department of Psychiatry, University of California San Diego
Oksana Polesskaya: Department of Psychiatry, University of California San Diego
Celine L. St. Pierre: Department of Psychiatry, University of California San Diego
Jianjun Gao: Department of Psychiatry, University of California San Diego
Connor D. Martin: Department of Pharmacology & Toxicology, Jacobs School of Medicine and Biomedical Sciences, University at Buffalo
Hao Chen: Department of Pharmacology, Addiction Science, and Toxicology, University of Tennessee Health Science Center, Translational Research Building
Angel G. Garcia-Martinez: Department of Pharmacology, Addiction Science, and Toxicology, University of Tennessee Health Science Center, Translational Research Building
Tengfei Wang: Department of Pharmacology, Addiction Science, and Toxicology, University of Tennessee Health Science Center, Translational Research Building
Wenyan Han: Department of Pharmacology, Addiction Science, and Toxicology, University of Tennessee Health Science Center, Translational Research Building
Keita Ishiwari: Department of Pharmacology & Toxicology, Jacobs School of Medicine and Biomedical Sciences, University at Buffalo
Paul Meyer: Department of Psychology, University at Buffalo, 204 Park Hall, North Campus
Alexander Lamparelli: Department of Psychology, University at Buffalo, 204 Park Hall, North Campus
Christopher P. King: Department of Psychology, University at Buffalo, 204 Park Hall, North Campus
Abraham A. Palmer: Department of Psychiatry, University of California San Diego
Ruowang Li: Department of Computational Biomedicine, Cedars-Sinai Medical Center
Jason H. Moore: Department of Computational Biomedicine, Cedars-Sinai Medical Center

Journal volume & issue: Vol. 16, no. 1, pp. 1–24

Abstract


Background: Quantitative trait locus (QTL) analysis and genome-wide association studies (GWAS) have the power to identify variants that capture significant levels of phenotypic variance in complex traits. However, effort and time are required to select the best methods and to optimize parameters and pre-processing steps. Although machine learning approaches have been shown to greatly assist in optimization and data processing, applying them to QTL analysis and GWAS is challenging due to the complexity of large, heterogeneous datasets. Here, we describe a proof of concept for an automated machine learning approach, AutoQTL, which can automate many complicated decisions related to the analysis of complex traits and generate solutions that describe relationships in genetic data.

Results: Using a publicly available dataset of 18 putative QTL from a large-scale GWAS of body mass index in the laboratory rat, Rattus norvegicus, AutoQTL captures the phenotypic variance explained under a standard additive model. AutoQTL also detects evidence of non-additive effects, including deviations from additivity and 2-way epistatic interactions, in simulated data via multiple optimal solutions. Additionally, feature importance metrics provide different insights into the inheritance models and predictive power of multiple GWAS-derived putative QTL.

Conclusions: This proof of concept illustrates that automated machine learning techniques can complement standard approaches and have the potential to detect both additive and non-additive effects via various optimal solutions and feature importance metrics. In the future, we aim to expand AutoQTL to accommodate omics-level datasets with intelligent feature selection and feature engineering strategies.
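To make the distinction between a standard additive model and a deviation from additivity concrete, the following is a minimal sketch in plain Python. It is not AutoQTL code and does not use its API: the encoding table, the simulated locus, the noise level, and all function names are hypothetical and chosen only for illustration. It simulates a single locus whose true effect is dominant and compares how much phenotypic variance each candidate genotype encoding explains in a one-predictor linear fit.

```python
import random

# Hypothetical genotype encodings (not taken from AutoQTL):
# minor-allele count 0/1/2 mapped to the value used in the linear model.
ENCODINGS = {
    "additive": {0: 0, 1: 1, 2: 2},
    "dominant": {0: 0, 1: 1, 2: 1},
    "recessive": {0: 0, 1: 0, 2: 1},
}


def r_squared(x, y):
    """Variance in y explained by a one-predictor linear fit (squared Pearson r)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


def simulate(n=2000, seed=0):
    """Simulate one locus with a truly dominant effect plus Gaussian noise (assumed values)."""
    rng = random.Random(seed)
    # Genotype frequencies of roughly 1:2:1, as for an allele frequency of 0.5.
    genotypes = [rng.choice([0, 1, 1, 2]) for _ in range(n)]
    phenotype = [ENCODINGS["dominant"][g] + rng.gauss(0, 0.5) for g in genotypes]
    return genotypes, phenotype


def variance_explained_by_encoding(genotypes, phenotype):
    """Return R^2 for each candidate encoding of the same genotypes."""
    return {
        name: r_squared([encoding[g] for g in genotypes], phenotype)
        for name, encoding in ENCODINGS.items()
    }


genotypes, phenotype = simulate()
scores = variance_explained_by_encoding(genotypes, phenotype)
best = max(scores, key=scores.get)
for name, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{name:10s} R^2 = {score:.3f}")
print("best-fitting encoding:", best)
```

In this toy setting the additive encoding still captures part of the signal, but the encoding that matches the simulated inheritance model explains more variance, which is the kind of gap a search over alternative models can surface.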

Published in BioData Mining

ISSN: 1756-0381 (Online)
Publisher: BMC
Country of publisher: United Kingdom
LCC subjects: Medicine: Medicine (General): Computer applications to medicine. Medical informatics; Science: Mathematics: Analysis
Website: https://biodatamining.biomedcentral.com/
