Quick and effective approximation of in silico saturation mutagenesis experiments with first-order Taylor expansion

Alexander Sasse; Maria Chikina; Sara Mostafavi

iScience (Sep 2024)

Affiliations

Alexander Sasse: Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA 98195, USA
Maria Chikina: Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, PA 16354, USA; Corresponding author
Sara Mostafavi: Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA 98195, USA; Canadian Institute for Advanced Research, Toronto, ON MG5 1ZB, Canada; Corresponding author

Journal volume & issue: Vol. 27, no. 9
p. 110807

Abstract

Summary: To understand the decision process of genomic sequence-to-function models, explainable AI algorithms determine the importance of each nucleotide in a given input sequence to the model’s predictions and enable discovery of cis-regulatory motifs for gene regulation. The most commonly applied method is in silico saturation mutagenesis (ISM) because its per-nucleotide importance scores can be intuitively understood as the computational counterpart to in vivo saturation mutagenesis experiments. While ISM is highly interpretable, it is computationally challenging to perform for many sequences, and becomes prohibitive as the length of the input sequences and the size of the model grow. Here, we use the first-order Taylor approximation to approximate ISM values from the model’s gradient, which reduces its computation cost to a single forward pass for an input sequence. We show that the Taylor ISM (TISM) approximation is robust across different model ablations, random initializations, training parameters, and dataset sizes.
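
To make the idea in the summary concrete, here is a minimal sketch, not the authors' implementation. For a one-hot input x, the first-order Taylor expansion gives f(x_mut) - f(x) ≈ grad · (x_mut - x), which for a single-nucleotide change at position l from reference base b0 to base b reduces to grad[l, b] - grad[l, b0]. The toy model, its weights `w`, and all function names below are hypothetical and chosen only for illustration. In a real sequence-to-function model the gradient would come from automatic differentiation rather than from a hand-written formula.

```python
import numpy as np

def one_hot(seq):
    """Encode a DNA string as an (L, 4) one-hot array in A, C, G, T order."""
    idx = np.array(["ACGT".index(c) for c in seq])
    x = np.zeros((len(seq), 4))
    x[np.arange(len(seq)), idx] = 1.0
    return x

def model(x, w):
    """Hypothetical toy model: a smooth scalar function of the one-hot input."""
    return np.tanh(np.sum(w * x))

def model_grad(x, w):
    """Analytic gradient of the toy model with respect to x, shape (L, 4)."""
    return (1.0 - np.tanh(np.sum(w * x)) ** 2) * w

def ism(x, w):
    """Exact ISM: one model evaluation per single-nucleotide variant."""
    ref = model(x, w)
    out = np.zeros_like(x)
    for l in range(x.shape[0]):
        for b in range(4):
            x_mut = x.copy()
            x_mut[l] = 0.0       # remove the reference base
            x_mut[l, b] = 1.0    # insert the variant base
            out[l, b] = model(x_mut, w) - ref
    return out

def tism(x, w):
    """Taylor ISM: grad[l, b] minus the gradient at the reference base of l."""
    g = model_grad(x, w)
    g_ref = np.sum(g * x, axis=1, keepdims=True)  # gradient at reference bases
    return g - g_ref

rng = np.random.default_rng(0)
x = one_hot("ACGTTGCAAC")
w = 0.1 * rng.standard_normal(x.shape)  # assumed random weights for the toy model

exact, approx = ism(x, w), tism(x, w)
corr = np.corrcoef(exact.ravel(), approx.ravel())[0, 1]
print(f"Pearson correlation between ISM and TISM: {corr:.3f}")
```

In this sketch, exact ISM evaluates the model once per variant (4 × L evaluations here), while the Taylor version needs only one gradient at the reference sequence. Both are zero at the reference bases by construction, and how closely they agree elsewhere depends on how nonlinear the model is around the input.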

Published in iScience

ISSN: 2589-0042 (Online)
Publisher: Elsevier
Country of publisher: United States
LCC subjects: Science
Website: http://www.cell.com/iscience/home
