Efficient implied alignment

Alex J. Washburn; Ward C. Wheeler

doi:10.1186/s12859-020-03595-2

BMC Bioinformatics (Jul 2020)

Efficient implied alignment

Alex J. Washburn,
Ward C. Wheeler

Affiliations

Alex J. Washburn: Division of Invertebrate Zoology, American Museum of Natural History
Ward C. Wheeler: Division of Invertebrate Zoology, American Museum of Natural History

DOI: https://doi.org/10.1186/s12859-020-03595-2
Journal volume & issue: Vol. 21, no. 1
pp. 1 – 18

Abstract

Read online

Abstract Background Given a binary tree T $\mathcal {T}$ of n leaves, each leaf labeled by a string of length at most k, and a binary string alignment function ⊗, an implied alignment can be generated to describe the alignment of a dynamic homology for T $\mathcal {T}$ . This is done by first decorating each node of T $\mathcal {T}$ with an alignment context using ⊗, in a post-order traversal, then, during a subsequent pre-order traversal, inferring on which edges insertion and deletion events occurred using those internal node decorations. Results Previous descriptions of the implied alignment algorithm suggest a technique of “back-propagation” with time complexity O k 2 ∗ n 2 $\mathcal {O}\left (k^{2} * n^{2}\right)$ . Here we describe an implied alignment algorithm with complexity O k ∗ n 2 $\mathcal {O}\left (k * n^{2}\right)$ . For well-behaved data, such as molecular sequences, the runtime approaches the best-case complexity of Ω(k∗n). Conclusions The reduction in the time complexity of the algorithm dramatically improves both its utility in generating multiple sequence alignments and its heuristic utility.

Published in BMC Bioinformatics

ISSN: 1471-2105 (Online)
Publisher: BMC
Country of publisher: United Kingdom
LCC subjects: Medicine: Medicine (General): Computer applications to medicine. Medical informatics; Science: Biology (General)
Website: http://www.biomedcentral.com/bmcbioinformatics/

About the journal

Abstract

Keywords