Algorithms for Molecular Biology (Nov 2019)
TMRS: an algorithm for computing the time to the most recent substitution event from a multiple alignment column
Abstract
Background: As the number of sequenced genomes grows, researchers have access to an increasingly rich source for discovering detailed evolutionary information. However, the computational technologies for inferring biologically important evolutionary events are not sufficiently developed.
Results: We present algorithms to estimate the evolutionary time ($t_{\text{MRS}}$) to the most recent substitution event from a multiple alignment column by using a probabilistic model of sequence evolution. As the confidence in estimated $t_{\text{MRS}}$ values varies depending on the gap fractions and nucleotide patterns of alignment columns, we also compute the standard deviation $\sigma$ of $t_{\text{MRS}}$ by using a dynamic programming algorithm. We identified a number of human genomic sites at which the last substitution could be placed with confidence between two speciation events in the human lineage. For a large fraction of such sites, the substitution occurred between the concestor nodes of Hominoidea and Euarchontoglires. We investigated the correlation between tissue-specific transcribed enhancers and the distribution of sites with specific substitution time intervals, and found that brain-specific transcribed enhancers are threefold enriched in the density of substitutions in the human lineage relative to expectations.
Conclusions: We have presented algorithms to estimate the evolutionary time ($t_{\text{MRS}}$) to the most recent substitution event from a multiple alignment column by using a probabilistic model of sequence evolution. Our algorithms will be useful for Evo-Devo studies, as they facilitate screening for potential genomic sites that have played an important role in the acquisition of unique biological features by target species.
Keywords