Algorithms for Molecular Biology (Aug 2019)

A branching process for homology distribution-based inference of polyploidy, speciation and loss

  • Yue Zhang,
  • Chunfang Zheng,
  • David Sankoff

DOI
https://doi.org/10.1186/s13015-019-0153-8
Journal volume & issue
Vol. 14, no. 1
pp. 1 – 13

Abstract

Read online

Abstract Background The statistical distribution of the similarity or difference between pairs of paralogous genes, created by whole genome doubling, or between pairs of orthologous genes in two related species is an important source of information about genomic evolution, especially in plants. Methods We derive the mixture of distributions of sequence similarity for duplicate gene pairs generated by repeated episodes of whole gene doubling. This involves integrating sequence divergence and gene pair loss through fractionation, using a branching process and a mutational model. We account not only for the timing of these events in terms of local modes, but also the amplitude and variance of the component distributions. This model is then extended to orthologous gene pairs. Results We apply the model and inference procedures to the evolution of the Solanaceae, focusing on the genomes of economically important crops. We assess how consistent or variable fractionation rates are from species to species and over time.

Keywords