PLoS Computational Biology (Mar 2023)

Combining phylogeny and coevolution improves the inference of interaction partners among paralogous proteins.

  • Carlos A Gandarilla-Pérez
  • Sergio Pinilla
  • Anne-Florence Bitbol
  • Martin Weigt

DOI
https://doi.org/10.1371/journal.pcbi.1011010
Journal volume & issue
Vol. 19, no. 3
p. e1011010

Abstract

Predicting protein-protein interactions from sequences is an important goal of computational biology, and various sources of information can be used to this end. Starting from the sequences of two interacting protein families, one can use phylogeny or residue coevolution to infer which paralogs are specific interaction partners within each species. We show that these two signals can be combined to improve the inference of interaction partners among paralogs. To do so, we first align the sequence-similarity graphs of the two families through simulated annealing, yielding a robust partial pairing. We then use this partial pairing to seed a coevolution-based iterative pairing algorithm. The combined method outperforms either method used alone. The improvement is most striking in the difficult cases where the average number of paralogs per species is large or where the total number of sequences is modest.
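
The graph-alignment step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that each family is summarized by a pairwise distance matrix (a stand-in for the sequence-similarity graphs), that sequence i of family A and sequence i of family B belong to the same species, and that both families have the same number of paralogs in each species. The names `pairing_cost` and `anneal_pairing`, the squared-difference cost, and the geometric cooling schedule are all hypothetical choices made for illustration.

```python
import math
import random


def pairing_cost(dist_a, dist_b, perm):
    """Mismatch between A-distances and the B-distances of the paired partners.

    perm[i] is the index of the family-B sequence paired with family-A sequence i.
    The squared-difference cost is an assumption, not the published objective.
    """
    n = len(perm)
    return sum(
        (dist_a[i][j] - dist_b[perm[i]][perm[j]]) ** 2
        for i in range(n)
        for j in range(i + 1, n)
    )


def anneal_pairing(dist_a, dist_b, species, steps=20000,
                   t_start=1.0, t_end=1e-3, seed=0):
    """Simulated annealing over within-species pairings (illustrative sketch)."""
    rng = random.Random(seed)
    n = len(species)

    # Group sequence indices by species; partners are only swapped within a species.
    groups = {}
    for idx, label in enumerate(species):
        groups.setdefault(label, []).append(idx)
    swappable = [g for g in groups.values() if len(g) > 1]

    # Start from a random pairing inside each species.
    perm = list(range(n))
    for g in groups.values():
        shuffled = g[:]
        rng.shuffle(shuffled)
        for i, j in zip(g, shuffled):
            perm[i] = j

    cost = pairing_cost(dist_a, dist_b, perm)
    best_perm, best_cost = perm[:], cost
    if not swappable:
        return best_perm, best_cost

    for step in range(steps):
        # Geometric cooling from t_start to t_end (an assumed schedule).
        temp = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))

        # Propose exchanging the partners of two paralogs of the same species.
        i, j = rng.sample(rng.choice(swappable), 2)
        perm[i], perm[j] = perm[j], perm[i]
        new_cost = pairing_cost(dist_a, dist_b, perm)
        delta = new_cost - cost

        # Metropolis rule: always accept improvements, sometimes accept worse moves.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            cost = new_cost
            if cost < best_cost:
                best_perm, best_cost = perm[:], cost
        else:
            perm[i], perm[j] = perm[j], perm[i]  # reject: undo the swap

    return best_perm, best_cost
```

Recomputing the full cost at every step keeps the sketch short but scales quadratically with the number of sequences; a practical version would update only the terms affected by a swap. One plausible way to obtain a partial pairing, again an assumption here, is to run the annealing with several seeds and keep only the pairs on which the runs agree, then pass those pairs as the starting set to a coevolution-based iterative pairing method.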