BMC Biology (Jul 2022)

Conserved and lineage-specific hypothetical proteins may have played a central role in the rise and diversification of major archaeal groups

  • Raphaël Méheust,
  • Cindy J. Castelle,
  • Alexander L. Jaffe,
  • Jillian F. Banfield

DOI
https://doi.org/10.1186/s12915-022-01348-6
Journal volume & issue
Vol. 20, no. 1
pp. 1 – 17

Abstract

Background: Archaea play fundamental roles in the environment, for example through methane production and consumption, ammonia oxidation, protein degradation, carbon compound turnover, and sulfur compound transformations. Recent genomic analyses have profoundly reshaped our understanding of the distribution and functionalities of Archaea and their roles in eukaryotic evolution.

Results: Here, 1179 representative genomes were selected from 3197 archaeal genomes. Clustering the representative genomes based on their content of 10,866 newly defined archaeal protein families (which will serve as a community resource) recapitulates archaeal phylogeny. We identified the co-occurring proteins that distinguish the major lineages. Those with metabolic roles were consistent with experimental data. However, two families specific to Asgard were determined to be new eukaryotic signature proteins. Overall, the blocks of lineage-specific families are dominated by proteins that lack functional predictions.

Conclusions: Given that these hypothetical proteins are nearly ubiquitous within major archaeal groups, we propose that they were important in the origin of most of the major archaeal lineages. Interestingly, although there were clearly phylum-specific co-occurring proteins, no such blocks of protein families were shared across superphyla, suggesting a burst-like origin of new lineages early in archaeal evolution.

Keywords