Frontiers in Microbiology (Jul 2020)

Batch-Learning Self-Organizing Map Identifies Horizontal Gene Transfer Candidates and Their Origins in Entire Genomes

  • Takashi Abe,
  • Yu Akazawa,
  • Atsushi Toyoda,
  • Atsushi Toyoda,
  • Hironori Niki,
  • Tomoya Baba,
  • Tomoya Baba

DOI
https://doi.org/10.3389/fmicb.2020.01486
Journal volume & issue
Vol. 11

Abstract

Read online

Horizontal gene transfer (HGT) has been widely suggested to play a critical role in the environmental adaptation of microbes; however, the number and origin of the genes in microbial genomes obtained through HGT remain unknown as the frequency of detected HGT events is generally underestimated, particularly in the absence of information on donor sequences. As an alternative to phylogeny-based methods that rely on sequence alignments, we have developed an alignment-free clustering method on the basis of an unsupervised neural network “Batch-Learning Self-Organizing Map (BLSOM)” in which sequence fragments are clustered based solely on oligonucleotide similarity without taxonomical information, to detect HGT candidates and their origin in entire genomes. By mapping the microbial genomic sequences on large-scale BLSOMs constructed with nearly all prokaryotic genomes, HGT candidates can be identified, and their origin assigned comprehensively, even for microbial genomes that exhibit high novelty. By focusing on two types of Alphaproteobacteria, specifically psychrotolerant Sphingomonas strains from an Antarctic lake, we detected HGT candidates using BLSOM and found higher proportions of HGT candidates from organisms belonging to Betaproteobacteria in the genomes of these two Antarctic strains compared with those of continental strains. Further, an origin difference was noted in the HGT candidates found in the two Antarctic strains. Although their origins were highly diversified, gene functions related to the cell wall or membrane biogenesis were shared among the HGT candidates. Moreover, analyses of amino acid frequency suggested that housekeeping genes and some HGT candidates of the Antarctic strains exhibited different characteristics to other continental strains. Lys, Ser, Thr, and Val were the amino acids found to be increased in the Antarctic strains, whereas Ala, Arg, Glu, and Leu were decreased. Our findings strongly suggest a low-temperature adaptation process for microbes that may have arisen convergently as an independent evolutionary strategy in each Antarctic strain. Hence, BLSOM analysis could serve as a powerful tool in not only detecting HGT candidates and their origins in entire genomes, but also in providing novel perspectives into the environmental adaptations of microbes.

Keywords