PLoS ONE (Jan 2020)
Deciphering the functional diversity of DNA-binding transcription factors in Bacteria and Archaea organisms.
Abstract
DNA-binding transcription factors (TFs) play a central role in the regulation of gene expression in prokaryotic organisms, and sequence-level similarities among them have been reported. The predicted abundance of these proteins varies with genome size: small genomes contain a low proportion of TFs, and large genomes contain a high proportion. In this work, we analyzed a collection of 668 experimentally validated TFs from 30 species belonging to diverse taxonomic classes, including Escherichia coli K-12, Bacillus subtilis 168, Corynebacterium glutamicum, and Streptomyces coelicolor, among others. This collection, together with 111 hidden Markov model profiles associated with DNA-binding TFs and collected from databases such as PFAM and DBD, was used to identify the repertoire of proteins putatively devoted to gene regulation in 1321 representative genomes of Archaea and Bacteria. The predicted regulatory proteins were subsequently analyzed in terms of their genomic context, which allowed the prediction of functions for the TFs and their neighboring genes, such as genes involved in virulence, enzymatic functions, phosphorylation mechanisms, and antibiotic resistance. The functional analysis by PFAM group showed that diverse functional categories were significantly enriched among the TFs and the proteins encoded by their neighboring genes. In particular, small-molecule binding and amino acid transmembrane transporter activities were associated with the LysR family, and cellular aromatic compound metabolic processes and responses to drugs, stress, or abiotic stimuli were associated with the MarR family. We expect that, as data derived from new technologies accumulate, novel TFs will be identified and will help improve the prediction of this class of proteins in complete genomes. The complete collection of experimentally characterized and predicted TFs is available at http://web.pcyt.unam.mx/EntrafDB/.
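The abstract does not state which statistical test underlies the phrase "significantly enriched". As an illustration only, the sketch below assumes a one-sided hypergeometric test, a common choice for functional enrichment, and is not necessarily the method used in the paper. The function name `enrichment_pvalue` and all counts are hypothetical.

```python
from math import comb


def enrichment_pvalue(population: int, annotated: int, sample: int, hits: int) -> float:
    """One-sided hypergeometric p-value, P(X >= hits).

    population: all proteins considered (background)
    annotated:  background proteins carrying the functional category
    sample:     proteins in the group of interest (e.g. neighbors of one TF family)
    hits:       proteins in the group that carry the category
    """
    if not (0 <= annotated <= population and 0 <= sample <= population):
        raise ValueError("annotated and sample must lie between 0 and population")
    if not (0 <= hits <= sample):
        raise ValueError("hits must lie between 0 and sample")

    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the probability mass for every outcome at least as extreme as `hits`.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Made-up counts, for illustration only (not taken from the paper).
    p = enrichment_pvalue(population=5000, annotated=200, sample=150, hits=20)
    print(f"p-value: {p:.3e}")
```

In practice the p-values for many categories would also need a multiple-testing correction, which this sketch omits.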