Microbial Cell Factories (May 2022)
Genome sequence and Carbohydrate Active Enzymes (CAZymes) repertoire of the thermophilic Caldicoprobacter algeriensis TH7C1T
Abstract
Background: Omics approaches, including whole-genome sequencing, are widely applied in biology to discover potential CAZymes. The aim of this study was to identify protein-encoding genes, including those encoding CAZymes, in order to understand the glycan-degrading machinery of the thermophilic Caldicoprobacter algeriensis strain TH7C1T.
Results: Caldicoprobacter algeriensis TH7C1T is a thermophilic anaerobic bacterium belonging to the Firmicutes phylum that grows at temperatures between 55 °C and 75 °C. Next-generation sequencing of the strain with Illumina technology yielded 45 contigs with an average GC content of 44.9% and a total length of 2,535,023 bp. Genome annotation revealed 2425 protein-coding genes, including 97 ORFs encoding CAZymes. Many glycoside hydrolase, carbohydrate esterase and glycosyltransferase genes were found linked to genes encoding oligosaccharide transporters and transcriptional regulators, suggesting that CAZyme-encoding genes are organized in clusters involved in polysaccharide degradation and transport. In-depth analysis of the CAZome of the C. algeriensis genome revealed 33 CAZyme gene clusters, uncovering new enzyme combinations targeting specific substrates.
Conclusions: This study is the first to target the CAZyme repertoire of C. algeriensis. It provides insight into the high potential of the identified enzymes for plant biomass degradation and into their biotechnological applications.
Keywords