Data in Brief (Jun 2024)
High-throughput metagenomic assessment of Cango Cave microbiome–A South African limestone cave
Abstract
Microorganisms inhabiting caves show medical or biotechnological promise, much of which has been attributed to traits such as antimicrobial activity or the induction of mineral precipitation. This dataset presents shotgun metagenomic sequencing of the Cango Cave microbial community in Oudtshoorn, South Africa. The study aimed to elucidate both the structure and function of the microbial community associated with the cave. DNA sequencing was conducted on the Illumina NovaSeq platform, a next-generation sequencing technology. The data comprise 4,738,604 sequences, with a cumulative size of 1,180,744,252 base pairs and a GC content of 52%. The metagenome sequence data can be accessed on NCBI under BioProject accession number PRJNA982691. Analysis on the online metagenomics server MG-RAST, using the subsystem database, showed that bacteria had the highest taxonomic representation, constituting about 98.66%. Archaea accounted for 0.05%, eukaryotes for 1.20%, viruses for 0.07%, and unclassified sequences for 0.02%. The most abundant phyla were Proteobacteria (81.74%), Bacteroidetes (10.57%), Actinobacteria (4.16%), Firmicutes (1.03%), Acidobacteria (0.20%), and Planctomycetes (0.16%). Functional annotation using subsystem analysis revealed that clustering-based subsystems accounted for 13.44%, while amino acids and derivatives comprised 11.41%. Carbohydrate-related sequences constituted 9.55%, alongside other beneficial functional traits essential for growth promotion and plant management.
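As a quick consistency check on the figures reported above, the following minimal Python sketch derives the mean sequence length from the sequence count and cumulative size, and sums the reported domain-level and phylum-level percentages. It is not part of the authors' workflow. All variable and function names are illustrative, and the Acidobacteria value is assumed to be a percentage like the others.

```python
# Figures copied from the abstract; all names here are illustrative only.
TOTAL_SEQUENCES = 4_738_604
TOTAL_BASE_PAIRS = 1_180_744_252

# Domain-level representation reported from MG-RAST (percent).
DOMAIN_PERCENT = {
    "Bacteria": 98.66,
    "Archaea": 0.05,
    "Eukaryota": 1.20,
    "Viruses": 0.07,
    "Unclassified": 0.02,
}

# Most abundant phyla (percent). The Acidobacteria value is
# assumed to be a percentage, like the other entries.
PHYLUM_PERCENT = {
    "Proteobacteria": 81.74,
    "Bacteroidetes": 10.57,
    "Actinobacteria": 4.16,
    "Firmicutes": 1.03,
    "Acidobacteria": 0.20,
    "Planctomycetes": 0.16,
}


def mean_sequence_length(total_bp: int, n_sequences: int) -> float:
    """Average number of base pairs per sequence."""
    return total_bp / n_sequences


def total_percent(shares: dict) -> float:
    """Sum of percentage shares, rounded to two decimals."""
    return round(sum(shares.values()), 2)


mean_len = mean_sequence_length(TOTAL_BASE_PAIRS, TOTAL_SEQUENCES)
domain_total = total_percent(DOMAIN_PERCENT)
phylum_total = total_percent(PHYLUM_PERCENT)

print(f"Mean sequence length: {mean_len:.1f} bp")
print(f"Domain-level total: {domain_total:.2f}%")
print(f"Listed phyla total: {phylum_total:.2f}%")
```

The domain-level shares add up to the full dataset, while the six listed phyla cover only part of it, as expected for a list of the most abundant groups.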