PCycDB: a comprehensive and accurate database for fast analysis of phosphorus cycling genes
Jiaxiong Zeng,
Qichao Tu,
Xiaoli Yu,
Lu Qian,
Cheng Wang,
Longfei Shu,
Fei Liu,
Shengwei Liu,
Zhijian Huang,
Jianguo He,
Qingyun Yan,
Zhili He
Affiliations
Jiaxiong Zeng
Environmental Microbiomics Research Center, School of Environmental Science and Engineering, Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), State Key Laboratory of Biocontrol, Sun Yat-sen University
Qichao Tu
Institute of Marine Science and Technology, Shandong University
Xiaoli Yu
Environmental Microbiomics Research Center, School of Environmental Science and Engineering, Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), State Key Laboratory of Biocontrol, Sun Yat-sen University
Lu Qian
Environmental Microbiomics Research Center, School of Environmental Science and Engineering, Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), State Key Laboratory of Biocontrol, Sun Yat-sen University
Cheng Wang
Environmental Microbiomics Research Center, School of Environmental Science and Engineering, Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), State Key Laboratory of Biocontrol, Sun Yat-sen University
Longfei Shu
Environmental Microbiomics Research Center, School of Environmental Science and Engineering, Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), State Key Laboratory of Biocontrol, Sun Yat-sen University
Fei Liu
Environmental Microbiomics Research Center, School of Environmental Science and Engineering, Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), State Key Laboratory of Biocontrol, Sun Yat-sen University
Shengwei Liu
Environmental Microbiomics Research Center, School of Environmental Science and Engineering, Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), State Key Laboratory of Biocontrol, Sun Yat-sen University
Zhijian Huang
Environmental Microbiomics Research Center, School of Environmental Science and Engineering, Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), State Key Laboratory of Biocontrol, Sun Yat-sen University
Jianguo He
Environmental Microbiomics Research Center, School of Environmental Science and Engineering, Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), State Key Laboratory of Biocontrol, Sun Yat-sen University
Qingyun Yan
Environmental Microbiomics Research Center, School of Environmental Science and Engineering, Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), State Key Laboratory of Biocontrol, Sun Yat-sen University
Zhili He
Environmental Microbiomics Research Center, School of Environmental Science and Engineering, Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), State Key Laboratory of Biocontrol, Sun Yat-sen University
Abstract
Background
Phosphorus (P) is one of the most essential macronutrients on the planet, and microorganisms (including bacteria and archaea) play a key role in P cycling across organisms and ecosystems. However, a comprehensive understanding of key P cycling genes (PCGs) and P cycling microorganisms (PCMs), as well as their ecological functions, remains elusive even with the rapid advancement of metagenome sequencing technologies. One of the major challenges is the lack of a comprehensive and accurately annotated P cycling functional gene database.
Results
In this study, we constructed a well-curated P cycling database (PCycDB) covering 139 gene families and 10 P metabolic processes, including several previously ignored PCGs such as pafA encoding phosphate-insensitive phosphatase, ptxABCD (phosphite-related genes), and novel aepXVWPS genes for 2-aminoethylphosphonate transporters. On simulated gene datasets, we achieved an annotation accuracy, positive predictive value (PPV), sensitivity, specificity, and negative predictive value (NPV) of 99.8%, 96.1%, 99.9%, 99.8%, and 99.9%, respectively. Compared to other orthology databases, PCycDB is more accurate, more comprehensive, and faster at profiling PCGs. We used PCycDB to analyze P cycling microbial communities from representative natural and engineered environments and showed that PCycDB is applicable to different environments.
Conclusions
We demonstrate that PCycDB is a powerful tool for advancing our understanding of microbially driven P cycling in the environment, offering high coverage, high accuracy, and rapid analysis of metagenome sequencing data. PCycDB is available at https://github.com/ZengJiaxiong/Phosphorus-cycling-database.
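To make the five evaluation metrics named in the Results concrete, the following minimal sketch computes them from confusion-matrix counts using their standard definitions. The class name, function name, and example counts are hypothetical illustrations; they are not part of PCycDB and do not reproduce the authors' evaluation pipeline or data.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Hypothetical container for annotation outcomes on a labeled test set."""
    tp: int  # sequences correctly annotated as a target gene family
    fp: int  # sequences wrongly annotated as a target gene family
    tn: int  # non-target sequences correctly left unannotated
    fn: int  # target sequences that were missed


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def annotation_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Compute the five metrics from their standard definitions."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "accuracy": _ratio(c.tp + c.tn, total),
        "ppv": _ratio(c.tp, c.tp + c.fp),          # positive predictive value
        "sensitivity": _ratio(c.tp, c.tp + c.fn),  # true positive rate
        "specificity": _ratio(c.tn, c.tn + c.fp),  # true negative rate
        "npv": _ratio(c.tn, c.tn + c.fn),          # negative predictive value
    }


if __name__ == "__main__":
    # Made-up counts for illustration only; not the paper's data.
    example = ConfusionCounts(tp=90, fp=10, tn=880, fn=20)
    for name, value in annotation_metrics(example).items():
        print(f"{name}: {value:.3f}")
```

Because PPV depends on how many false positives occur relative to true positives, it can be noticeably lower than the other four metrics when non-target sequences greatly outnumber target sequences in the test set.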