BMC Bioinformatics (Apr 2020)
CYPminer: an automated cytochrome P450 identification, classification, and data analysis tool for genome data sets across kingdoms
Abstract
Abstract Background Cytochrome P450 monooxygenases (termed CYPs or P450s) are hemoproteins ubiquitously found across all kingdoms, playing a central role in intracellular metabolism, especially in metabolism of drugs and xenobiotics. The explosive growth of genome sequencing brings a new set of challenges and issues for researchers, such as a systematic investigation of CYPs across all kingdoms in terms of identification, classification, and pan-CYPome analyses. Such investigation requires an automated tool that can handle an enormous amount of sequencing data in a timely manner. Results CYPminer was developed in the Python language to facilitate rapid, comprehensive analysis of CYPs from genomes of all kingdoms. CYPminer consists of two procedures i) to generate the Genome-CYP Matrix (GCM) that lists all occurrences of CYPs across the genomes, and ii) to perform analyses and visualization of the GCM, including pan-CYPomes (pan- and core-CYPome), CYP co-occurrence networks, CYP clouds, and genome clustering data. The performance of CYPminer was evaluated with three datasets from fungal and bacterial genome sequences. Conclusions CYPminer completes CYP analyses for large-scale genomes from all kingdoms, which allows systematic genome annotation and comparative insights for CYPs. CYPminer also can be extended and adapted easily for broader usage.
Keywords