Metabolites (Jul 2017)

Natural Product Discovery Using Planes of Principal Component Analysis in R (PoPCAR)

  • Shaurya Chanana,
  • Chris S. Thomas,
  • Doug R. Braun,
  • Yanpeng Hou,
  • Thomas P. Wyche,
  • Tim S. Bugni

DOI
https://doi.org/10.3390/metabo7030034
Journal volume & issue
Vol. 7, no. 3
p. 34

Abstract

Rediscovery of known natural products hinders the discovery of new, unique scaffolds. Efforts have mostly focused on streamlining the determination of what compounds are known vs. unknown (dereplication), but an alternative strategy is to focus on what is different. Utilizing statistics and assuming that common actinobacterial metabolites are likely known, focus can be shifted away from dereplication and towards discovery. LC-MS-based principal component analysis (PCA) provides a perfect tool to distinguish unique vs. common metabolites, but the variability inherent within natural products leads to datasets that do not fit ideal standards. To simplify the analysis of PCA models, we developed a script that identifies only those masses or molecules that are unique to each strain within a group, thereby greatly reducing the number of data points to be inspected manually. Since the script is written in R, it facilitates integration with other metabolomics workflows and supports automated mass matching to databases such as Antibase.

Keywords