Journal of Cheminformatics (Dec 2022)

Visualizing chemical space networks with RDKit and NetworkX

  • Vincent F. Scalfani,
  • Vishank D. Patel,
  • Avery M. Fernandez

DOI
https://doi.org/10.1186/s13321-022-00664-x
Journal volume & issue
Vol. 14, no. 1
pp. 1 – 13

Abstract

Read online

Abstract This article demonstrates how to create Chemical Space Networks (CSNs) using a Python RDKit and NetworkX workflow. CSNs are a type of network visualization that depict compounds as nodes connected by edges, defined as a pairwise relationship such as a 2D fingerprint similarity value. A step by step approach is presented for creating two different CSNs in this manuscript, one based on RDKit 2D fingerprint Tanimoto similarity values, and another based on maximum common substructure similarity values. Several different CSN visualization features are included in the tutorial including methods to represent nodes with color based on bioactivity attribute value, edges with different line styles based on similarity value, as well as replacing the circle nodes with 2D structure depictions. Finally, some common network property and analysis calculations are presented including the clustering coefficient, degree assortativity, and modularity. All code is provided in the form of Jupyter Notebooks and is available on GitHub with a permissive BSD-3 open-source license: https://github.com/vfscalfani/CSN_tutorial Graphical Abstract

Keywords