Automated molecular structure segmentation from documents using ChemSAM

Bowen Tang; Zhangming Niu; Xiaofeng Wang; Junjie Huang; Chao Ma; Jing Peng; Yinghui Jiang; Ruiquan Ge; Hongyu Hu; Luhao Lin; Guang Yang

doi:10.1186/s13321-024-00823-2

Journal of Cheminformatics (Mar 2024)

Affiliations

Bowen Tang: College of Life Sciences, Zhejiang University
Zhangming Niu: MindRank AI Ltd.
Xiaofeng Wang: MindRank AI Ltd.
Junjie Huang: MindRank AI Ltd.
Chao Ma: MindRank AI Ltd.
Jing Peng: Hunan University of Medicine
Yinghui Jiang: MindRank AI Ltd.
Ruiquan Ge: Hangzhou Dianzi University
Hongyu Hu: Xingzhi College, Zhejiang Normal University
Luhao Lin: Department of Pharmacy, The 910th Hospital of the Joint Logistics Support Force of the Chinese PLA
Guang Yang: Bioengineering Department and Imperial-X, Imperial College London

Journal volume & issue: Vol. 16, no. 1, pp. 1–13

Abstract

Chemical structure segmentation is a pivotal task in cheminformatics: it involves extracting and abstracting the structural information of chemical compounds from text-based sources such as patents and scientific articles. This study introduces a deep learning approach to the task that uses a Vision Transformer (ViT) to discern the structural patterns of chemical compounds from their graphical representations. The resulting Chemistry-Segment Anything Model (ChemSAM) achieves state-of-the-art results on publicly available benchmark datasets and real-world tasks, underscoring its effectiveness in accurately segmenting chemical structures from such sources. Because it is based on deep learning, the approach needs no handcrafted features and is robust to variations in image quality and style.

The method works in two steps. In the detection phase, a ViT-based encoder-decoder model identifies and locates chemical structure depictions on the input page by generating masks that classify each pixel as belonging to a chemical structure or not. In the post-processing workflow, the generated masks are clustered by connectivity, and each mask cluster is updated so that it encapsulates a single structure. Together, the two steps enable effective automatic extraction of chemical structure depictions from documents, and the approach is shown to perform well on low-resolution and densely arranged molecular structure layouts in journal articles and patents.
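The connectivity-based clustering of masks in the post-processing step can be illustrated with a small sketch. This is not the ChemSAM implementation: it only shows how foreground pixels of a binary mask might be grouped into connected components and given one bounding box each. The function name `cluster_mask`, the toy mask, and the choice of 8-connectivity are assumptions made for illustration; the abstract does not specify them, and the further step of updating each cluster to encapsulate a single structure is not attempted here.

```python
from collections import deque


def cluster_mask(mask):
    """Group foreground pixels (1s) of a binary mask into 8-connected components.

    Returns one inclusive bounding box (top, left, bottom, right) per
    component, ordered by the row-major position of each component's
    first pixel. Hypothetical helper, not part of ChemSAM.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for r in range(height):
        for c in range(width):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            top, left, bottom, right = r, c, r, c
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                # Visit all 8 neighbours (assumed connectivity).
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


# Toy pixel-level mask: 1 = "belongs to a chemical structure", 0 = background.
toy_mask = [
    [1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 1, 0],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
]
print(cluster_mask(toy_mask))
# → [(0, 0, 1, 1), (1, 3, 2, 4), (4, 1, 4, 1)]
```

In this toy example the three separated pixel groups yield three boxes. On real pages, a single depiction may produce several disconnected mask fragments, or neighbouring depictions may touch, which is presumably why the described workflow updates each cluster after the initial grouping.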

Published in Journal of Cheminformatics

ISSN: 1758-2946 (Online)
Publisher: BMC
Country of publisher: United Kingdom
LCC subjects: Technology: Technology (General): Industrial engineering. Management engineering: Information technology; Science: Chemistry
Website: https://jcheminf.biomedcentral.com/
