Complex & Intelligent Systems (Aug 2024)

Molecular subgraph representation learning based on spatial structure transformer

  • Shaoguang Zhang,
  • Jianguang Lu,
  • Xianghong Tang

DOI
https://doi.org/10.1007/s40747-024-01602-0
Journal volume & issue
Vol. 10, no. 6
pp. 8197 – 8212

Abstract

In molecular biology, graph representation learning is crucial for molecular structure analysis. However, a lack of spatial structure information makes it difficult to recognise functional groups and distinguish isomers. To address these problems, we design a novel graph representation learning method based on a spatial structure information extraction Transformer (SSET). The SSET model comprises the Edge Feature Fusion Subgraph Spatial Structure Extractor (ETSE) module and the Positional Information Encoding Graph Transformer (PEGT) module. The ETSE module extracts spatial structural information by fusing edge features and generating the most-value subgraph (Mv-subgraph). The PEGT module encodes positional information based on the graph transformer, addressing the indistinguishability problem among nodes with identical features. In addition, the SSET model reduces the computational burden by using subgraphs. Experiments on real datasets show that the SSET model, built on the graph transformer, considerably improves graph representation learning.

Keywords