Algorithms for Molecular Biology (Aug 2021)

A novel method for inference of acyclic chemical compounds with bounded branch-height based on artificial neural networks and integer programming

  • Naveed Ahmed Azam,
  • Jianshen Zhu,
  • Yanming Sun,
  • Yu Shi,
  • Aleksandar Shurbevski,
  • Liang Zhao,
  • Hiroshi Nagamochi,
  • Tatsuya Akutsu

DOI
https://doi.org/10.1186/s13015-021-00197-2
Journal volume & issue
Vol. 16, no. 1
pp. 1 – 39

Abstract

Read online

Abstract Analysis of chemical graphs is becoming a major research topic in computational molecular biology due to its potential applications to drug design. One of the major approaches in such a study is inverse quantitative structure activity/property relationship (inverse QSAR/QSPR) analysis, which is to infer chemical structures from given chemical activities/properties. Recently, a novel two-phase framework has been proposed for inverse QSAR/QSPR, where in the first phase an artificial neural network (ANN) is used to construct a prediction function. In the second phase, a mixed integer linear program (MILP) formulated on the trained ANN and a graph search algorithm are used to infer desired chemical structures. The framework has been applied to the case of chemical compounds with cycle index up to 2 so far. The computational results conducted on instances with n non-hydrogen atoms show that a feature vector can be inferred by solving an MILP for up to $$n=40$$ n = 40 , whereas graphs can be enumerated for up to $$n=15$$ n = 15 . When applied to the case of chemical acyclic graphs, the maximum computable diameter of a chemical structure was up to 8. In this paper, we introduce a new characterization of graph structure, called “branch-height” based on which a new MILP formulation and a new graph search algorithm are designed for chemical acyclic graphs. The results of computational experiments using such chemical properties as octanol/water partition coefficient, boiling point and heat of combustion suggest that the proposed method can infer chemical acyclic graphs with around $$n=50$$ n = 50 and diameter 30.

Keywords