Molecules (Sep 2013)

Enhanced QSAR Model Performance by Integrating Structural and Gene Expression Information

  • Xiaohui Fan,
  • Li Xing,
  • Wei Liu,
  • Leihong Wu,
  • Qian Chen

DOI
https://doi.org/10.3390/molecules180910789
Journal volume & issue
Vol. 18, no. 9
pp. 10789 – 10801

Abstract

Despite decades of intensive research and a number of demonstrable successes, quantitative structure-activity relationship (QSAR) models still fail to yield reasonably accurate predictions in some circumstances, especially when the QSAR paradox occurs, that is, when structurally similar compounds show markedly different activities. In this study, to avoid the QSAR paradox, we propose a novel integrated approach that improves model performance by using both structural and biological information about the compounds. As a proof of concept, integrated models were built on a toxicological dataset to predict the non-genotoxic carcinogenicity of compounds, using not only conventional molecular descriptors but also the expression profiles of significant genes selected from microarray data. On the test set, the prediction accuracy of the QSAR model increased from 0.57 to 0.67 when the expression data of just one selected signature gene were incorporated. This successful integration of biological information into the classic QSAR model provides new insight and a methodology for building predictive models, especially when the QSAR paradox occurs.
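
The idea of combining structural and expression features can be illustrated with a minimal sketch. It is not the authors' pipeline: the descriptor values, expression values, labels, and the helper loo_1nn_accuracy are all hypothetical, and a simple 1-nearest-neighbour classifier stands in for whatever model the paper used. The toy data are deliberately contrived so that each pair of compounds has near-identical descriptors but opposite labels, mimicking the QSAR paradox.

```python
import math


def loo_1nn_accuracy(features, labels):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier."""
    correct = 0
    for i, x in enumerate(features):
        best_j, best_d = None, math.inf
        for j, y in enumerate(features):
            if i == j:
                continue  # hold out the compound being predicted
            d = math.dist(x, y)  # Euclidean distance (Python 3.8+)
            if d < best_d:
                best_j, best_d = j, d
        correct += labels[best_j] == labels[i]
    return correct / len(labels)


# Synthetic toy data, NOT taken from the paper.
# Each consecutive pair has almost identical descriptors but opposite labels.
descriptors = [
    (0.10, 0.20), (0.11, 0.21),
    (0.80, 0.90), (0.81, 0.91),
    (0.45, 0.10), (0.46, 0.11),
]
labels = [1, 0, 1, 0, 1, 0]  # 1 = carcinogenic, 0 = non-carcinogenic (assumed coding)

# Hypothetical expression level of one signature gene per compound.
expression = [2.0, -2.0, 1.8, -1.9, 2.1, -2.2]

# Integrated feature vector: descriptors plus the single gene's expression.
integrated = [d + (e,) for d, e in zip(descriptors, expression)]

acc_structure = loo_1nn_accuracy(descriptors, labels)
acc_integrated = loo_1nn_accuracy(integrated, labels)
print(f"structure only: {acc_structure:.2f}, integrated: {acc_integrated:.2f}")
```

Because the toy data are an extreme case, the gap between the two accuracies is far larger than the 0.57 to 0.67 improvement reported in the abstract. The sketch only shows the mechanism: a biological feature can separate compounds that structural descriptors alone cannot.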

Keywords