Journal of Cheminformatics (Feb 2021)

The kernel-weighted local polynomial regression (KwLPR) approach: an efficient, novel tool for development of QSAR/QSAAR toxicity extrapolation models

  • Agnieszka Gajewicz-Skretna,
  • Supratik Kar,
  • Magdalena Piotrowska,
  • Jerzy Leszczynski

DOI
https://doi.org/10.1186/s13321-021-00484-5
Journal volume & issue
Vol. 13, no. 1
pp. 1 – 20

Abstract

The ability to accurately predict the biological response (biological activity/property/toxicity) of a given chemical makes quantitative structure‐activity/property/toxicity relationship (QSAR/QSPR/QSTR) models unique among in silico tools. In addition, experimental data for selected species can be used as an independent variable, along with structural and physicochemical variables, to predict the response for different species; this constitutes the quantitative activity–activity relationship (QAAR)/quantitative structure–activity–activity relationship (QSAAR) approach. Irrespective of the model type, the quality and reliability of a developed model need to be checked with multiple stringent classical validation metrics. Among these, error-based metrics are the more significant, since the basic aim of a good predictive model is to improve prediction quality by lowering the residuals for new query compounds. Following this concept, we compared the predictive quality of QSAR and QSAAR models built with the kernel-weighted local polynomial regression (KwLPR) approach against traditional linear and non-linear regression-based approaches such as multiple linear regression (MLR) and k nearest neighbors (kNN). Five datasets previously modeled using linear and non-linear regression methods were used to implement the KwLPR approach, followed by a comparison of the resulting validation metrics. In all five cases, the KwLPR-based models gave better results than the traditional approaches. The focus of the present study is not to develop a better or improved QSAR/QSAAR model than the previous ones, but to demonstrate the advantages, predictive power, and reliability of the KwLPR algorithm and to establish it as a novel, powerful cheminformatic tool.
To facilitate the use of the KwLPR algorithm for QSAR/QSPR/QSTR/QSAAR modeling, the authors provide an in-house KwLPR.RMD script written in the open-source R programming language.
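
For readers unfamiliar with the general idea behind kernel-weighted local polynomial regression, the following minimal Python sketch may help. It is not the authors' KwLPR.RMD script. It assumes a single descriptor, a Gaussian kernel, a local polynomial of degree 1, and a user-chosen bandwidth; the function names `local_linear_predict` and `rmse` are hypothetical and introduced here only for illustration. For each query compound, training compounds are weighted by their kernel distance to the query, a weighted least-squares line is fitted, and the fitted intercept is taken as the prediction. The `rmse` helper illustrates one error-based metric of the kind the abstract refers to.

```python
import math


def gaussian_kernel(u):
    """Gaussian kernel weight for a scaled distance u (assumed kernel choice)."""
    return math.exp(-0.5 * u * u)


def local_linear_predict(x_train, y_train, x_query, bandwidth=1.0):
    """Predict y at x_query with a kernel-weighted local linear fit.

    Illustrative sketch only: one descriptor, polynomial degree 1.
    """
    # Centre the descriptor on the query so the fitted intercept is the prediction.
    u = [x - x_query for x in x_train]
    w = [gaussian_kernel(ui / bandwidth) for ui in u]

    # Weighted sums that make up the 2x2 normal equations.
    s0 = sum(w)
    s1 = sum(wi * ui for wi, ui in zip(w, u))
    s2 = sum(wi * ui * ui for wi, ui in zip(w, u))
    t0 = sum(wi * yi for wi, yi in zip(w, y_train))
    t1 = sum(wi * ui * yi for wi, ui, yi in zip(w, u, y_train))

    if s0 == 0.0:
        raise ValueError("all kernel weights are zero; increase the bandwidth")

    det = s0 * s2 - s1 * s1
    if abs(det) < 1e-12:
        # Degenerate neighbourhood: fall back to the kernel-weighted mean.
        return t0 / s0
    # Intercept of the weighted least-squares line (Cramer's rule).
    return (s2 * t0 - s1 * t1) / det


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted responses."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


# Toy data (made up for illustration, not taken from the study).
x_demo = [0.0, 1.0, 2.0, 3.0, 4.0]
y_demo = [1.0 + 2.0 * x for x in x_demo]
demo_prediction = local_linear_predict(x_demo, y_demo, 2.5, bandwidth=1.0)
```

Because the toy response is exactly linear in the descriptor, the local linear fit reproduces the underlying line at the query point up to floating-point error. With real data, the bandwidth controls how local the fit is: smaller values follow the nearest compounds more closely, while larger values approach an ordinary global linear regression.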

Keywords