Journal of Cheminformatics (May 2023)
OWSum: algorithmic odor prediction and insight into structure-odor relationships
Abstract
We derived and implemented a linear classification algorithm, called Olfactory Weighted Sum (OWSum), for predicting a molecule’s odor. Our approach relies solely on structural patterns of the molecules as features and uses conditional probabilities combined with tf-idf values. In addition to predicting molecular odor, OWSum provides insights into properties of the dataset and makes it possible to understand how classifications are reached by quantitatively assigning structural patterns to odors. This gives chemists an intuitive understanding of the underlying interactions. To deal with ambiguities in the natural language used to describe odor, we introduce descriptor overlap as a metric that quantifies the semantic overlap between descriptors. This makes it possible to group descriptors and to derive higher-level descriptors. Our approach represents a large leap forward in our ability to understand and predict molecular features.
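As an illustration of the kind of weighted-sum classifier described above, the following minimal sketch scores each odor by summing per-pattern weights, where each weight is a conditional probability multiplied by an idf-style factor. The exact weighting used by OWSum is not given in this abstract, so the formula here (P(odor | pattern) times log(N_odors / df) + 1), the function names `train_weights` and `predict`, and the toy data are all assumptions made for illustration only.

```python
import math
from collections import defaultdict

def train_weights(samples):
    """Derive a weight for every (pattern, odor) pair.

    samples: list of (set_of_structural_patterns, set_of_odor_labels).
    Assumed weighting (not taken from the paper):
        weight = P(odor | pattern) * idf(pattern)
    """
    pattern_count = defaultdict(int)       # molecules containing the pattern
    pattern_odor_count = defaultdict(int)  # molecules with pattern AND odor
    odor_patterns = defaultdict(set)       # patterns seen with each odor
    for patterns, odors in samples:
        for p in patterns:
            pattern_count[p] += 1
            for o in odors:
                pattern_odor_count[(p, o)] += 1
                odor_patterns[o].add(p)
    odors = sorted(odor_patterns)
    weights = {}
    for (p, o), c in pattern_odor_count.items():
        cond = c / pattern_count[p]  # P(odor | pattern)
        # Number of odors in which the pattern occurs at least once.
        df = sum(1 for od in odors if p in odor_patterns[od])
        idf = math.log(len(odors) / df) + 1.0  # assumed idf variant
        weights[(p, o)] = cond * idf
    return weights, odors

def predict(patterns, weights, odors):
    """Return the best-scoring odor and the per-odor weighted sums."""
    scores = {o: sum(weights.get((p, o), 0.0) for p in patterns) for o in odors}
    return max(scores, key=scores.get), scores

# Hypothetical toy data: pattern names stand in for structural patterns.
TOY_SAMPLES = [
    ({"ester", "alkyl"}, {"fruity"}),
    ({"ester"}, {"fruity"}),
    ({"thiol", "alkyl"}, {"sulfurous"}),
    ({"thiol"}, {"sulfurous"}),
]

if __name__ == "__main__":
    weights, odors = train_weights(TOY_SAMPLES)
    label, scores = predict({"ester", "alkyl"}, weights, odors)
    print(label, scores)
```

Because the score is a plain sum of per-pattern weights, the contribution of each structural pattern to a prediction can be read directly from the weight table, which is the kind of interpretability the abstract refers to. In this toy setup, a pattern that occurs with every odor (here "alkyl") receives a lower idf factor than one specific to a single odor.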
Keywords