Machine-Learning-Enabled Virtual Screening for Inhibitors of Lysine-Specific Histone Demethylase 1
Jiajun Zhou, Shiying Wu, Boon Giin Lee, Tianwei Chen, Ziqi He, Yukun Lei, Bencan Tang, Jonathan D. Hirst
Affiliations
Jiajun Zhou
Key Laboratory for Carbonaceous Waste Processing and Process Intensification Research of Zhejiang Province, University of Nottingham Ningbo China, 199 Taikang East Road, Ningbo 315100, China
Shiying Wu
Key Laboratory for Carbonaceous Waste Processing and Process Intensification Research of Zhejiang Province, University of Nottingham Ningbo China, 199 Taikang East Road, Ningbo 315100, China
Boon Giin Lee
School of Computer Science, University of Nottingham Ningbo China, 199 Taikang East Road, Ningbo 315100, China
Tianwei Chen
Key Laboratory for Carbonaceous Waste Processing and Process Intensification Research of Zhejiang Province, University of Nottingham Ningbo China, 199 Taikang East Road, Ningbo 315100, China
Ziqi He
Key Laboratory for Carbonaceous Waste Processing and Process Intensification Research of Zhejiang Province, University of Nottingham Ningbo China, 199 Taikang East Road, Ningbo 315100, China
Yukun Lei
Key Laboratory for Carbonaceous Waste Processing and Process Intensification Research of Zhejiang Province, University of Nottingham Ningbo China, 199 Taikang East Road, Ningbo 315100, China
Bencan Tang
Key Laboratory for Carbonaceous Waste Processing and Process Intensification Research of Zhejiang Province, University of Nottingham Ningbo China, 199 Taikang East Road, Ningbo 315100, China
Jonathan D. Hirst
School of Chemistry, University of Nottingham, University Park, Nottingham NG7 2RD, UK
A machine learning approach has been applied to virtual screening for inhibitors of lysine-specific histone demethylase 1 (LSD1), an important anti-cancer target. Machine learning models to predict activity were constructed using Morgan molecular fingerprints. The dataset, consisting of 931 molecules with LSD1 inhibition activity, was obtained from the ChEMBL database. An evaluation of several candidate algorithms on the main dataset showed that the support vector regressor gave the best model, with a coefficient of determination (R²) of 0.703. Virtual screening with this model identified five compounds predicted to be potent inhibitors from the ZINC database, comprising more than 300,000 molecules. The screening recovered a known inhibitor, RN1, as well as four compounds for which activity against LSD1 had not previously been suggested. Thus, we performed a machine-learning-enabled virtual screening for LSD1 inhibitors using only the structural information of the molecules.
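For readers unfamiliar with the metric used to compare the candidate algorithms, the following is a minimal sketch of the standard coefficient of determination, R² = 1 - SS_res / SS_tot. It is not the authors' code. The function name `r_squared`, the activity values, and the choice of pIC50 as the activity scale are all illustrative assumptions, since the abstract does not state how activity was encoded or on which data split R² was computed.

```python
from typing import Sequence


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the model's predictions
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are identical")
    return 1.0 - ss_res / ss_tot


# Hypothetical activity values (assumed pIC50 scale), for illustration only
measured = [5.0, 6.0, 7.0, 8.0]
predicted = [5.5, 5.5, 7.5, 7.5]
score = r_squared(measured, predicted)
print(f"R^2 = {score:.3f}")
```

An R² of 1 means the predictions match the measurements exactly, while 0 means the model does no better than predicting the mean activity for every molecule. On that scale, the reported 0.703 indicates that the model accounts for roughly 70% of the variance in the measured activities.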