iSKIN: Integrated application of machine learning and Mondrian conformal prediction to detect skin sensitizers in cosmetic raw materials
Weikaixin Kong,
Jie Zhu,
Peipei Shan,
Huiyan Ying,
Tongyu Chen,
Bowen Zhang,
Chao Peng,
Zihan Wang,
Yifan Wang,
Liting Huang,
Suzhen Bi,
Weining Ma,
Zhuo Huang,
Sujie Zhu,
Xueyan Liu,
Chun Li
Affiliations
Weikaixin Kong
Key Laboratory of Birth Regulation and Control Technology of National Health Commission of China Shandong Provincial Maternal and Child Health Care Hospital Affiliated to Qingdao University Jinan China
Jie Zhu
Key Laboratory of Birth Regulation and Control Technology of National Health Commission of China Shandong Provincial Maternal and Child Health Care Hospital Affiliated to Qingdao University Jinan China
Peipei Shan
Key Laboratory of Birth Regulation and Control Technology of National Health Commission of China Shandong Provincial Maternal and Child Health Care Hospital Affiliated to Qingdao University Jinan China
Huiyan Ying
Institute for Molecular Medicine Finland (FIMM) HiLIFE, University of Helsinki Helsinki Finland
Tongyu Chen
School of Materials Science and Engineering Northeastern University Shenyang China
Bowen Zhang
Department of Molecular and Cellular Pharmacology, School of Pharmaceutical Sciences Peking University Health Science Center Beijing China
Chao Peng
Department of Molecular and Cellular Pharmacology, School of Pharmaceutical Sciences Peking University Health Science Center Beijing China
Zihan Wang
Department of Molecular and Cellular Pharmacology, School of Pharmaceutical Sciences Peking University Health Science Center Beijing China
Yifan Wang
Scripps Research La Jolla California USA
Liting Huang
Key Laboratory of Birth Regulation and Control Technology of National Health Commission of China Shandong Provincial Maternal and Child Health Care Hospital Affiliated to Qingdao University Jinan China
Suzhen Bi
Key Laboratory of Birth Regulation and Control Technology of National Health Commission of China Shandong Provincial Maternal and Child Health Care Hospital Affiliated to Qingdao University Jinan China
Weining Ma
Department of Neurosurgery Shengjing Hospital of China Medical University Shenyang China
Zhuo Huang
Department of Molecular and Cellular Pharmacology, School of Pharmaceutical Sciences Peking University Health Science Center Beijing China
Sujie Zhu
Key Laboratory of Birth Regulation and Control Technology of National Health Commission of China Shandong Provincial Maternal and Child Health Care Hospital Affiliated to Qingdao University Jinan China
Xueyan Liu
Department of Pediatrics Shengjing Hospital of China Medical University Shenyang China
Chun Li
Department of Pediatrics Shengjing Hospital of China Medical University Shenyang China
Abstract Sensitizers in cosmetic raw materials have traditionally been identified through animal experiments. However, with growing concerns over animal ethics and bans on such experiments worldwide, alternative methods such as machine learning are gaining prominence for their efficiency and cost‐effectiveness. In this study, to develop a robust sensitizer detection model, we first constructed benchmark data sets from previous studies and a public database, collecting 589 sensitizers and 831 nonsensitizers. We then combined a graph‐based autoencoder with Mondrian conformal prediction (MCP) to build a robust sensitizer detector, iSKIN. On the independent test set, iSKIN without MCP achieved a Matthews correlation coefficient (MCC) of 0.472 and an area under the receiver operating characteristic curve (ROCAUC) of 0.804, both higher than those of the three baseline models. With the significance level in MCP set at 0.7, the MCC and ROCAUC of iSKIN reached 0.753 and 0.927, respectively. Regrouping experiments showed that the performance improvement from MCP is robust. Key structure analysis identified seven key substructures in sensitizers that can guide cosmetic material design. Notably, long chains with halogen atoms and phenyl groups with two chlorine atoms at ortho‐positions were identified as potential sensitizers. Finally, a user‐friendly web tool for the iSKIN model (http://www.iskin.work/) was deployed for use by other researchers. In summary, the proposed iSKIN model achieves state‐of‐the‐art performance to date; it can contribute to the safety evaluation of cosmetic raw materials and provide a reference for the chemical structure design of these materials.
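To illustrate the idea behind Mondrian conformal prediction mentioned in the abstract, the following is a minimal sketch of the standard textbook class-conditional (Mondrian) inductive formulation. It is not the iSKIN implementation: the function names, the nonconformity score (one minus the predicted probability of the label), the toy calibration data, and the rule that a label is kept when its p-value exceeds the significance level are all assumptions made for illustration. How iSKIN itself maps the significance level to the reported metrics is not specified here.

```python
from collections import defaultdict

def calibrate(cal_probs, cal_labels):
    """Collect nonconformity scores separately for each true class.

    Keeping one score list per class is the "Mondrian" step. The score
    1 - p(true label) is an assumed, commonly used choice.
    """
    scores = defaultdict(list)
    for probs, y in zip(cal_probs, cal_labels):
        scores[y].append(1.0 - probs[y])
    return dict(scores)

def p_values(probs, class_scores):
    """Compute a class-conditional p-value for every candidate label."""
    out = {}
    for label, cal in class_scores.items():
        alpha = 1.0 - probs[label]  # nonconformity of the test compound
        n_ge = sum(1 for s in cal if s >= alpha)
        out[label] = (n_ge + 1) / (len(cal) + 1)
    return out

def prediction_set(probs, class_scores, significance):
    """Keep every label whose p-value exceeds the significance level."""
    pv = p_values(probs, class_scores)
    return {label for label, p in pv.items() if p > significance}

# Hypothetical calibration set: 0 = nonsensitizer, 1 = sensitizer.
cal_probs = [
    {0: 0.9, 1: 0.1}, {0: 0.8, 1: 0.2}, {0: 0.6, 1: 0.4},
    {0: 0.3, 1: 0.7}, {0: 0.2, 1: 0.8}, {0: 0.4, 1: 0.6},
]
cal_labels = [0, 0, 0, 1, 1, 1]
class_scores = calibrate(cal_probs, cal_labels)

# Hypothetical model output for one new compound.
test_probs = {0: 0.05, 1: 0.95}
pv = p_values(test_probs, class_scores)
low = prediction_set(test_probs, class_scores, 0.2)
high = prediction_set(test_probs, class_scores, 0.3)
```

In this sketch, raising the significance level shrinks the prediction set, so fewer compounds keep both labels. Because calibration is done per class, the error rate is controlled separately for sensitizers and nonsensitizers, which can matter when the classes are imbalanced, as with the 589 sensitizers and 831 nonsensitizers in the benchmark.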