Molecular Therapy: Methods & Clinical Development (Jun 2021)
Machine learning prediction of methionine and tryptophan photooxidation susceptibility
Abstract
Photooxidation of methionine (Met) and tryptophan (Trp) residues is a common and major degradation pathway that often poses a serious threat to the success of therapeutic proteins. Oxidation affects protein production, manufacturing, and shelf life. Predicting oxidation liability as early as possible in development is important because many more candidate drugs are discovered than can be tested experimentally. Undetected oxidation liabilities necessitate expensive and time-consuming remediation during development and may slow the delivery of good drugs to patients. Conversely, sites mischaracterized as oxidation liabilities can lead to overengineering and to good drugs never reaching patients. To our knowledge, no predictive model for photooxidation of Met or Trp is currently available. We applied the random forest machine learning algorithm to in-house liquid chromatography-tandem mass spectrometry (LC-MS/MS) datasets (Met, n = 421; Trp, n = 342) of tryptic peptides from therapeutic proteins to create computational models for Met and Trp photooxidation. We show that our machine learning models predict Met and Trp photooxidation likelihood with areas under the curve (AUC) of 0.926 and 0.860, respectively, and Met photooxidation rate with a correlation coefficient (Q²) of 0.511 and a root-mean-square error (RMSE) of 10.9%. We further identify important physical, chemical, and formulation parameters that influence photooxidation. Improved biopharmaceutical liability predictions will result in better, more stable drugs, increasing development throughput, product quality, and likelihood of clinical success.
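For readers less familiar with the three performance figures quoted above, the following minimal Python sketch shows how such metrics are commonly computed from held-out predictions. It is an illustration, not the authors' code. The function names (`roc_auc`, `rmse`, `q2`) and all toy numbers are hypothetical, and it assumes that Q² denotes the cross-validated coefficient of determination, 1 − PRESS/TSS, a definition the abstract does not spell out.

```python
import math


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the positive
    receives the higher score; tied scores count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def rmse(observed, predicted):
    """Root-mean-square error, in the same units as the observations."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def q2(observed, predicted):
    """Assumed definition: 1 - PRESS/TSS, where `predicted` holds predictions
    made for samples that were left out of model fitting."""
    mean_obs = sum(observed) / len(observed)
    press = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    tss = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - press / tss


# Hypothetical toy data, invented for illustration only.
labels = [1, 1, 0, 1, 0, 0]               # 1 = site oxidized, 0 = not oxidized
scores = [0.9, 0.7, 0.6, 0.4, 0.3, 0.1]   # predicted oxidation likelihood
observed_rate = [10.0, 20.0, 30.0, 40.0]  # measured oxidation, in percent
held_out_pred = [12.0, 18.0, 33.0, 37.0]  # held-out model predictions, in percent

print(f"AUC  = {roc_auc(labels, scores):.3f}")              # AUC  = 0.889
print(f"RMSE = {rmse(observed_rate, held_out_pred):.3f}")   # RMSE = 2.550
print(f"Q2   = {q2(observed_rate, held_out_pred):.3f}")     # Q2   = 0.948
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of oxidized from non-oxidized sites. RMSE is expressed in the units of the measured rate (percent oxidation here), and under the assumed definition a Q² of 1.0 would indicate perfect held-out prediction.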