HiTIME: An efficient model-selection approach for the detection of unknown drug metabolites in LC-MS data
Michael G. Leeming,
Andrew P. Isaac,
Luke Zappia,
Richard A.J. O’Hair,
William A. Donald,
Bernard J. Pope
Affiliations
Michael G. Leeming
School of Chemistry and Bio21 Molecular Science & Biotechnology Institute, The University of Melbourne, Australia
Andrew P. Isaac
Melbourne Bioinformatics, The University of Melbourne, Australia; The Walter and Eliza Hall Institute of Medical Research, Australia
Luke Zappia
School of Biosciences, The University of Melbourne, Australia; Murdoch Children’s Research Institute, Australia
Richard A.J. O’Hair
School of Chemistry and Bio21 Molecular Science & Biotechnology Institute, The University of Melbourne, Australia
William A. Donald
School of Chemistry, University of New South Wales, Australia
Bernard J. Pope
Melbourne Bioinformatics, The University of Melbourne, Australia; Department of Clinical Pathology, The University of Melbourne, Australia; Department of Medicine, Central Clinical School, Monash University, Australia; Corresponding author at: Melbourne Bioinformatics, The University of Melbourne, Victoria 3010, Australia.
The identification of metabolites plays an important role in understanding drug efficacy and safety; however, these compounds are often difficult to identify in complex mixtures. One approach to identifying drug metabolites uses differentially isotopically labelled drug compounds to create unique isotopic signals that can be detected by liquid chromatography-mass spectrometry (LC-MS). User-friendly, efficient computational tools that allow selective detection of these signals are lacking. We have developed an efficient open-source software tool called HiTIME (High-Resolution Twin-Ion Metabolite Extraction), which filters twin-ion signals in LC-MS data. The intensity of each data point in the input is replaced by a Z-score describing how well the point matches an idealised twin-ion signal versus alternative ion signatures. Here we provide a detailed description of the algorithm and demonstrate its performance on simulated and experimental data.
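To make the model-selection idea concrete, the following is a minimal illustrative sketch and not the HiTIME implementation. It assumes Gaussian peak shapes, a hypothetical mass spacing `mz_delta` between the light and heavy ions, a hypothetical intensity `ratio`, and a plain difference of least-squares residuals in place of the Z-score described above. All function and parameter names are invented for illustration.

```python
import math


def gaussian(x, centre, sigma):
    """Unit-height Gaussian peak (assumed peak shape, for illustration only)."""
    return math.exp(-0.5 * ((x - centre) / sigma) ** 2)


def fit_residual(template, observed):
    """Sum of squared residuals after scaling the template by its best
    non-negative least-squares amplitude."""
    tt = sum(t * t for t in template)
    if tt == 0.0:
        return sum(o * o for o in observed)
    amp = max(0.0, sum(t * o for t, o in zip(template, observed)) / tt)
    return sum((o - amp * t) ** 2 for t, o in zip(template, observed))


def twin_ion_score(mz, intensity, centre, mz_delta=6.0, ratio=1.0, sigma=0.02):
    """Compare a twin-ion template against single-peak alternatives.

    Returns (best alternative residual) - (twin-ion residual), so a positive
    value means the twin-ion model explains the data better. This is a
    simplified stand-in for a Z-score, not the published scoring function.
    mz_delta, ratio and sigma are hypothetical defaults.
    """
    light = [gaussian(x, centre, sigma) for x in mz]
    heavy = [gaussian(x, centre + mz_delta, sigma) for x in mz]
    twin = [l + ratio * h for l, h in zip(light, heavy)]

    res_twin = fit_residual(twin, intensity)
    res_alt = min(fit_residual(light, intensity), fit_residual(heavy, intensity))
    return res_alt - res_twin
```

Under these assumptions, a spectrum containing two equal peaks separated by `mz_delta` gives a positive score at the lighter peak, whereas a spectrum containing only one of the two peaks gives a negative score, because a single-peak alternative fits it better than the twin-ion template.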