PLoS ONE (Jan 2014)

Metabolomics data normalization with EigenMS.

  • Yuliya V Karpievitch,
  • Sonja B Nikolic,
  • Richard Wilson,
  • James E Sharman,
  • Lindsay M Edwards

DOI
https://doi.org/10.1371/journal.pone.0116221
Journal volume & issue
Vol. 9, no. 12
p. e116221

Abstract

Read online

Liquid chromatography mass spectrometry has become one of the analytical platforms of choice for metabolomics studies. However, LC-MS metabolomics data can suffer from the effects of various systematic biases. These include batch effects, day-to-day variations in instrument performance, signal intensity loss due to time-dependent effects of the LC column performance, accumulation of contaminants in the MS ion source and MS sensitivity among others. In this study we aimed to test a singular value decomposition-based method, called EigenMS, for normalization of metabolomics data. We analyzed a clinical human dataset where LC-MS serum metabolomics data and physiological measurements were collected from thirty nine healthy subjects and forty with type 2 diabetes and applied EigenMS to detect and correct for any systematic bias. EigenMS works in several stages. First, EigenMS preserves the treatment group differences in the metabolomics data by estimating treatment effects with an ANOVA model (multiple fixed effects can be estimated). Singular value decomposition of the residuals matrix is then used to determine bias trends in the data. The number of bias trends is then estimated via a permutation test and the effects of the bias trends are eliminated. EigenMS removed bias of unknown complexity from the LC-MS metabolomics data, allowing for increased sensitivity in differential analysis. Moreover, normalized samples better correlated with both other normalized samples and corresponding physiological data, such as blood glucose level, glycated haemoglobin, exercise central augmentation pressure normalized to heart rate of 75, and total cholesterol. We were able to report 2578 discriminatory metabolite peaks in the normalized data (p<0.05) as compared to only 1840 metabolite signals in the raw data. Our results support the use of singular value decomposition-based normalization for metabolomics data.