Frontiers in Genetics (Mar 2023)

Use of pain-related gene features to predict depression by support vector machine model in patients with fibromyalgia

  • Fengfeng Wang,
  • Chi Wai Cheung,
  • Stanley Sau Ching Wong

DOI
https://doi.org/10.3389/fgene.2023.1026672
Journal volume & issue
Vol. 14

Abstract

The prevalence of depression is elevated in patients with fibromyalgia syndrome, but depression often goes unrecognized in patients with chronic pain. Given that depression is a common major barrier in the management of patients with fibromyalgia syndrome, an objective tool that reliably predicts depression in these patients could significantly enhance diagnostic accuracy. Since pain and depression can cause and worsen each other, we asked whether pain-related genes can be used to differentiate patients with major depression from those without. This study developed a support vector machine model combined with principal component analysis to differentiate major depression in fibromyalgia syndrome patients using a microarray dataset that included 25 fibromyalgia syndrome patients with major depression and 36 patients without major depression. Gene co-expression analysis was used to select the gene features for constructing the support vector machine model. Principal component analysis reduces the number of data dimensions without much loss of information and makes patterns in the data easier to identify. The 61 samples available in the database are too few for learning-based methods and cannot represent every possible variation among patients. To address this issue, we used Gaussian noise to generate a large amount of simulated data for training and testing the model. The ability of the support vector machine model to differentiate major depression using microarray data was measured as accuracy. A two-sample Kolmogorov-Smirnov (KS) test identified different structural co-expression patterns for 114 genes involved in the pain signaling pathway (p < 0.001 for the maximum deviation D = 0.11 > Dcritical = 0.05), indicating aberrant co-expression patterns in fibromyalgia syndrome patients. Twenty hub gene features were further selected based on co-expression analysis to construct the model. Principal component analysis reduced the dimension of the training samples from 20 to 16, since 16 components were needed to retain more than 90% of the original variance. Based on the expression levels of the selected hub gene features, the support vector machine model differentiated fibromyalgia syndrome patients with major depression from those without with an average accuracy of 93.22%. These findings could contribute key information for developing a clinical decision-making tool for the data-driven, personalized optimization of diagnosing depression in patients with fibromyalgia syndrome.
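Two of the steps named in the abstract, the two-sample KS comparison and the Gaussian-noise simulation of extra samples, can be illustrated with the Python standard library alone. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the noise standard deviation `sigma`, the number of `copies` per sample, and the `seed` are all hypothetical, since the abstract does not report these settings. The critical value uses the common large-sample approximation, which may differ from the procedure the authors used. The principal component analysis and support vector machine steps are not shown.

```python
import math
import random
from bisect import bisect_right


def ks_two_sample(a, b):
    """Two-sample KS statistic D: largest gap between the two empirical CDFs."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    n, m = len(a_sorted), len(b_sorted)
    d = 0.0
    # The maximum gap between two step functions occurs at an observed value.
    for x in a_sorted + b_sorted:
        cdf_a = bisect_right(a_sorted, x) / n
        cdf_b = bisect_right(b_sorted, x) / m
        d = max(d, abs(cdf_a - cdf_b))
    return d


def ks_critical_value(n, m, alpha=0.001):
    """Large-sample approximation: c(alpha) * sqrt((n + m) / (n * m))."""
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2.0))
    return c_alpha * math.sqrt((n + m) / (n * m))


def augment_with_gaussian_noise(features, labels, copies, sigma, seed=0):
    """Create `copies` noisy versions of each sample; labels are carried over.

    `copies`, `sigma`, and `seed` are illustrative assumptions, not values
    taken from the study.
    """
    rng = random.Random(seed)
    new_features, new_labels = [], []
    for row, label in zip(features, labels):
        for _ in range(copies):
            new_features.append([v + rng.gauss(0.0, sigma) for v in row])
            new_labels.append(label)
    return new_features, new_labels
```

In use, the distributions are judged different at level `alpha` when `ks_two_sample(a, b)` exceeds `ks_critical_value(len(a), len(b), alpha)`, which mirrors the D > Dcritical comparison reported above.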

Keywords