BMC Bioinformatics (Dec 2023)

Subtyping irritable bowel syndrome using cluster analysis: a systematic review

  • Diana Zarei,
  • Amene Saghazadeh,
  • Nima Rezaei

DOI
https://doi.org/10.1186/s12859-023-05567-8
Journal volume & issue
Vol. 24, no. 1
pp. 1 – 25

Abstract

Background: Irritable bowel syndrome (IBS) is a common chronic functional gastrointestinal disorder associated with a wide range of clinical symptoms. Some researchers have used cluster analysis (CA), a group of unsupervised learning methods that identify homogeneous clusters among entities based on their similarity.

Objective and methods: This literature review aims to identify published articles that apply CA to IBS patients. We searched relevant keywords in PubMed, Embase, Web of Science, and Scopus. We reviewed studies in terms of the selected variables, participants’ characteristics, data collection, methodology, number of clusters, clusters’ profiles, and results.

Results: Of the 14 articles focused on the heterogeneity of IBS, eight used K-means cluster analysis (K-means CA), four used hierarchical cluster analysis, and two used latent class analysis. Seven studies focused on clinical symptoms, four examined anocolorectal functions, two were centered on immunological findings, and one explored microbial composition. The number of clusters obtained varied across studies, ranging from two to seven. Males exhibited lower symptom severity and fewer psychological findings. The association between symptom severity and rectal perception suggests that altered rectal perception serves as a biological indicator of IBS. Ultra-slow waves observed in IBS patients are linked to increased activity of the anal sphincter, higher anal pressure, dystonia, and dyschezia.

Conclusion: IBS has different subgroups based on different factors. Most IBS patients have low clinical severity, good quality of life (QoL), high rectal sensitivity, delayed left colon transit time, increased systemic cytokines, and changes in microbial composition, including increased Firmicutes-associated taxa and depleted Bacteroidetes-related taxa. However, the number of clusters is inconsistent across studies because of methodological heterogeneity. CA, a valuable unsupervised learning method, is sensitive to hyperparameters such as the number of clusters and the random initialization of cluster centers. Different choices of these parameters can lead to different outcomes even with the same algorithm. This has implications for future research and practical applications, and further studies are needed to improve our understanding of IBS and to develop personalized treatments.
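
The sensitivity of K-means to the number of clusters and to the random initialization of cluster centers can be illustrated with a minimal sketch. This is not the method of any reviewed study: the data are synthetic two-dimensional points rather than patient measurements, and the helper names (`synthetic_points`, `sq_dist`, `kmeans`) are hypothetical, introduced only for illustration. The code uses the Python standard library and a plain Lloyd-style iteration.

```python
import random


def synthetic_points(seed, n_per_group=30):
    """Hypothetical toy data: three overlapping 2-D Gaussian groups."""
    rng = random.Random(seed)
    group_means = [(0.0, 0.0), (3.0, 0.5), (1.5, 3.0)]
    return [
        (rng.gauss(mx, 1.0), rng.gauss(my, 1.0))
        for mx, my in group_means
        for _ in range(n_per_group)
    ]


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, seed, n_iter=100):
    """Lloyd-style K-means; returns (labels, within-cluster sum of squares)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # random initialization of cluster centers
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest center.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    inertia = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return labels, inertia


data = synthetic_points(seed=0)
for k in (2, 3, 4):
    # Same data and same algorithm; only the initialization seed changes.
    inertias = [kmeans(data, k, seed=s)[1] for s in range(10)]
    print(f"k={k}: min inertia={min(inertias):.1f}, max inertia={max(inertias):.1f}")
```

When the minimum and maximum differ for a given `k`, the runs have converged to different partitions of the same data. Fixing the seed and keeping the best of several restarts is one common way to make such results reproducible.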

Keywords