Algorithms (Mar 2024)

Data Mining Techniques for Endometriosis Detection in a Data-Scarce Medical Dataset

  • Pablo Caballero,
  • Luis Gonzalez-Abril,
  • Juan A. Ortega,
  • Áurea Simon-Soro

DOI
https://doi.org/10.3390/a17030108
Journal volume & issue
Vol. 17, no. 3
p. 108

Abstract

Read online

Endometriosis (EM) is a chronic inflammatory estrogen-dependent disorder that affects 10% of women worldwide. It affects the female reproductive tract and its resident microbiota, as well as distal body sites that can serve as surrogate markers of EM. Currently, no single definitive biomarker can diagnose EM. For this pilot study, we analyzed a cohort of 21 patients with endometriosis and infertility-associated conditions. A microbiome dataset was created using five sample types taken from the reproductive and gastrointestinal tracts of each patient. We evaluated several machine learning algorithms for EM detection using these features. The characteristics of the dataset were derived from endometrial biopsy, endometrial fluid, vaginal, oral, and fecal samples. Despite limited data, the algorithms demonstrated high performance with respect to the F1 score. In addition, they suggested that disease diagnosis could potentially be improved by using less medically invasive procedures. Overall, the results indicate that machine learning algorithms can be useful tools for diagnosing endometriosis in low-resource settings where data availability and availability are limited. We recommend that future studies explore the complexities of the EM disorder using artificial intelligence and prediction modeling to further define the characteristics of the endometriosis phenotype.

Keywords