JMIR Formative Research (Aug 2023)

Comparing Literature- and Subreddit-Derived Laboratory Values in Polycystic Ovary Syndrome (PCOS): Validation of Clinical Data Posted on PCOS Reddit Forums

  • Rebecca H K Emanuel,
  • Paul D Docherty,
  • Helen Lunt,
  • Rebecca E Campbell

DOI: https://doi.org/10.2196/44810
Journal volume & issue: Vol. 7, p. e44810

Abstract

Background: Polycystic ovary syndrome (PCOS) is a heterogeneous condition that affects 4% to 21% of people with ovaries. Inaccessibility of, or dissatisfaction with, clinical treatment for PCOS has led some individuals with the condition to discuss their experiences in specialized web-based forums.

Objective: This study explores the feasibility of using such web-based forums for clinical research purposes by gathering and analyzing laboratory test results posted in an active PCOS forum, specifically the PCOS subreddit hosted on Reddit.

Methods: We gathered around 45,000 posts from the PCOS subreddit. A random subset of 5000 posts was manually read, and the presence of laboratory test results was labeled. These labeled posts were used to train a machine learning model to identify which of the remaining posts contained laboratory results. The laboratory results were extracted manually from the identified posts. These self-reported laboratory test results were compared with values in the published literature to assess whether the results were concordant with researcher-published values for PCOS cohorts. A total of 10 papers were chosen to represent published PCOS literature, with selection criteria including the Rotterdam diagnostic criteria for PCOS, a publication date within the last 20 years, and at least 50 participants with PCOS.

Results: Overall, the general trends observed in the laboratory test results from the PCOS web-based forum were consistent with clinically reported PCOS. A number of results, such as follicle stimulating hormone, fasting insulin, and anti-Mullerian hormone, were concordant with published values for patients with PCOS. The high consistency of these results within the literature, and between the literature and the subreddit, suggests that follicle stimulating hormone, fasting insulin, and anti-Mullerian hormone are more consistent across PCOS phenotypes than other test results. Some results, such as testosterone, sex hormone–binding globulin, and homeostasis model assessment–estimated insulin resistance index, were between those of PCOS literature values and normal values, as defined by clinical testing limits. Interestingly, other results, including dehydroepiandrosterone sulfate, luteinizing hormone, and fasting glucose, appeared to be slightly more dysregulated than those reported in the literature.

Conclusions: The differences between the forum-posted results and those published in the literature may be due to the selection process in clinical studies and the possibility that the forum disproportionately describes PCOS phenotypes that are less likely to be alleviated with medical intervention. However, the degree of concordance in most laboratory test values implied that the PCOS web-based forum participants were representative of research-identified PCOS cohorts. This validation of the PCOS subreddit opens the possibility of further research into the contents of the subreddit and of undertaking similar research using the contents of other medical internet forums.
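
The labeling-then-classification step described in the Methods can be illustrated with a minimal sketch. The abstract does not state which machine learning model was used, so the multinomial naive Bayes classifier below is only a stand-in, and the example posts, the `NaiveBayes` class, and the `tokenize` helper are all hypothetical and invented for illustration. The idea shown is simply: train on manually labeled posts, then flag which unlabeled posts probably contain laboratory results so they can be passed on for manual extraction.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic word tokens only."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing.

    Assumption: a stand-in model, not the one used in the study.
    """

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set()
        for counts in self.word_counts.values():
            self.vocab.update(counts)
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for c in self.classes:
            counts = self.word_counts[c]
            denom = sum(counts.values()) + len(self.vocab)
            # Start from the log prior, then add smoothed log likelihoods.
            score = math.log(self.doc_counts[c] / total_docs)
            for tok in tokenize(text):
                if tok in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[tok] + 1) / denom)
            scores[c] = score
        return max(scores, key=scores.get)


# Hypothetical, invented posts standing in for the manually labeled subset.
# The label is True when the post contains laboratory test results.
labeled = [
    ("My testosterone came back at 65 ng/dL and LH was 12", True),
    ("Fasting insulin 18 and AMH 7.2 on my latest blood test results", True),
    ("Got my labs: FSH 5.1, fasting glucose 98", True),
    ("Does anyone have tips for dealing with hair loss?", False),
    ("Feeling frustrated after my appointment today", False),
    ("What diet changes helped you the most?", False),
]

# Hypothetical posts standing in for the remaining, unlabeled posts.
unlabeled = [
    "Latest labs show fasting insulin 22 and testosterone 70",
    "Any tips for staying motivated with diet?",
]

model = NaiveBayes().fit([t for t, _ in labeled], [y for _, y in labeled])

# Posts flagged here would then go to manual extraction of the values.
flagged = [post for post in unlabeled if model.predict(post)]
```

In this sketch the classifier only narrows down which posts a person needs to read; the laboratory values themselves are still extracted by hand, as the Methods describe.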