Validating GAN-BioBERT: A Methodology for Assessing Reporting Trends in Clinical Trials

Joshua J. Myszewski; Emily Klossowski; Patrick Meyer; Kristin Bevil; Lisa Klesius; Kristopher M. Schroeder

doi:10.3389/fdgth.2022.878369

Frontiers in Digital Health (May 2022)

Affiliations

Joshua J. Myszewski: School of Medicine and Public Health, University of Wisconsin, Madison, WI, United States
Emily Klossowski: University of Wisconsin-Milwaukee, Milwaukee, WI, United States
Patrick Meyer: Department of Anesthesiology, School of Medicine and Public Health, University of Wisconsin, Madison, WI, United States
Kristin Bevil: Department of Anesthesiology, School of Medicine and Public Health, University of Wisconsin, Madison, WI, United States
Lisa Klesius: Department of Anesthesiology, School of Medicine and Public Health, University of Wisconsin, Madison, WI, United States
Kristopher M. Schroeder: Department of Anesthesiology, School of Medicine and Public Health, University of Wisconsin, Madison, WI, United States

DOI: https://doi.org/10.3389/fdgth.2022.878369
Journal volume & issue: Vol. 4

Abstract

Background: The aim of this study was to validate a three-class sentiment classification model for clinical trial abstracts that combines adversarial learning with the BioBERT language processing model, as a tool for assessing trends in biomedical literature in a clearly reproducible manner. We then assessed the model's performance for this application and compared it to previous models used for this task.

Methods: Using 108 expert-annotated clinical trial abstracts and 2,000 unlabeled abstracts, this study develops a three-class sentiment classification algorithm for clinical trial abstracts. The algorithm is a semi-supervised model based on the Bidirectional Encoder Representations from Transformers (BERT) model, a more advanced and accurate approach than the traditional machine learning methods used in previous models. Its prediction performance was compared with that reported in those previous studies.

Results: The algorithm achieved a classification accuracy of 91.3% and a macro F1-score of 0.92, significantly outperforming previous models used to classify sentiment in clinical trial literature, while also making the sentiment classification finer grained and more reproducible.

Conclusion: We demonstrate an easily applied sentiment classification model for clinical trial abstracts that significantly outperforms previous models, with greater reproducibility and applicability to large-scale study of reporting trends.
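
The two metrics reported in the abstract, classification accuracy and macro F1-score, can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the class names (`negative`, `neutral`, `positive`), the function names, and the six toy labels are assumptions made purely for illustration, and the numbers it produces are unrelated to the study's results.

```python
# Hypothetical class names; the abstract only states that there are three classes.
LABELS = ("negative", "neutral", "positive")


def accuracy(y_true, y_pred):
    """Fraction of abstracts whose predicted class matches the annotation."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of the per-class F1 scores."""
    f1_scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1_scores.append(2 * precision * recall / (precision + recall))
        else:
            f1_scores.append(0.0)
    return sum(f1_scores) / len(labels)


# Toy annotations and predictions (invented, not data from the study).
y_true = ["negative", "negative", "neutral", "neutral", "positive", "positive"]
y_pred = ["negative", "neutral", "neutral", "neutral", "positive", "positive"]

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.3f}")
```

Macro averaging gives each of the three classes equal weight regardless of how many abstracts fall into it, which is why it is often reported alongside accuracy when classes may be imbalanced.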

Published in Frontiers in Digital Health

ISSN: 2673-253X (Online)
Publisher: Frontiers Media S.A.
Country of publisher: Switzerland
LCC subjects: Medicine: Public aspects of medicine; Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://www.frontiersin.org/journals/digital-health#
