Matching patients to clinical trials with large language models

Qiao Jin; Zifeng Wang; Charalampos S. Floudas; Fangyuan Chen; Changlin Gong; Dara Bracken-Clarke; Elisabetta Xue; Yifan Yang; Jimeng Sun; Zhiyong Lu

doi:10.1038/s41467-024-53081-z

Nature Communications (Nov 2024)

Affiliations

Qiao Jin: National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH)
Zifeng Wang: Department of Computer Science, University of Illinois Urbana-Champaign
Charalampos S. Floudas: Center for Immuno-Oncology, Center for Cancer Research, National Cancer Institute, National Institutes of Health
Fangyuan Chen: School of Medicine, University of Pittsburgh
Changlin Gong: Jacobi Medical Center, Albert Einstein College of Medicine
Dara Bracken-Clarke: Center for Immuno-Oncology, Center for Cancer Research, National Cancer Institute, National Institutes of Health
Elisabetta Xue: Center for Immuno-Oncology, Center for Cancer Research, National Cancer Institute, National Institutes of Health
Yifan Yang: National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH)
Jimeng Sun: Department of Computer Science, University of Illinois Urbana-Champaign
Zhiyong Lu: National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH)

DOI: https://doi.org/10.1038/s41467-024-53081-z
Journal volume & issue: Vol. 15, no. 1, pp. 1–14

Abstract

Patient recruitment is challenging for clinical trials. We introduce TrialGPT, an end-to-end framework for zero-shot patient-to-trial matching with large language models. TrialGPT comprises three modules: it first performs large-scale filtering to retrieve candidate trials (TrialGPT-Retrieval); then predicts criterion-level patient eligibility (TrialGPT-Matching); and finally generates trial-level scores (TrialGPT-Ranking). We evaluate TrialGPT on three cohorts of 183 synthetic patients with over 75,000 trial annotations. TrialGPT-Retrieval can recall over 90% of relevant trials using less than 6% of the initial collection. Manual evaluations on 1015 patient-criterion pairs show that TrialGPT-Matching achieves an accuracy of 87.3% with faithful explanations, close to the expert performance. The TrialGPT-Ranking scores are highly correlated with human judgments and outperform the best-competing models by 43.8% in ranking and excluding trials. Furthermore, our user study reveals that TrialGPT can reduce the screening time by 42.6% in patient recruitment. Overall, these results have demonstrated promising opportunities for patient-to-trial matching with TrialGPT.

Published in Nature Communications

ISSN: 2041-1723 (Online)
Publisher: Nature Portfolio
Country of publisher: United Kingdom
LCC subjects: Science
Website: https://www.nature.com/ncomms/
