Mayo Clinic Proceedings: Digital Health (Jun 2024)
A Systematic Review of Natural Language Processing Methods and Applications in Thyroidology
Abstract
This study aimed to review the application of natural language processing (NLP) in thyroid-related conditions and to summarize current challenges and potential future directions. We performed a systematic search of databases for studies describing NLP applications in thyroid conditions published in English between January 1, 2012 and November 4, 2022. In addition, we used a snowballing technique to identify studies missed in the initial search or published after our search timeline until April 1, 2023. For included studies, we extracted the NLP method (eg, rule-based, machine learning, deep learning, or hybrid), NLP application (eg, identification, classification, and automation), thyroid condition (eg, thyroid cancer, thyroid nodule, and functional or autoimmune disease), data source (eg, electronic health records, health forums, medical literature databases, or genomic databases), performance metrics, and stages of development. We identified 24 eligible NLP studies focusing on thyroid-related conditions. Deep learning-based methods were the most common (38%), followed by rule-based (21%), and traditional machine learning (21%) methods. Thyroid nodules (54%) and thyroid cancer (29%) were the primary conditions under investigation. Electronic health records were the dominant data source (17/24, 71%), with imaging reports being the most frequently used (15/17, 88%). There is increasing interest in NLP applications for thyroid-related studies, mostly addressing thyroid nodules and using deep learning-based methodologies with limited external validation. However, none of the reviewed NLP applications have reached clinical practice. Several limitations, including inconsistent clinical documentation and model portability, need to be addressed to promote the evaluation and implementation of NLP applications to support patient care in thyroidology.