Examining HPO by organ and system to facilitate practical use by clinicians

Eisuke Dohi; Terue Takatsuki; Yuka Tateisi; Toyofumi Fujiwara; Yasunori Yamamoto

doi:10.1186/s44342-024-00024-1

Genomics & Informatics (Nov 2024)

Affiliations

Eisuke Dohi: National Center of Neurology and Psychiatry, National Institute of Neuroscience
Terue Takatsuki: Database Center for Life Science, ROIS-DS
Yuka Tateisi: Office of NBDC Program, JST
Toyofumi Fujiwara: Database Center for Life Science, ROIS-DS
Yasunori Yamamoto: Database Center for Life Science, ROIS-DS

DOI: https://doi.org/10.1186/s44342-024-00024-1
Journal volume & issue: Vol. 22, no. 1
pp. 1 – 5

Abstract

The Human Phenotype Ontology (HPO) is widely used for annotating clinical text data, and sufficient annotation is crucial for the effective utilization of clinical texts. Large language models (LLMs) are known to extract symptoms and findings successfully, but they cannot annotate them with the HPO. We hypothesized that one potential cause is the lack of appropriate terms in the HPO. Therefore, during the Biomedical Linked Annotation Hackathon 8 (BLAH8), we attempted the following two tasks to grasp the overall picture of the HPO: (1) extract all HPO terms for each of the 23 HPO subclasses (defined as categories) directly under the HPO term "Phenotypic abnormality" and (2) search for major attributes in each of the 23 categories. We employed an LLM for these two tasks and found that, without ingenuity, the LLM did not work well on tasks that lacked sentences and context. A manual search for terms within each category revealed that the HPO contains a mix of terms with four major attributes: (1) Disease Name, (2) Condition, (3) Test Data, and (4) Symptoms and Findings. Manual curation showed that the ratio of Symptoms and Findings varied from 0 to 93.1% across categories. For clinicians, who are end users of medical terminologies including the HPO, ontologies are difficult to understand. However, a good-quality ontology is important for good-quality data, and clinicians' help is essential. It is also important to make the overall picture and limitations of ontologies easy to understand in order to bring out the explanatory power of LLMs and artificial intelligence.
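
As an illustration of task (1) and of the per-category ratio reported above, the following is a minimal sketch, not the authors' implementation. It assumes the ontology is available as child-to-parent is_a edges. The term IDs (`T:0` to `T:5`), the toy hierarchy, the attribute labels assigned to each term, and all function names are hypothetical and chosen only for illustration; only the four attribute names come from the abstract.

```python
from collections import defaultdict, deque

# Toy is_a edges (child -> parents). IDs are hypothetical, not real HPO terms.
# "T:0" plays the role of the root ("Phenotypic abnormality" in the HPO).
IS_A = {
    "T:1": ["T:0"],          # category directly under the root
    "T:2": ["T:0"],          # second category
    "T:3": ["T:1"],
    "T:4": ["T:3"],
    "T:5": ["T:1", "T:2"],   # a term may belong to more than one category
}

# Hypothetical manual labels using the four attributes named in the abstract.
LABELS = {
    "T:1": "Condition",
    "T:2": "Test Data",
    "T:3": "Symptoms and Findings",
    "T:4": "Disease Name",
    "T:5": "Symptoms and Findings",
}


def build_children(is_a):
    """Invert child -> parents edges into a parent -> children map."""
    children = defaultdict(set)
    for child, parents in is_a.items():
        for parent in parents:
            children[parent].add(child)
    return children


def descendants(term, children):
    """Collect every term reachable below `term` (breadth-first)."""
    seen = set()
    queue = deque([term])
    while queue:
        current = queue.popleft()
        for child in children.get(current, ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def terms_by_category(root, is_a):
    """Map each direct subclass of `root` to itself plus all its descendants."""
    children = build_children(is_a)
    return {
        category: descendants(category, children) | {category}
        for category in sorted(children.get(root, ()))
    }


def attribute_share(terms, labels, attribute):
    """Fraction of labelled terms in `terms` that carry `attribute`."""
    labelled = [t for t in terms if t in labels]
    if not labelled:
        return 0.0
    return sum(labels[t] == attribute for t in labelled) / len(labelled)


if __name__ == "__main__":
    for category, terms in terms_by_category("T:0", IS_A).items():
        share = attribute_share(terms, LABELS, "Symptoms and Findings")
        print(category, sorted(terms), f"{share:.1%}")
```

With a real ontology file, the `IS_A` map would be built from the parsed is_a relations instead of being written by hand; the traversal itself would stay the same.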

Published in Genomics & Informatics

ISSN: 1598-866X (Print); 2234-0742 (Online)
Publisher: Korea Genome Organization
Country of publisher: Korea, Republic of
LCC subjects: Science: Biology (General): Genetics
Website: https://genominfo.org/
