Computational and Structural Biotechnology Journal (Dec 2024)

Integrating predictive coding and a user-centric interface for enhanced auditing and quality in cancer registry data

  • Hong-Jie Dai,
  • Chien-Chang Chen,
  • Tatheer Hussain Mir,
  • Ting-Yu Wang,
  • Chen-Kai Wang,
  • Ya-Chen Chang,
  • Shu-Jung Yu,
  • Yi-Wen Shen,
  • Cheng-Jiun Huang,
  • Chia-Hsuan Tsai,
  • Ching-Yun Wang,
  • Hsiao-Jou Chen,
  • Pei-Shan Weng,
  • You-Xiang Lin,
  • Sheng-Wei Chen,
  • Ming-Ju Tsai,
  • Shian-Fei Juang,
  • Su-Ying Wu,
  • Wen-Tsung Tsai,
  • Ming-Yii Huang,
  • Chih-Jen Huang,
  • Chih-Jen Yang,
  • Ping-Zun Liu,
  • Chiao-Wen Huang,
  • Chi-Yen Huang,
  • William Yu Chung Wang,
  • Inn-Wen Chong,
  • Yi-Hsin Yang

Journal volume & issue
Vol. 24
pp. 322 – 333

Abstract

Data curation for a hospital-based cancer registry relies heavily on labor-intensive manual abstraction, in which cancer registrars identify cancer-related information in free-text electronic health records. To streamline this process, we developed a natural language processing system that combines deep learning-based and rule-based approaches to identify lung cancer registry-related concepts, together with a symbolic expert system that generates registry codes from weighted rules. The system is integrated with the hospital information system at a medical center and provides cancer registrars with a patient journey visualization platform. The embedded system offers a comprehensive view of patient reports annotated with significant registry concepts to facilitate manual coding and improve overall quality. Extensive evaluations, including comparisons with state-of-the-art methods, were conducted on a lung cancer dataset of 1428 patients from the medical center. The experimental results demonstrate the effectiveness of the developed system, which consistently achieved F1-scores between 0.85 and 1.00 across 30 coding items. Registrar feedback highlights the system’s reliability as a tool for assisting and auditing abstraction. By presenting key registry items along the timeline of a patient’s reports with accurate code predictions, the system improves the quality of registrar outcomes and reduces the labor and time required for data abstraction. Our study highlights advancements in cancer registry coding practice and demonstrates that the proposed hybrid weighted neural-symbolic cancer registry system is reliable and efficient in assisting cancer registrars with the coding workflow and contributing to clinical outcomes.
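
The abstract does not spell out how the weighted rules produce a code, so the following is only a minimal sketch of one plausible reading: each rule that matches an extracted concept casts a weighted vote for a code, and the code with the highest total weight wins for each coding item. The `Rule` type, the `predict_codes` function, the rule set, the placeholder codes, and the weights are all assumptions made for illustration and are not taken from the paper.

```python
from collections import defaultdict
from typing import NamedTuple


class Rule(NamedTuple):
    item: str      # registry coding item the rule applies to
    concept: str   # concept that must be present for the rule to fire
    code: str      # code the rule votes for
    weight: float  # strength of the vote


# Hypothetical rules: items, placeholder codes, and weights are illustrative only.
RULES = [
    Rule("laterality", "right upper lobe", "RIGHT", 0.9),
    Rule("laterality", "left lower lobe", "LEFT", 0.9),
    Rule("laterality", "bilateral", "BOTH", 0.6),
]


def predict_codes(concepts, rules=RULES):
    """Return, for each coding item, the code with the highest summed weight."""
    scores = defaultdict(lambda: defaultdict(float))
    for rule in rules:
        if rule.concept in concepts:
            scores[rule.item][rule.code] += rule.weight
    # Items with no fired rule are omitted, leaving them for manual coding.
    return {item: max(votes, key=votes.get) for item, votes in scores.items()}


if __name__ == "__main__":
    # Concepts would come from the NLP stage; here they are hard-coded.
    print(predict_codes({"right upper lobe", "bilateral"}))
```

Leaving items unpredicted when no rule fires is a design choice in this sketch: it keeps the registrar in the loop for cases the rules do not cover, which matches the assistive and auditing role the abstract describes.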

Keywords