Department of Electrical Engineering, Intelligent System Laboratory, College of Electrical Engineering and Computer Science, National Kaohsiung University of Science and Technology, Kaohsiung, Taiwan
Yi-Hsin Yang
National Institute of Cancer Research, National Health Research Institutes, Tainan, Taiwan
Department of Radiation Oncology, China Medical University Hospital, China Medical University, Taichung, Taiwan
Yan-Jie Lin
Institute of Population Health Sciences, National Health Research Institutes, Miaoli, Taiwan
Pin-Jou Lu
Department of Electrical Engineering, Intelligent System Laboratory, College of Electrical Engineering and Computer Science, National Kaohsiung University of Science and Technology, Kaohsiung, Taiwan
Chung-Yang Wu
Department of Electrical Engineering, Intelligent System Laboratory, College of Electrical Engineering and Computer Science, National Kaohsiung University of Science and Technology, Kaohsiung, Taiwan
Yu-Cheng Chang
Department of Electrical Engineering, Intelligent System Laboratory, College of Electrical Engineering and Computer Science, National Kaohsiung University of Science and Technology, Kaohsiung, Taiwan
You-Qian Lee
Department of Electrical Engineering, Intelligent System Laboratory, College of Electrical Engineering and Computer Science, National Kaohsiung University of Science and Technology, Kaohsiung, Taiwan
You-Chen Zhang
Department of Electrical Engineering, Intelligent System Laboratory, College of Electrical Engineering and Computer Science, National Kaohsiung University of Science and Technology, Kaohsiung, Taiwan
Yuan-Chi Hsu
Department of Electrical Engineering, Intelligent System Laboratory, College of Electrical Engineering and Computer Science, National Kaohsiung University of Science and Technology, Kaohsiung, Taiwan
Han-Hsiang Wu
Department of Electrical Engineering, Intelligent System Laboratory, College of Electrical Engineering and Computer Science, National Kaohsiung University of Science and Technology, Kaohsiung, Taiwan
Cheng-Rong Ke
Department of Electrical Engineering, Intelligent System Laboratory, College of Electrical Engineering and Computer Science, National Kaohsiung University of Science and Technology, Kaohsiung, Taiwan
Chih-Jen Huang
Cancer Center, Kaohsiung Medical University Hospital, Kaohsiung Medical University, Kaohsiung, Taiwan
Yu-Tsang Wang
Department of Medical Research, Division of Medical Statistics and Bioinformatics, Kaohsiung Medical University Hospital, Kaohsiung Medical University, Kaohsiung, Taiwan
Sheau-Fang Yang
Department of Pathology, Kaohsiung Medical University Hospital, Kaohsiung Medical University, Kaohsiung, Taiwan
Kuan-Chung Hsiao
National Institute of Cancer Research, National Health Research Institutes, Tainan, Taiwan
Cancer registries are critical databases for cancer research, and maintaining them requires diverse domain knowledge and labor-intensive data curation. To support timely, high-quality curation, we developed a hybrid neural symbolic system for cancer registry coding. Unlike previous work, which mainly relied on datasets collected from a single hospital or formulated the task as a text classification problem, we collaborated with two medical centers in Taiwan to compile a cross-hospital corpus. We applied neural networks to extract the cancer registry variables described in unstructured pathology reports and used expert systems to generate the registry coding. We conducted experiments to study the feasibility of the proposed hybrid system for cancer registry coding and compared its performance with state-of-the-art non-hybrid approaches. Furthermore, we performed cross-hospital experiments to study the advantages and limitations of transfer learning for processing reports from different sources. The experimental results demonstrated that the proposed hybrid neural symbolic system is a robust approach that works well across hospitals and outperforms classification-based baselines by 0.13 to 0.27 in F-score. Compared with the baseline models, the F-scores of the proposed approaches were noticeably higher when fewer training instances were used. All methods benefited from transferring parameters learned from the source dataset, but the results suggest that a better strategy is to transfer the learned knowledge through the concept recognition task and then apply the symbolic expert system to produce the cancer registry coding.
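The two-stage design described above (concept recognition followed by a symbolic expert system) can be illustrated with a minimal sketch. This is not the system evaluated in this work: the regular expressions merely stand in for the neural concept recognizer, and the concept labels, the rule table, the placeholder code values, and the function names `extract_concepts` and `code_registry` are all assumptions introduced for illustration.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass
class Concept:
    label: str  # hypothetical concept type, e.g. "HISTOLOGY" or "GRADE"
    text: str   # normalized (lower-cased) mention found in the report


# Stage 1 stand-in: regex patterns replace the neural concept recognizer.
# A real system would use a trained sequence-labeling model here.
PATTERNS = {
    "HISTOLOGY": r"adenocarcinoma|squamous cell carcinoma",
    "GRADE": r"(?:well|moderately|poorly) differentiated",
}

# Stage 2: a tiny symbolic rule table mapping recognized concepts to
# registry values. The code values are illustrative placeholders only.
GRADE_RULES = {
    "well differentiated": "1",
    "moderately differentiated": "2",
    "poorly differentiated": "3",
}
UNKNOWN = "9"  # placeholder used when no rule fires


def extract_concepts(report: str) -> list[Concept]:
    """Return every concept mention matched in the report text."""
    concepts = []
    for label, pattern in PATTERNS.items():
        for match in re.finditer(pattern, report, flags=re.IGNORECASE):
            concepts.append(Concept(label, match.group(0).lower()))
    return concepts


def code_registry(concepts: list[Concept]) -> dict[str, str]:
    """Apply the rule table to the extracted concepts to fill registry fields."""
    coding = {"histology": UNKNOWN, "grade": UNKNOWN}
    for concept in concepts:
        if concept.label == "HISTOLOGY":
            coding["histology"] = concept.text
        elif concept.label == "GRADE" and concept.text in GRADE_RULES:
            coding["grade"] = GRADE_RULES[concept.text]
    return coding


if __name__ == "__main__":
    report = "Colon, biopsy: Adenocarcinoma, moderately differentiated."
    print(code_registry(extract_concepts(report)))
```

In this arrangement only the first stage depends on the wording of a particular hospital's reports, while the rule table in the second stage is shared. That separation is one plausible reading of why transferring knowledge through the concept recognition task is reported to work better across hospitals.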