PLoS ONE (Jan 2018)

Automatic inference model construction for computer-aided diagnosis of lung nodule: Explanation adequacy, inference accuracy, and experts' knowledge.

  • Masami Kawagishi,
  • Takeshi Kubo,
  • Ryo Sakamoto,
  • Masahiro Yakami,
  • Koji Fujimoto,
  • Gakuto Aoyama,
  • Yutaka Emoto,
  • Hiroyuki Sekiguchi,
  • Koji Sakai,
  • Yoshio Iizuka,
  • Mizuho Nishio,
  • Hiroyuki Yamamoto,
  • Kaori Togashi

DOI
https://doi.org/10.1371/journal.pone.0207661
Journal volume & issue
Vol. 13, no. 11
p. e0207661

Abstract

We describe the development of an inference model for computer-aided diagnosis of lung nodules that can provide valid reasons for its inferences, thereby improving the interpretability and performance of the system. The models were constructed automatically by a method that considers both explanation adequacy and inference accuracy. In addition, we evaluated the usefulness of experts' (radiologists') prior knowledge during model construction. In total, 179 patients with lung nodules were included and divided into 79 training cases and 100 test cases. F-measure and accuracy were used to assess explanation adequacy and inference accuracy, respectively. For the F-measure, reasons were defined as proper subsets of Evidence that had a strong influence on the inference result. The inference models were constructed automatically using Bayesian networks and Markov chain Monte Carlo methods, and only models that met predefined criteria were selected. During model construction, we examined the effect of including radiologists' knowledge in the initial Bayesian network models. The F-measure, accuracy, and evaluation metric of the best models were 0.411, 72.0%, and 0.566, respectively, with prior knowledge, and 0.274, 65.0%, and 0.462, respectively, without prior knowledge. The best models with prior knowledge were then evaluated subjectively and independently by two radiologists on a 5-point scale, with 5, 3, and 1 representing beneficial, appropriate, and detrimental, respectively. The two radiologists' average scores on the test data were 3.97 and 3.76, indicating that the proposed computer-aided diagnosis system was acceptable to them. In conclusion, the proposed method, which incorporates radiologists' knowledge, could help reduce radiologists' distrust of computer-aided diagnosis and improve its performance.
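
To make the reported figures concrete, the following is a minimal Python sketch of how an F-measure over sets of reasons and a classification accuracy could be computed. It is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, and the abstract does not define the combined evaluation metric. The reported values happen to be consistent with the arithmetic mean of F-measure and accuracy, so that form is assumed here.

```python
def f_measure(predicted_reasons, reference_reasons):
    """F-measure between the reasons a model offers and the reference reasons."""
    predicted = set(predicted_reasons)
    reference = set(reference_reasons)
    overlap = len(predicted & reference)
    if overlap == 0:
        return 0.0
    precision = overlap / len(predicted)
    recall = overlap / len(reference)
    return 2 * precision * recall / (precision + recall)


def accuracy(predicted_labels, true_labels):
    """Fraction of cases whose inferred diagnosis matches the true diagnosis."""
    correct = sum(p == t for p, t in zip(predicted_labels, true_labels))
    return correct / len(true_labels)


def combined_metric(f, acc):
    """Assumption: the evaluation metric is the mean of F-measure and accuracy."""
    return (f + acc) / 2


# Values reported in the abstract (accuracy expressed as a fraction).
with_prior = combined_metric(0.411, 0.720)
without_prior = combined_metric(0.274, 0.650)

# Toy inputs (hypothetical finding names) to exercise the helpers.
toy_f = f_measure({"spiculation", "size"}, {"size", "calcification"})
toy_acc = accuracy([1, 0, 1], [1, 1, 1])
```

Under this assumption, `with_prior` and `without_prior` agree with the reported 0.566 and 0.462 to within rounding.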