Digital Diagnostics (Jan 2023)

Possibilities and limitations of using machine text-processing tools in Russian radiology reports

  • Daria Yu. Kokina,
  • Victor A. Gombolevskiy,
  • Kirill M. Arzamasov,
  • Anna E. Andreychenko,
  • Sergey P. Morozov

DOI
https://doi.org/10.17816/DD101099
Journal volume & issue
Vol. 3, no. 4
pp. 374 – 383

Abstract

BACKGROUND: In radiology, important information is contained not only in medical images but also in the accompanying text reports written by radiologists. Identifying reports that contain specific data, and extracting those data, is useful primarily for clinical tasks; however, given the large volume of such reports, algorithms for automated analysis are needed.

AIM: To assess the possibilities and limitations of using a tool for machine processing of radiology reports to search for pathological findings.

MATERIALS AND METHODS: To create an algorithm for automatic analysis of radiology reports, use cases were selected from those included in the 2020 experiment on the use of innovative computer vision technologies for the analysis of medical images. Mammography, chest X-ray, chest computed tomography (CT), and chest low-dose CT (LDCT) studies performed in Moscow were among the selected use cases. A dictionary of keywords was compiled. After the developed tool automatically labeled the reports, a radiologist assessed the results. The number of reports reviewed by the radiologist for training and validation of the algorithms was 977 for mammography, 4,804 for all chest X-ray studies, 4,074 for chest CT, and 398 for chest LDCT. For final testing of the developed algorithms, additional test datasets were labeled: 1,032 studies for mammography, 544 for chest X-ray, 5,000 for chest CT, and 1,082 for chest LDCT.

RESULTS: The best results were achieved in the search for viral pneumonia in chest CT reports (accuracy 0.996, sensitivity 0.998, specificity 0.989) and for breast cancer in mammography reports (accuracy 1.0, sensitivity 1.0, specificity 1.0). In the search for signs of lung cancer, the metrics were accuracy 0.895, sensitivity 0.829, and specificity 0.936; in the search for pathological changes in the chest organs in radiography and fluorography reports, they were accuracy 0.912, sensitivity 1.000, and specificity 0.844.

CONCLUSIONS: Machine methods can automatically classify mammography reports and chest CT reports with viral pneumonia with high accuracy. For signs of lung cancer in chest CT and LDCT and for pathological findings in chest X-ray, the achieved accuracy is sufficient for automatically comparing the conclusions of physicians with those of artificial intelligence models.
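The abstract does not describe the internals of the tool, so the following is only a minimal sketch of how keyword-dictionary labeling and the three reported metrics could fit together. It is not the authors' implementation. The keyword list, the negation cues, the function names `mark_report` and `metrics`, and the use of English text are all assumptions made for illustration; the study worked with Russian reports and its own dictionary.

```python
import math
import re
from typing import Dict, Iterable, Sequence

# Hypothetical keyword dictionary (illustrative only; not the study's dictionary).
KEYWORDS: Dict[str, Sequence[str]] = {
    "viral_pneumonia": ["ground-glass", "viral pneumonia"],
}

# Hypothetical negation cues, matched as whole words before a keyword.
NEGATION = re.compile(r"\b(no|without)\b")


def mark_report(text: str, keywords: Iterable[str]) -> bool:
    """Return True if a keyword occurs in some sentence with no negation cue before it."""
    keywords = list(keywords)
    for sentence in re.split(r"[.!?]\s*", text.lower()):
        for keyword in keywords:
            position = sentence.find(keyword)
            if position == -1:
                continue
            # Only the part of the sentence before the keyword is checked for negation.
            if not NEGATION.search(sentence[:position]):
                return True
    return False


def metrics(predicted: Sequence[bool], reference: Sequence[bool]) -> Dict[str, float]:
    """Compute accuracy, sensitivity and specificity against reference labels."""
    tp = tn = fp = fn = 0
    for p, r in zip(predicted, reference):
        if p and r:
            tp += 1
        elif not p and not r:
            tn += 1
        elif p and not r:
            fp += 1
        else:
            fn += 1
    total = tp + tn + fp + fn

    def ratio(numerator: int, denominator: int) -> float:
        # Undefined ratios (empty class) are reported as NaN rather than raising.
        return numerator / denominator if denominator else math.nan

    return {
        "accuracy": ratio(tp + tn, total),
        "sensitivity": ratio(tp, tp + fn),   # share of positive reports found
        "specificity": ratio(tn, tn + fp),   # share of negative reports rejected
    }


if __name__ == "__main__":
    # Toy reports written for this sketch; they are not data from the study.
    reports = [
        "Bilateral ground-glass opacities.",
        "No ground-glass opacities. Lungs are clear.",
    ]
    radiologist_labels = [True, False]
    tool_labels = [mark_report(r, KEYWORDS["viral_pneumonia"]) for r in reports]
    print(metrics(tool_labels, radiologist_labels))
```

In this sketch the radiologist's labels serve as the reference standard, mirroring the assessment step described in the methods. Sensitivity and specificity are reported separately because accuracy alone can look high when one class dominates the dataset.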

Keywords