Diagnostics (Oct 2021)

Interpretation of Thoracic Radiography Shows Large Discrepancies Depending on the Qualification of the Physician—Quantitative Evaluation of Interobserver Agreement in a Representative Emergency Department Scenario

  • Jan Rudolph,
  • Nicola Fink,
  • Julien Dinkel,
  • Vanessa Koliogiannis,
  • Vincent Schwarze,
  • Sophia Goller,
  • Bernd Erber,
  • Thomas Geyer,
  • Boj Friedrich Hoppe,
  • Maximilian Fischer,
  • Najib Ben Khaled,
  • Maximilian Jörgens,
  • Jens Ricke,
  • Johannes Rueckel,
  • Bastian Oliver Sabel

DOI
https://doi.org/10.3390/diagnostics11101868
Journal volume & issue
Vol. 11, no. 10
p. 1868

Abstract

Read online

(1) Background: Chest radiography (CXR) is still a key diagnostic component in the emergency department (ED). Correct interpretation is essential since some pathologies require urgent treatment. This study quantifies potential discrepancies in CXR analysis between radiologists and non-radiology physicians in training with ED experience. (2) Methods: Nine differently qualified physicians (three board-certified radiologists [BCR], three radiology residents [RR], and three non-radiology residents involved in ED [NRR]) evaluated a series of 563 posterior-anterior CXR images by quantifying suspicion for four relevant pathologies: pleural effusion, pneumothorax, pneumonia, and pulmonary nodules. Reading results were noted separately for each hemithorax on a Likert scale (0–4; 0: no suspicion of pathology, 4: safe existence of pathology) adding up to a total of 40,536 reported pathology suspicions. Interrater reliability/correlation and Kruskal–Wallis tests were performed for statistical analysis. (3) Results: While interrater reliability was good among radiologists, major discrepancies between radiologists’ and non-radiologists’ reading results could be observed in all pathologies. Highest overall interrater agreement was found for pneumothorax detection and lowest agreement in raising suspicion for malignancy suspicious nodules. Pleural effusion and pneumonia were often suspected with indifferent choices (1–3). In terms of pneumothorax detection, all readers mainly decided for a clear option (0 or 4). Interrater reliability was usually higher when evaluating the right hemithorax (all pathologies except pneumothorax). (4) Conclusions: Quantified CXR interrater reliability analysis displays a general uncertainty and strongly depends on medical training. NRR can benefit from radiology reporting in terms of time efficiency and diagnostic accuracy. CXR evaluation of long-time trained ED specialists has not been tested.

Keywords