Radiation Oncology (Jun 2021)
Interobserver variability in organ at risk delineation in head and neck cancer
Abstract
Background: In radiotherapy, inaccurate organ at risk (OAR) delineation can affect both treatment plan optimisation and treatment plan evaluation. Brouwer et al. showed significant interobserver variability (IOV) in OAR delineation in head and neck cancer (HNC) and published international consensus guidelines (ICG) for OAR delineation in 2015. The aim of our study was to evaluate IOV in the presence of these guidelines.

Methods: HNC radiation oncologists (RO) from each Belgian radiotherapy centre were invited to complete a survey and submit contours for 5 HNC cases. Reference contours (OARref) were obtained with a clinically validated artificial intelligence tool trained using the ICG. Dice similarity coefficients (DSC), mean surface distances (MSD) and 95% Hausdorff distances (HD95) were used for comparison.

Results: Fourteen of twenty-two RO (64%) completed the survey and submitted delineations. Thirteen (93%) confirmed the use of delineation guidelines, of whom six (43% of respondents) used the ICG. The OARs whose delineations agreed best with the OARref, reported as median (range), were the mandible [DSC 0.9 (0.8–0.9); MSD 1.1 mm (0.8–8.3); HD95 3.4 mm (1.5–38.7)], brainstem [DSC 0.9 (0.6–0.9); MSD 1.5 mm (1.1–4.0); HD95 4.0 mm (2.3–15.0)], submandibular glands [DSC 0.8 (0.5–0.9); MSD 1.2 mm (0.9–2.5); HD95 3.1 mm (1.8–12.2)] and parotids [DSC 0.9 (0.6–0.9); MSD 1.9 mm (1.2–4.2); HD95 5.1 mm (3.1–19.2)]. The oral cavity, cochleas, pharyngeal constrictor muscles (PCMs), supraglottic larynx and glottic area showed more variation. RO who used the consensus guidelines showed significantly less IOV (p = 0.008).

Conclusions: Although ICG for delineation of OARs in HNC exist, they were implemented by only about half of the RO participating in this study, which partly explains the delineation variability. However, this study also shows that guidelines alone do not suffice to eliminate IOV and that more effort is needed to further standardise treatment, for example with artificial intelligence.
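The three agreement metrics named in the Methods (DSC, MSD and HD95) can be illustrated with a minimal Python sketch. The abstract does not give the exact definitions or software used, so everything here is an assumption for illustration: contours are treated as binary voxel masks, MSD is taken as the symmetric mean of nearest-surface distances, HD95 as the larger of the two directed 95th-percentile distances, and the function names (`dice`, `surface_points`, `surface_metrics`) are hypothetical. The brute-force distance computation is only suitable for small masks.

```python
import numpy as np


def dice(a, b):
    """Dice similarity coefficient between two binary masks."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        return 1.0  # assumption: two empty masks count as perfect agreement
    return 2.0 * np.logical_and(a, b).sum() / total


def surface_points(mask, spacing):
    """Physical coordinates of mask voxels with at least one face-neighbour outside the mask."""
    mask = np.asarray(mask, dtype=bool)
    padded = np.pad(mask, 1, constant_values=False)
    interior = padded.copy()
    for axis in range(mask.ndim):
        for shift in (-1, 1):
            # a voxel stays "interior" only if every face-neighbour is also in the mask
            interior &= np.roll(padded, shift, axis=axis)
    border = padded & ~interior
    core = tuple(slice(1, -1) for _ in range(mask.ndim))  # strip the padding again
    return np.argwhere(border[core]) * np.asarray(spacing, dtype=float)


def surface_metrics(a, b, spacing):
    """Return (MSD, HD95) in the units of `spacing`, under the definitions assumed above."""
    pa = surface_points(a, spacing)
    pb = surface_points(b, spacing)
    if len(pa) == 0 or len(pb) == 0:
        raise ValueError("both masks must be non-empty")
    # brute-force pairwise distances between the two surfaces
    d = np.linalg.norm(pa[:, None, :] - pb[None, :, :], axis=2)
    a_to_b = d.min(axis=1)  # each surface point of a to the nearest surface point of b
    b_to_a = d.min(axis=0)  # each surface point of b to the nearest surface point of a
    msd = np.concatenate([a_to_b, b_to_a]).mean()
    hd95 = max(np.percentile(a_to_b, 95), np.percentile(b_to_a, 95))
    return msd, hd95


# Toy example: a 4x4 square and the same square shifted by one pixel (1 mm spacing).
observer = np.zeros((10, 10), dtype=bool)
observer[2:6, 2:6] = True
reference = np.zeros((10, 10), dtype=bool)
reference[2:6, 3:7] = True

dsc = dice(observer, reference)
msd, hd95 = surface_metrics(observer, reference, spacing=(1.0, 1.0))
```

DSC measures volume overlap and is therefore forgiving for large structures, whereas MSD and HD95 are expressed in millimetres and are more sensitive to local boundary disagreements; this is one reason a study would report all three side by side.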
Keywords