Journal of Medical Physics (Jan 2023)
Assessment of optimizers and their performance in autosegmenting lung tumors
Abstract
Purpose: Optimizers are widely used across many domains to maximize or minimize objective functions. In deep learning, they minimize the loss function and thereby improve the model's performance. This study evaluates the accuracy of different optimizers for autosegmentation of non-small cell lung cancer (NSCLC) target volumes on thoracic computed tomography images used in oncology.
Materials and Methods: Data from 112 patients were used to evaluate the optimizers: 92 patients from “The Cancer Imaging Archive” (TCIA) and 20 local clinical patients. The gross tumor volume was selected as the foreground mask for training and testing the models. Using nnU-Net, 57 of the 92 TCIA patients were used for training and validation and the remaining 35 for testing. The performance of the final model was further evaluated on the 20 local clinical patient datasets. Six optimizers were investigated: AdaDelta, AdaGrad, Adam, NAdam, RMSprop, and stochastic gradient descent (SGD). Agreement between the predicted volume and the ground truth was assessed with the Dice similarity coefficient (DSC), Jaccard index, sensitivity, precision, Hausdorff distance (HD), 95th percentile Hausdorff distance (HD95), and average symmetric surface distance (ASSD).
Results: On the TCIA test data, the DSC values for AdaDelta, AdaGrad, Adam, NAdam, RMSprop, and SGD were 0.75, 0.84, 0.85, 0.84, 0.83, and 0.81, respectively. However, when the model trained on TCIA datasets was applied to the clinical datasets, the DSC, HD, HD95, and ASSD metrics showed a statistically significant decrease in performance compared to the TCIA test datasets, indicating the presence of image and/or mask heterogeneity between the data sources.
Conclusion: The choice of optimizer in deep learning is a critical factor that can significantly affect the performance of autosegmentation models. However, the behavior of optimizers may vary when they are applied to new clinical datasets, which can change the models' performance. Selecting the appropriate optimizer for a specific task is therefore essential to ensure optimal performance and generalizability of the model to different datasets.
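As an illustration of the overlap metrics named in the Materials and Methods (DSC, Jaccard index, sensitivity, and precision), the following is a minimal sketch using their standard voxel-count definitions. It is not the evaluation code used in the study; the function name `overlap_metrics`, the toy masks, and the choice to return NaN for an empty denominator are assumptions made for this example. The distance-based metrics (HD, HD95, ASSD) are not covered here.

```python
import numpy as np


def overlap_metrics(pred, truth):
    """Return DSC, Jaccard, sensitivity, and precision for two binary masks.

    Hypothetical helper for illustration; masks must have the same shape.
    """
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)

    tp = int(np.count_nonzero(pred & truth))    # voxels in both masks
    fp = int(np.count_nonzero(pred & ~truth))   # predicted but not in ground truth
    fn = int(np.count_nonzero(~pred & truth))   # in ground truth but missed

    def ratio(num, den):
        # Assumption: report NaN when a metric is undefined (empty denominator).
        return num / den if den else float("nan")

    return {
        "dsc": ratio(2 * tp, 2 * tp + fp + fn),
        "jaccard": ratio(tp, tp + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "precision": ratio(tp, tp + fp),
    }


# Toy 2D example: a 4-voxel ground truth fully covered by a 6-voxel prediction.
truth = np.zeros((4, 4), dtype=bool)
truth[1:3, 1:3] = True
pred = np.zeros((4, 4), dtype=bool)
pred[1:3, 1:4] = True

metrics = overlap_metrics(pred, truth)
print(metrics)
```

In this toy case the prediction over-segments, so sensitivity is 1.0 while precision and DSC fall below 1; reporting both overlap and error-type metrics, as the study does, separates under-segmentation from over-segmentation.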
Keywords