Scientific Reports (Mar 2022)
Generalized ComBat harmonization methods for radiomic features with multi-modal distributions and multiple batch effects
Abstract
Abstract Radiomic features have a wide range of clinical applications, but variability due to image acquisition factors can affect their performance. The harmonization tool ComBat is a promising solution but is limited by inability to harmonize multimodal distributions, unknown imaging parameters, and multiple imaging parameters. In this study, we propose two methods for addressing these limitations. We propose a sequential method that allows for harmonization of radiomic features by multiple imaging parameters (Nested ComBat). We also employ a Gaussian Mixture Model (GMM)-based method (GMM ComBat) where scans are split into groupings based on the shape of the distribution used for harmonization as a batch effect and subsequent harmonization by a known imaging parameter. These two methods were evaluated on features extracted with CapTK and PyRadiomics from two public lung computed tomography datasets. We found that Nested ComBat exhibited similar performance to standard ComBat in reducing the percentage of features with statistically significant differences in distribution attributable to imaging parameters. GMM ComBat improved harmonization performance over standard ComBat (− 11%, − 10% for Lung3/CAPTK, Lung3/PyRadiomics harmonizing by kernel resolution). Features harmonized with a variant of the Nested method and the GMM split method demonstrated similar c-statistics and Kaplan–Meier curves when used in survival analyses.