IEEE Access (Jan 2024)
MDDCMA: A Distributed Image Fusion Framework Based on Multiscale Dense Dilated Convolution and Coordinate Mean Attention
Abstract
In infrared and visible image fusion, both feature extraction and the fusion strategy strongly influence the quality of the result. However, prevailing fusion methods often rely on hand-crafted, non-learnable fusion rules and fail to exploit contextual information adequately. To address these issues, this paper introduces a distributed network architecture based on an attention mechanism and dense dilated convolution that fuses data across three channels. The network employs a distributed fusion framework that fully reuses the fusion output of the previous step, effectively exploiting the target information in infrared images and the texture information in visible images. First, two channels gather rich context from the source images through a dense dilated convolution module with multiscale channel attention. Next, a fusion strategy based on coordinate mean attention merges the results of the two channels. The fused features and the preceding fusion result are then fed into the fusion channel, minimizing the loss of target and texture information. In addition, we incorporate an edge correction block that refines the edge details of the fusion results and effectively suppresses noise. The proposed method achieves strong fusion performance, and extensive ablation experiments validate the effectiveness of the proposed components. Both subjective qualitative and objective quantitative comparisons on the public RoadScene, TNO, and MSRS datasets indicate that the visual quality and evaluation metrics of our fused images are comparable to those of state-of-the-art image fusion methods.
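The abstract names two building blocks without giving their internals. As a rough illustration only, the PyTorch sketch below shows one plausible reading of each: a densely connected stack of dilated convolutions (later branches see all earlier feature maps at growing dilation rates) and a coordinate-attention-style gate that mean-pools along each spatial axis. All class names, layer widths, and dilation rates here are illustrative assumptions, not the authors' MDDCMA implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DenseDilatedBlock(nn.Module):
    """Hypothetical multiscale dense dilated convolution block:
    3x3 convs with increasing dilation, densely concatenated so each
    branch receives all preceding feature maps (wider receptive field
    without losing resolution)."""
    def __init__(self, channels, dilations=(1, 2, 3)):
        super().__init__()
        self.branches = nn.ModuleList()
        in_ch = channels
        for d in dilations:
            self.branches.append(
                nn.Conv2d(in_ch, channels, 3, padding=d, dilation=d))
            in_ch += channels  # dense connectivity grows the input width
        self.fuse = nn.Conv2d(in_ch, channels, 1)  # squeeze back down

    def forward(self, x):
        feats = [x]
        for conv in self.branches:
            feats.append(F.relu(conv(torch.cat(feats, dim=1))))
        return self.fuse(torch.cat(feats, dim=1))

class CoordinateMeanAttention(nn.Module):
    """Hypothetical coordinate mean attention: mean-pool along each
    spatial axis, derive per-row and per-column gates through a shared
    1x1 conv, then rescale the input feature map."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        mid = max(channels // reduction, 4)
        self.shared = nn.Conv2d(channels, mid, 1)
        self.to_h = nn.Conv2d(mid, channels, 1)
        self.to_w = nn.Conv2d(mid, channels, 1)

    def forward(self, x):
        n, c, h, w = x.shape
        pool_h = x.mean(dim=3, keepdim=True)                 # (n, c, h, 1)
        pool_w = x.mean(dim=2, keepdim=True)                 # (n, c, 1, w)
        y = torch.cat([pool_h, pool_w.transpose(2, 3)], dim=2)
        y = F.relu(self.shared(y))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = torch.sigmoid(self.to_h(y_h))                  # row gate
        a_w = torch.sigmoid(self.to_w(y_w.transpose(2, 3)))  # column gate
        return x * a_h * a_w

# Illustrative use: gate the sum of infrared- and visible-branch features.
if __name__ == "__main__":
    ir = torch.randn(1, 32, 64, 64)   # infrared-branch features (assumed shape)
    vis = torch.randn(1, 32, 64, 64)  # visible-branch features (assumed shape)
    block = DenseDilatedBlock(32)
    cma = CoordinateMeanAttention(32)
    fused = cma(block(ir) + block(vis))
    print(fused.shape)  # torch.Size([1, 32, 64, 64])
```

The summation-then-gating step is only one possible fusion rule; the paper's coordinate-mean-attention strategy may combine the two channels differently.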
Keywords