Mathematics (May 2025)
AIF: Infrared and Visible Image Fusion Based on Ascending–Descending Mechanism and Illumination Perception Subnetwork
Abstract
The purpose of infrared and visible image fusion is to generate a composite image that contains both the thermal radiation profile information of the infrared image and the texture details of the visible image. Such a composite image can be used to detect targets under various lighting conditions while offering high spatial resolution of the scene. However, existing image fusion algorithms rarely take the lighting factor into account in the modeling process. This study presents a novel image fusion approach (AIF) that adaptively fuses infrared and visible images under various lighting conditions. Specifically, features are extracted from the infrared image and the visible image separately by the AdC feature extractor, and the two are adaptively fused under the guidance of the illumination perception subnetwork. The image fusion model is trained in an unsupervised manner with a customized loss function. The AdC feature extractor adopts an ascending–descending feature extraction mechanism to organize convolutional layers and combines these convolutional layers with cross-modal interactive differential modules to effectively extract hierarchical complementary and differential information. The illumination perception subnetwork estimates the scene lighting condition from the visible image, which determines the contribution weights of the visible image and the infrared image in the composite image. The customized loss function consists of illumination loss, gradient loss, and intensity loss. It is more targeted and effectively improves the fusion of visible and infrared images under different lighting conditions. Ablation experiments demonstrate the effectiveness of the loss function. We compare our method with nine other methods on public datasets, including four traditional methods and five deep-learning-based methods. Qualitative and quantitative experiments show that our method performs better on metrics such as standard deviation (SD), and the fused image has more prominent contour information and richer detail information.
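To make the abstract's loss design concrete, the snippet below is a minimal PyTorch sketch of how an illumination-weighted intensity term and a gradient term might be combined; it is an illustration under stated assumptions, not the paper's actual implementation. The function names (fusion_loss, sobel_gradient), the Sobel-based gradient operator, the max-aggregation of source gradients, and the weight lambda_grad are all hypothetical choices introduced here for clarity.

```python
import torch
import torch.nn.functional as F

# Sobel kernels for gradient extraction (used by the gradient term).
_SOBEL_X = torch.tensor([[-1., 0., 1.],
                         [-2., 0., 2.],
                         [-1., 0., 1.]]).view(1, 1, 3, 3)
_SOBEL_Y = _SOBEL_X.transpose(2, 3)

def sobel_gradient(img):
    """Gradient magnitude of a single-channel image batch (N, 1, H, W)."""
    gx = F.conv2d(img, _SOBEL_X.to(img.device), padding=1)
    gy = F.conv2d(img, _SOBEL_Y.to(img.device), padding=1)
    return gx.abs() + gy.abs()

def fusion_loss(fused, ir, vis, w_day, lambda_grad=10.0):
    """Hypothetical combined loss: illumination-weighted intensity + gradient.

    fused, ir, vis : (N, 1, H, W) tensors (fused result, infrared, visible).
    w_day          : (N,) probability that the scene is well lit, assumed to be
                     predicted by the illumination perception subnetwork from vis.
    """
    w_day = w_day.view(-1, 1, 1, 1)
    w_night = 1.0 - w_day
    # Intensity term with illumination weighting: in daylight the fused image
    # should follow the visible image; in darkness, the infrared image.
    loss_int = (w_day * (fused - vis).abs()
                + w_night * (fused - ir).abs()).mean()
    # Gradient term: keep the stronger texture detail of the two sources.
    grad_target = torch.maximum(sobel_gradient(ir), sobel_gradient(vis))
    loss_grad = (sobel_gradient(fused) - grad_target).abs().mean()
    return loss_int + lambda_grad * loss_grad
```

In this reading, the illumination weight couples the subnetwork's output to the intensity term, which matches the abstract's statement that the estimated lighting condition determines the contribution weights of the two source images in the composite.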
Keywords