Applied Sciences (Mar 2024)
SatelliteRF: Accelerating 3D Reconstruction in Multi-View Satellite Images with Efficient Neural Radiance Fields
Abstract
In multi-view satellite photogrammetry, the neural radiance field (NeRF) method has received widespread attention for its continuous scene representation and realistic rendering. However, satellite radiance field methods based on NeRF inherit the slow training speed of the original NeRF, so scene reconstruction is inefficient. Training on a single scene usually takes 8–10 h or longer, which severely constrains the use and exploration of NeRF in satellite photogrammetry. To address these problems, we propose an efficient neural radiance field method called SatelliteRF, which aims to reconstruct the earth’s surface quickly and efficiently from multi-view satellite images. By introducing multi-resolution hash encoding, SatelliteRF greatly increases training speed while maintaining high reconstruction quality. This encoding allows for smaller multi-layer perceptron (MLP) networks, which reduces the computational cost of neural rendering and accelerates training. Furthermore, to overcome the illumination changes and transient objects encountered in multi-date satellite images, we adopt an improved irradiance model and learn a transient embedding for each image. This increases the model's adaptability to illumination variations and improves its ability to handle changing objects. We also introduce a loss function based on stochastic structural similarity (SSIM) that supplies structural information about the scene during training, which further improves the quality and detail of the reconstructed scene. Through extensive experiments on the DFC 2019 dataset, we demonstrate that SatelliteRF not only reduces the training time for the same region from 8–10 h to only 5–10 min but also achieves better rendering and reconstruction quality.
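To make the multi-resolution hash encoding mentioned above concrete, the following is a minimal NumPy sketch of the general technique in its commonly used form (a spatial hash of grid corners plus trilinear interpolation at several resolutions). It is an illustration under stated assumptions, not the SatelliteRF implementation: the function names `hash_corners` and `encode`, the resolutions `n_min` and `n_max`, the table size, and the feature width are all hypothetical choices that the abstract does not specify.

```python
import numpy as np

# Large primes often used for spatial hashing; treated here as an assumption.
PRIMES = np.array([1, 2654435761, 805459861], dtype=np.uint64)


def hash_corners(corners, table_size):
    """Map integer grid corners of shape (..., 3) to indices in [0, table_size)."""
    c = corners.astype(np.uint64) * PRIMES
    h = c[..., 0] ^ c[..., 1] ^ c[..., 2]
    return (h % np.uint64(table_size)).astype(np.int64)


def encode(x, tables, n_min=16, n_max=512):
    """Encode points x of shape (N, 3) in [0, 1]^3.

    tables has shape (L, T, F): L resolution levels, T hash-table entries
    per level, F features per entry. Returns an array of shape (N, L * F).
    """
    num_levels, table_size, _ = tables.shape
    # Geometric growth factor between the coarsest and finest resolution.
    growth = np.exp((np.log(n_max) - np.log(n_min)) / max(num_levels - 1, 1))
    # The 8 corner offsets of a unit voxel, shape (8, 3).
    offsets = np.array(
        [[i, j, k] for i in (0, 1) for j in (0, 1) for k in (0, 1)]
    )

    features = []
    for level in range(num_levels):
        resolution = int(np.floor(n_min * growth**level))
        scaled = x * resolution
        base = np.floor(scaled).astype(np.int64)        # lower corner, (N, 3)
        frac = scaled - base                            # position in voxel, (N, 3)
        corners = base[:, None, :] + offsets[None]      # (N, 8, 3)
        idx = hash_corners(corners, table_size)         # (N, 8)
        # Trilinear weight of each corner: product of per-axis weights.
        weights = np.where(
            offsets[None] == 1, frac[:, None, :], 1.0 - frac[:, None, :]
        ).prod(axis=-1)                                 # (N, 8)
        corner_feats = tables[level][idx]               # (N, 8, F)
        features.append((weights[..., None] * corner_feats).sum(axis=1))
    return np.concatenate(features, axis=-1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    tables = rng.normal(scale=1e-2, size=(4, 2**10, 2))  # hypothetical sizes
    points = rng.random((5, 3))
    print(encode(points, tables).shape)
```

In a full model the table entries would be trainable parameters and the concatenated features would feed a small MLP. Because most of the representational capacity sits in the tables, the network can stay small, which is the source of the speed-up the abstract describes.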
Keywords