IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (Jan 2025)
NeRF-Based Large-Scale Urban True Digital Orthophoto Map Generation Method
Abstract
Urban scenes contain man-made ground objects with complex structures and large height differences, which makes generating large-scale true digital orthophoto maps (TDOMs) of cities challenging. Traditional TDOM generation methods rely on precise prior geometric information, and this reliance makes them difficult to adapt to such complex environments. In this work, we propose TDOM-NeRF, a novel neural radiance field (NeRF) based method for large-scale urban TDOM generation. Without relying on additional prior information, our method takes multiview unmanned aerial vehicle (UAV) images as input and represents the scene implicitly using hash grid features and multilayer perceptrons. By performing orthogonal volume rendering on the reconstructed scene, we effectively address the problem of uneven scales in the synthetic views when generating the TDOM. In training, we adopt a scene-block strategy that extends our method to TDOM generation for large-scale scenes while preserving high-fidelity scene reconstruction. In addition, we regularize the model during training to eliminate erroneous “floating objects” in the reconstruction, which makes our method applicable to low-overlap UAV datasets. Experimental results demonstrate that, compared with commercial photogrammetry software such as Metashape and Pix4DMapper, our method generates TDOMs with comparable geometric accuracy while eliminating issues such as ghosting, misalignment, and distortion in challenging areas. Furthermore, our method exhibits superior novel view synthesis performance compared with advanced NeRF methods such as Ortho-NeRF. Our pipeline offers a flexible scene-blocking configuration that allows performance to be balanced against efficiency, which makes it particularly suitable for large-scale scene reconstruction.
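The idea of orthogonal volume rendering can be illustrated with a minimal sketch. It is not the TDOM-NeRF implementation: the `field` callable is a hypothetical stand-in for the trained hash-grid and MLP model, `toy_field` is an invented test scene, and the uniform midpoint sampling and all function names are assumptions made for illustration. Every ray points straight down along (0, 0, -1) and only the ray origin changes from pixel to pixel, so each pixel covers the same ground footprint; colors are composited along each ray with the standard NeRF quadrature.

```python
import math


def orthographic_ray_origins(x_min, x_max, y_min, y_max, width, height):
    """Return a height x width grid of (x, y) ray origins at pixel centres.

    All rays share the direction (0, 0, -1), so the image scale is uniform.
    Row 0 corresponds to the largest y value.
    """
    dx = (x_max - x_min) / width
    dy = (y_max - y_min) / height
    return [[(x_min + (col + 0.5) * dx, y_max - (row + 0.5) * dy)
             for col in range(width)]
            for row in range(height)]


def composite(sigmas, colors, delta):
    """Standard emission-absorption quadrature along a single ray."""
    pixel = [0.0, 0.0, 0.0]
    transmittance = 1.0  # fraction of light not yet absorbed
    for sigma, color in zip(sigmas, colors):
        alpha = 1.0 - math.exp(-sigma * delta)
        weight = transmittance * alpha
        for k in range(3):
            pixel[k] += weight * color[k]
        transmittance *= 1.0 - alpha
    return pixel


def render_orthophoto(field, bounds, z_top, z_bottom, width, height,
                      n_samples=64):
    """Render a top-down image by marching parallel vertical rays."""
    x_min, x_max, y_min, y_max = bounds
    delta = (z_top - z_bottom) / n_samples
    image = []
    for row in orthographic_ray_origins(x_min, x_max, y_min, y_max,
                                        width, height):
        out_row = []
        for x, y in row:
            # Sample at interval midpoints, moving from z_top down to z_bottom.
            samples = [field(x, y, z_top - (i + 0.5) * delta)
                       for i in range(n_samples)]
            sigmas = [sigma for sigma, _ in samples]
            colors = [color for _, color in samples]
            out_row.append(composite(sigmas, colors, delta))
        image.append(out_row)
    return image


def toy_field(x, y, z):
    """Hypothetical scene: green ground at z = 0, 10 m red block where x > 0."""
    roof = 10.0 if x > 0.0 else 0.0
    sigma = 50.0 if z <= roof else 0.0
    color = (1.0, 0.0, 0.0) if x > 0.0 else (0.0, 1.0, 0.0)
    return sigma, color


image = render_orthophoto(toy_field, (-10.0, 10.0, -10.0, 10.0),
                          z_top=20.0, z_bottom=-1.0, width=8, height=4)
```

In this toy scene the block's roof and the ground are both rendered at their true planimetric positions, with no facade visible and no lean, which is the property that separates a true orthophoto from a perspective view.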
Keywords