RoadFormer: Pyramidal deformable vision transformers for road network extraction with remote sensing images

Xiaoling Jiang; Yinyin Li; Tao Jiang; Junhao Xie; Yilong Wu; Qianfeng Cai; Jinhui Jiang; Jiaming Xu; Hui Zhang

International Journal of Applied Earth Observations and Geoinformation (Sep 2022)

Affiliations

Xiaoling Jiang: Faculty of Computer and Software Engineering, Huaiyin Institute of Technology, Huaian, JS 223003, China; Corresponding author.
Yinyin Li: Faculty of Computer and Software Engineering, Huaiyin Institute of Technology, Huaian, JS 223003, China
Tao Jiang: Faculty of Computer and Software Engineering, Huaiyin Institute of Technology, Huaian, JS 223003, China
Junhao Xie: Faculty of Computer and Software Engineering, Huaiyin Institute of Technology, Huaian, JS 223003, China
Yilong Wu: Faculty of Computer and Software Engineering, Huaiyin Institute of Technology, Huaian, JS 223003, China
Qianfeng Cai: Faculty of Computer and Software Engineering, Huaiyin Institute of Technology, Huaian, JS 223003, China
Jinhui Jiang: Faculty of Computer and Software Engineering, Huaiyin Institute of Technology, Huaian, JS 223003, China
Jiaming Xu: School of Humanities, Huaiyin Institute of Technology, Huaian, JS 223003, China
Hui Zhang: Faculty of Computer and Software Engineering, Huaiyin Institute of Technology, Huaian, JS 223003, China

Journal volume & issue: Vol. 113, p. 102987

Abstract

Complete and accurate road network information is important evidence for many transportation applications, and regular, rapid updating of road network inventories is necessary to provide better services. Remote sensing images, with their overhead view of the Earth's surface, have been widely used to assist road network interpretation. However, accurately separating roads from the surrounding land cover in remote sensing images, while preserving connectivity and completeness, remains an open problem because of the highly challenging conditions under which roads appear. To address this, we develop a pyramidal deformable vision transformer architecture, termed RoadFormer, to extract road networks from remote sensing images. Specifically, a multi-context patch embedding scheme adopts a multi-range, multi-view context observation strategy to obtain higher-quality token embeddings. Furthermore, a deformable transformer architecture focuses on semantically relevant features in a sparse global manner, which effectively improves the quality and robustness of the feature representation. RoadFormer is evaluated on three large-scale road network extraction datasets. Quantitative assessments show that RoadFormer achieves an overall performance of 0.8886 in intersection over union (IoU) and 0.9407 in F1-score. In addition, comparative evaluations demonstrate the potential and superiority of RoadFormer for interpreting road sections in varying circumstances under diverse challenging image scenarios.
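
For readers unfamiliar with the two metrics quoted in the abstract, the following is a minimal sketch of how IoU and F1-score are commonly computed for a binary road mask. It is not taken from the paper: the function name `iou_and_f1`, the toy masks, and the convention of returning 1.0 when both masks are empty are assumptions made here for illustration, and the paper's own evaluation protocol (for example, how scores are averaged across datasets) may differ.

```python
from typing import Sequence, Tuple


def iou_and_f1(
    pred: Sequence[Sequence[int]],
    truth: Sequence[Sequence[int]],
) -> Tuple[float, float]:
    """Return (IoU, F1) for two binary masks of equal shape.

    Pixels with a truthy value are treated as "road".
    Hypothetical helper for illustration only.
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1  # road predicted, road in ground truth
            elif p and not t:
                fp += 1  # road predicted, background in ground truth
            elif t and not p:
                fn += 1  # road missed by the prediction

    union = tp + fp + fn
    # Assumed convention: two empty masks count as a perfect match.
    iou = tp / union if union else 1.0
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 1.0
    return iou, f1


# Toy 3x3 masks (assumed data, not from any dataset in the paper).
predicted = [
    [1, 1, 0],
    [0, 1, 0],
    [0, 0, 0],
]
ground_truth = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 1, 0],
]

iou, f1 = iou_and_f1(predicted, ground_truth)
print(f"IoU = {iou:.4f}, F1 = {f1:.4f}")
```

Both metrics are built from the same true-positive, false-positive, and false-negative counts, so for a single pair of masks F1 is never lower than IoU.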

Published in International Journal of Applied Earth Observations and Geoinformation

ISSN: 1569-8432 (Print); 1872-826X (Online)
Publisher: Elsevier
Country of publisher: Netherlands
LCC subjects: Geography. Anthropology. Recreation: Physical geography; Geography. Anthropology. Recreation: Environmental sciences
Website: https://www.journals.elsevier.com/international-journal-of-applied-earth-observation-and-geoinformation
