CA-STD: Scene Text Detection in Arbitrary Shape Based on Conditional Attention

Xing Wu; Yangyang Qi; Jun Song; Junfeng Yao; Yanzhong Wang; Yang Liu; Yuexing Han; Quan Qian

doi:10.3390/info13120565

Information (Dec 2022)

Affiliations

Xing Wu: School of Computer Engineering & Science, Shanghai University, Shanghai 200444, China
Yangyang Qi: School of Computer Engineering & Science, Shanghai University, Shanghai 200444, China
Jun Song: Department of Geography, Faculty of Social Sciences, Hong Kong Baptist University, Hong Kong 999077, China
Junfeng Yao: CSSC Seago System Technology Co., Ltd., Shanghai 200010, China
Yanzhong Wang: Shanghai Jianke Engineering Project Management Co., Ltd., Shanghai 200032, China
Yang Liu: School of Computer Engineering & Science, Shanghai University, Shanghai 200444, China
Yuexing Han: School of Computer Engineering & Science, Shanghai University, Shanghai 200444, China
Quan Qian: School of Computer Engineering & Science, Shanghai University, Shanghai 200444, China

DOI: https://doi.org/10.3390/info13120565
Journal volume & issue: Vol. 13, no. 12, p. 565

Abstract

Scene Text Detection (STD) is critical for obtaining textual information from natural scenes and serves applications such as automated driving and security surveillance. However, existing text detection methods fall short when text varies in curvature, orientation, and aspect ratio against complex backgrounds. To meet this challenge, we propose CA-STD, a method for detecting arbitrarily shaped text against complicated backgrounds. First, a Feature Refinement Module (FRM) is proposed to enhance feature representation. Second, a conditional attention mechanism is proposed, which both decouples the spatial and textual information in scene text images and models the relationships among different feature vectors. Finally, Contour Information Aggregation (CIA) is presented to enrich the feature representation of text contours by considering circular topology and semantic information simultaneously, yielding detection curves of arbitrary shape. CA-STD is evaluated in extensive experiments on different datasets. On TotalText, it outperforms state-of-the-art methods and achieves a precision of 82.9. On CTW-1500, it also performs better than state-of-the-art methods and achieves an F1 score of 83.8. Quantitative and qualitative analyses show that CA-STD can detect variably shaped scene text effectively.

Published in Information

ISSN: 2078-2489 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Technology (General): Industrial engineering. Management engineering: Information technology
Website: http://www.mdpi.com/journal/information/
