A Page Object Detection Method Based on Mask R-CNN

Canhui Xu; Cao Shi; Hengyue Bi; Chuanqi Liu; Yongfeng Yuan; Haoyan Guo; Yinong Chen

doi:10.1109/ACCESS.2021.3121152

IEEE Access (Jan 2021)

Affiliations

Canhui Xu: School of Information Science and Technology, Qingdao University of Science and Technology, Qingdao, China
Cao Shi: School of Information Science and Technology, Qingdao University of Science and Technology, Qingdao, China
Hengyue Bi: School of Information Science and Technology, Qingdao University of Science and Technology, Qingdao, China
Chuanqi Liu: School of Information Science and Technology, Qingdao University of Science and Technology, Qingdao, China
Yongfeng Yuan: School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
Haoyan Guo: School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
Yinong Chen: School of Computing, Informatics and Decision Systems Engineering, Arizona State University, Tempe, AZ, USA

DOI: https://doi.org/10.1109/ACCESS.2021.3121152
Journal volume & issue: Vol. 9, pp. 143448–143457

Abstract

Page object detection is crucial for document understanding, and the granularity at which objects are defined affects detection performance. This study considers block-level region objects within the inherent hierarchical structure of document images. Inspired by the Mask R-CNN (Region-based Convolutional Neural Network) method, an end-to-end network is proposed that performs object classification, bounding box identification, and page object mask generation simultaneously. To alleviate the shortage of training data, a LaTeX-based synthetic document generator is designed and used to produce a large number of synthetic page images for training. Compared with existing methods from page object detection competitions, the proposed method achieves better results, with an mAP of 0.917 on the detection of page objects such as tables, figures, and maths.
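The abstract does not describe how the LaTeX-based generator works, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that a synthetic page can be assembled from filler text plus randomly chosen table, figure, and equation blocks, and that the sequence of chosen classes serves as a label. The names `BLOCKS`, `FILLER`, and `make_page` are hypothetical and do not come from the paper.

```python
import random

# Hypothetical LaTeX templates, one per page object class named in the abstract.
# The templates actually used in the paper are not given there.
BLOCKS = {
    "table": (
        "\\begin{table}[h]\\centering\n"
        "\\begin{tabular}{|c|c|}\\hline a & b \\\\ \\hline 1 & 2 \\\\ \\hline\\end{tabular}\n"
        "\\caption{Synthetic table}\\end{table}"
    ),
    "figure": (
        "\\begin{figure}[h]\\centering\n"
        "\\rule{4cm}{3cm}\n"  # a filled box stands in for an image
        "\\caption{Synthetic figure}\\end{figure}"
    ),
    "maths": "\\begin{equation} E = mc^2 \\end{equation}",
}

# Placeholder body text placed before each page object (an assumption).
FILLER = "Lorem ipsum dolor sit amet, consectetur adipiscing elit."


def make_page(n_objects, seed):
    """Return (latex_source, labels) for one synthetic page.

    labels lists the class of each inserted object in document order.
    """
    rng = random.Random(seed)  # seeded so a page can be regenerated exactly
    labels = [rng.choice(sorted(BLOCKS)) for _ in range(n_objects)]
    body = []
    for label in labels:
        body.append(FILLER)
        body.append(BLOCKS[label])
    source = "\n\n".join(
        ["\\documentclass{article}", "\\begin{document}", *body, "\\end{document}"]
    )
    return source, labels


if __name__ == "__main__":
    source, labels = make_page(n_objects=3, seed=0)
    print(labels)
    print(source)
```

The sketch stops at LaTeX source and class labels. Producing training data for a Mask R-CNN style detector would additionally require compiling each page to an image and recovering the bounding box and mask of every object, which is outside the scope of this example.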

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
