HybridTabNet: Towards Better Table Detection in Scanned Document Images

Danish Nazir; Khurram Azeem Hashmi; Alain Pagani; Marcus Liwicki; Didier Stricker; Muhammad Zeshan Afzal

doi:10.3390/app11188396

Applied Sciences (Sep 2021)

Affiliations

Danish Nazir: Department of Computer Science, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany
Khurram Azeem Hashmi: Department of Computer Science, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany
Alain Pagani: German Research Institute for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany
Marcus Liwicki: Department of Computer Science, Luleå University of Technology, 971 87 Luleå, Sweden
Didier Stricker: Department of Computer Science, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany
Muhammad Zeshan Afzal: Department of Computer Science, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany

Journal volume & issue: Vol. 11, no. 18
p. 8396

Abstract

Tables in document images are an important entity since they contain crucial information. Therefore, accurate table detection can significantly improve information extraction from documents. In this work, we present a novel end-to-end trainable pipeline, HybridTabNet, for table detection in scanned document images. Our two-stage table detector uses the ResNeXt-101 backbone for feature extraction and Hybrid Task Cascade (HTC) to localize the tables in scanned document images. Moreover, we replace conventional convolutions with deformable convolutions in the backbone network. This enables our network to detect tables of arbitrary layouts precisely. We evaluate our approach comprehensively on ICDAR-13, ICDAR-17 POD, ICDAR-19, TableBank, Marmot, and UNLV. Apart from the ICDAR-17 POD dataset, our proposed HybridTabNet outperformed earlier state-of-the-art results without depending on pre- and post-processing steps. Furthermore, to investigate how the proposed method generalizes to unseen data, we conduct an exhaustive leave-one-out evaluation. In comparison to prior state-of-the-art results, our method reduced the relative error by 27.57% on ICDAR-2019-TrackA-Modern, 42.64% on TableBank (Latex), 41.33% on TableBank (Word), 55.73% on TableBank (Latex + Word), 10% on Marmot, and 9.67% on the UNLV dataset. The achieved results reflect the superior performance of the proposed method.
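
The abstract reports its gains as relative error reductions but does not spell out the formula. One common reading, assumed here, treats the error as one minus the evaluation score and reports the share of the prior error that the new result removes. The function name and the scores below are hypothetical illustrations, not values from the paper.

```python
def relative_error_reduction(prior_score: float, new_score: float) -> float:
    """Return the percentage of the prior error removed by the new score.

    Assumption: error is defined as 1 - score, with scores in [0, 1].
    """
    prior_error = 1.0 - prior_score
    if prior_error <= 0.0:
        raise ValueError("prior score leaves no error to reduce")
    new_error = 1.0 - new_score
    return 100.0 * (prior_error - new_error) / prior_error


# Hypothetical scores, chosen only to illustrate the arithmetic.
reduction = relative_error_reduction(prior_score=0.80, new_score=0.85)
print(f"relative error reduction: {reduction:.2f}%")
```

Under this reading, moving from a score of 0.80 to 0.85 removes a quarter of the remaining error, so a small absolute gain near the top of the scale can correspond to a large relative reduction.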

Published in Applied Sciences

ISSN: 2076-3417 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Engineering (General). Civil engineering (General); Science: Biology (General); Science: Physics; Science: Chemistry
Website: http://www.mdpi.com/journal/applsci