IEEE Access (Jan 2024)
Utilizing a Two Planes Model to Rectify Documents With a Single Arbitrary Crease
Abstract
The document image rectification problem is crucial in document analysis. Most current state-of-the-art methods addressing it are data-driven and rely on neural networks. However, although they produce satisfactory rectifications, such methods are slow, which makes them unsuitable for on-device acquisition on mobile hardware. The present work concentrates on a specific (but common) case of physical document distortion: documents with a single crease. We investigate the properties of a surface composed of two planes captured by a pinhole camera. Namely, we provide methods to obtain the transformation between such an image and the template image once the document has been successfully localized in the frame. The approach can be used in on-device recognition systems: it takes only 3 ms to estimate the transformation parameters and about a quarter of a second to rectify an image on a smartphone CPU. We introduce a novel dataset, FDI-AC, containing 200 real images of documents with a single crease in different positions. We conduct experiments comparing our approach with the current state of the art, setting a baseline performance on FDI-AC. These experiments show that the proposed algorithm outperforms the image rectification transformer network GeoTr in both rectification accuracy and running time.
Keywords