Machine Learning with Applications (Dec 2024)
Document Layout Error Rate (DLER) metric to evaluate image segmentation methods
Abstract
Scholarly editions play a crucial role in humanities research, particularly in the study of literature and historical documents. The primary objective of these editions is to reconstruct the original text or provide insights into the author’s intentions. Traditionally, crafting a critical edition required a lifetime of dedication. However, thanks to recent advances in deep learning and computer vision, modern text recognition tools can now expedite this process. A key part of these tools is document layout analysis (DLA), in which image segmentation methods detect the different text elements. Most existing DLA work has focused on evaluating the accuracy of these methods, often neglecting the practical consequences of method selection. In this study, we introduce a new metric, the Document Layout Error Rate (DLER), which evaluates the performance of fine-grained DLA methods within the overall pipeline. This metric helps identify the method with the lowest error rate, thereby minimizing the manual effort required for corrections. We apply the metric to assess four different segmentation methods and their efficacy for the DLA task in the context of David Hume’s History of England.