PLoS ONE (Jan 2020)
Deep learning of cuneiform sign detection with weak supervision using transliteration alignment.
Abstract
The cuneiform script provides a glimpse into our ancient history. However, reading age-old clay tablets is time-consuming and requires years of training. To simplify this process, we propose a deep-learning-based sign detector that locates and classifies cuneiform signs in images of clay tablets. Deep learning requires large amounts of training data in the form of bounding boxes around cuneiform signs, which are not readily available and are costly to obtain for cuneiform script. To tackle this problem, we make use of existing transliterations, a sign-by-sign representation of the tablet content in Latin script. Since these do not provide sign localization, we propose a weakly supervised approach: we align tablet images with their corresponding transliterations to localize the transliterated signs in the tablet image, and then use these localized signs in place of annotations to re-train the sign detector. A better sign detector in turn boosts the quality of the alignments. We combine these steps in an iterative process that enables training a cuneiform sign detector from transliterations only. Although our method works with weak supervision alone, a small number of annotations further boost the performance of the cuneiform sign detector, which we evaluate on a large collection of clay tablets from the Neo-Assyrian period. To enable experts to directly apply the sign detector in their study of cuneiform texts, we additionally provide a web application for the analysis of clay tablets with a trained cuneiform sign detector.
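The iterative process described in the abstract, alternating between alignment and re-training, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `train_from_transliterations`, the `align` and `retrain` callables, the type aliases, and the toy stand-ins below are all hypothetical and serve only to show the control flow under those assumptions.

```python
from typing import Any, Callable, List, Sequence, Tuple

# Hypothetical types, introduced for illustration only.
LocalizedSign = Tuple[str, Tuple[int, int, int, int]]  # (sign label, bounding box)
Tablet = Tuple[Any, Sequence[str]]                     # (image, transliteration)


def train_from_transliterations(
    detector: Any,
    tablets: Sequence[Tablet],
    align: Callable[[Any, Any, Sequence[str]], List[LocalizedSign]],
    retrain: Callable[[Any, List[Tuple[Any, List[LocalizedSign]]]], Any],
    rounds: int = 3,
) -> Any:
    """Alternate between alignment and re-training for a fixed number of rounds."""
    for _ in range(rounds):
        # Step 1: localize the transliterated signs using the current detector.
        pseudo_annotations = [
            (image, align(detector, image, transliteration))
            for image, transliteration in tablets
        ]
        # Step 2: re-train on the localized signs in place of manual annotations.
        detector = retrain(detector, pseudo_annotations)
    return detector


# Toy stand-ins (assumptions): a real system would use an actual alignment
# procedure and an actual detector training step here.
def toy_align(detector, image, transliteration):
    # Places each transliterated sign in a fixed 10-pixel slot, left to right.
    return [(sign, (i * 10, 0, i * 10 + 10, 10))
            for i, sign in enumerate(transliteration)]


def toy_retrain(detector, pseudo_annotations):
    # The toy "detector" is just a counter of completed training rounds.
    return detector + 1


tablets = [("image_a", ["AN", "KI"]), ("image_b", ["LUGAL"])]
final_detector = train_from_transliterations(0, tablets, toy_align, toy_retrain, rounds=3)
```

Passing `align` and `retrain` as parameters keeps the loop independent of any particular detector or alignment method; a small set of manual annotations, where available, could be merged into `pseudo_annotations` inside `retrain`.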