Journal of Open Humanities Data (Jul 2024)

The EyCon Dataset: A Visual Corpus of Early Conflict Photography

  • Marina Giardinetti
  • Daniel Foliard
  • Julien Schuh
  • Mohamed-Salim Aissi

DOI
https://doi.org/10.5334/johd.213
Journal volume & issue
Vol. 10
pp. 40 – 40

Abstract

The EyCon dataset comprises nearly 130,000 JPEG images and pages documenting armed conflicts from the 1890s to 1918, with a focus on extra-European contexts. The project team aggregated thousands of digitized images and their metadata from various institutions, including previously inaccessible documents. To enrich the metadata, the team ran visual and multimodal similarity analyses as well as human and animal detection. Captions were processed to extract named entities for use in XML-formatted descriptive metadata. Because automated tools are limited in their ability to detect violence, identifying graphic images before publication required human expertise to classify them accurately. The dataset is available online and on Zenodo for download and reuse. It confronts recurring issues in computer vision for heritage photographs, such as degradation from fading, discoloration, scratches and noise, which impairs algorithms that rely on visual features. In addition, the under-representation of early photographic cultures in existing datasets introduces bias when standard solutions are applied to archival materials.

Keywords