Scientific Data (Dec 2024)

GeoCrack: A High-Resolution Dataset For Segmentation of Fracture Edges in Geological Outcrops

  • Mohammed Yaqoob,
  • Mohammed Ishaq,
  • Mohammed Yusuf Ansari,
  • Venkata Ram Sagar Konagandla,
  • Tamim Al Tamimi,
  • Stefano Tavani,
  • Amerigo Corradetti,
  • Thomas Daniel Seers

DOI
https://doi.org/10.1038/s41597-024-04107-0
Journal volume & issue
Vol. 11, no. 1
pp. 1 – 13

Abstract

GeoCrack is the first large-scale, open-source annotated dataset of fracture traces from geological outcrops. It enables deep learning-based fracture segmentation and sets a new standard for natural fracture characterization datasets. GeoCrack contains images from photogrammetric surveys of fractured rock exposures across 11 sites in Europe and the Middle East, capturing diverse lithologies and tectonic settings. Each image was cleaned, normalized, and manually segmented, followed by a recursive annotation vetting process to ensure the quality and accuracy of the digitized fracture edges. The processed images and corresponding binary masks were divided into 224 × 224 patches, yielding 12,158 image-mask pairs. GeoCrack captures representative real-world challenges in fracture edge annotation, such as contrast variations between fracture traces and the host medium caused by geological and geomorphological factors including aperture dilation, host rock composition, outcrop weathering, and groundwater staining. Physical occlusions such as shadows and vegetation are also considered to minimize false positives. GeoCrack was validated using a U-Net implementation for fracture segmentation, which achieved an intersection over union (IoU) of 85%. GeoCrack holds strong potential to advance deep fracture segmentation in geological applications by addressing the diverse challenges of real-world fracture identification.
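
The patching step and the IoU metric mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names `tile_pairs` and `iou` are hypothetical, NumPy is assumed to be available, tiles are assumed to be non-overlapping, and partial tiles at the image edges are assumed to be discarded, since the abstract does not say how these cases were handled.

```python
import numpy as np

PATCH = 224  # patch edge length, as stated in the abstract


def tile_pairs(image, mask, patch=PATCH):
    """Yield (image_patch, mask_patch) pairs from non-overlapping tiles.

    Assumption: edge remainders smaller than `patch` are discarded.
    """
    if image.shape[:2] != mask.shape[:2]:
        raise ValueError("image and mask must share height and width")
    height, width = mask.shape[:2]
    for top in range(0, height - patch + 1, patch):
        for left in range(0, width - patch + 1, patch):
            yield (image[top:top + patch, left:left + patch],
                   mask[top:top + patch, left:left + patch])


def iou(pred, target):
    """Intersection over union of two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        # Both masks empty: treated here as perfect agreement (a convention,
        # not something specified by the source).
        return 1.0
    return float(np.logical_and(pred, target).sum() / union)
```

Under these assumptions, a 500 × 700 image yields 2 × 3 = 6 patch pairs, and `iou` applied to a predicted mask and its manual annotation gives a score between 0 and 1 that is comparable in kind to the 85% figure reported above.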