Data in Brief (Dec 2021)

Non-melanoma skin cancer segmentation for histopathology dataset

  • Simon M. Thomas,
  • James G. Lefevre,
  • Glenn Baxter,
  • Nicholas A. Hamilton

Journal volume & issue
Vol. 39
p. 107587

Abstract

Densely labelled segmentation data for digital pathology images is costly to produce but invaluable for training effective machine learning models. We make available 290 hand-annotated histopathology tissue sections of the three most common skin cancers: basal cell carcinoma (BCC), squamous cell carcinoma (SCC) and intraepidermal carcinoma (IEC). These non-melanoma skin cancers constitute over 90% of all skin cancer diagnoses, so the dataset gives the scientific community an opportunity to benchmark analytic methodologies on a significant portion of the dermatopathology workflow. The data represents typical cases of the three cancer types (not requiring a differential diagnosis) across shave, punch and excision biopsy contexts. Each image is accompanied by a segmentation mask that divides the section into 12 tissue types: keratin, epidermis, papillary dermis, reticular dermis, hypodermis, inflammation, glands, hair follicles and background, as well as BCC, SCC and IEC. Also included are cancer margin measurements, intended to support work towards automated assessment of surgical margin clearance and tumour invasion. This leaves open many opportunities for researchers to use or extend the dataset, building upon recent work on image analysis problems in skin cancer (Thomas et al., 2021).
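
As a rough illustration of how a 12-class segmentation mask of this kind might be summarised, the sketch below computes the fraction of pixels per tissue type. The class names come from the abstract; the integer ids, the nested-list mask representation and the function name class_fractions are assumptions made for this example and do not describe the dataset's actual file format or encoding.

```python
from collections import Counter

# The 12 tissue types named in the abstract. The integer ids (list positions)
# are an assumption for illustration; the dataset's real mask encoding may differ.
TISSUE_CLASSES = [
    "keratin", "epidermis", "papillary dermis", "reticular dermis",
    "hypodermis", "inflammation", "glands", "hair follicles",
    "background", "BCC", "SCC", "IEC",
]


def class_fractions(mask):
    """Return {class name: fraction of pixels} for a 2D mask of integer ids."""
    counts = Counter(pixel for row in mask for pixel in row)
    total = sum(counts.values())
    return {TISSUE_CLASSES[i]: n / total for i, n in sorted(counts.items())}


# Hypothetical 2x4 toy mask containing background (8), epidermis (1) and BCC (9).
toy_mask = [
    [8, 8, 1, 1],
    [8, 9, 9, 9],
]
fractions = class_fractions(toy_mask)
```

A summary like this is one simple way to check class balance across sections before benchmarking a segmentation model; a real pipeline would first decode the published masks into class ids.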

Keywords