ECS Sensors Plus (Jan 2024)

Automated Quantification of DNA Damage Using Deep Learning and Use of Synthetic Data Generated from Basic Geometric Shapes

  • Srikanth Namuduri,
  • Prateek Mehta,
  • Lise Barbe,
  • Stephanie Lam,
  • Zohreh Faghihmonzavi,
  • Steven Finkbeiner,
  • Shekhar Bhansali

DOI
https://doi.org/10.1149/2754-2726/ad21ea
Journal volume & issue
Vol. 3, no. 1
p. 012401

Abstract

Comet assays are used to assess the extent of deoxyribonucleic acid (DNA) damage in human cells caused by substances such as novel drugs or nanomaterials. Deep learning is showing promising results in automating the quantification of the percentage of damage from assay images. However, the lack of large datasets and the presence of imbalanced data remain a challenge. In this study, synthetic comet assay images generated from simple geometric shapes were used to augment the data for training a convolutional neural network (CNN). The results from the model trained using the augmented data were compared with the results from a model trained exclusively on real images. The use of synthetic data in training not only gave a significantly better coefficient of determination (R^2), but also resulted in a more robust model, i.e., one with less variation in R^2 than a model trained without synthetic data. This approach can lead to improved training while using a smaller training dataset, saving the cost and effort involved in capturing and annotating additional experimental images. Additional benefits include addressing imbalanced datasets and data privacy concerns. Similar approaches must be explored in other low-data domains to extract the same benefits.
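The abstract does not describe how the synthetic images were constructed, so the following is only an illustrative sketch of the general idea, not the authors' procedure. It assumes a comet can be approximated by a filled circle (head) and a filled ellipse (tail), with intensities scaled so that the tail carries a chosen share of the total signal, which then serves as a known label. The function names `synthetic_comet` and `r_squared`, and all parameter values, are hypothetical. The `r_squared` helper uses the standard definition R^2 = 1 - SS_res / SS_tot.

```python
import numpy as np


def synthetic_comet(size=128, head_radius=12, tail_length=40, tail_fraction=0.3):
    """Build one synthetic comet image from a circle and an ellipse.

    Hypothetical construction (not taken from the paper): the head is a
    filled circle, the tail is a filled ellipse extending to the right of
    the head, and tail_fraction is the share of total intensity in the tail.
    Returns (image scaled to [0, 1], percentage of intensity in the tail).
    """
    yy, xx = np.mgrid[0:size, 0:size]
    cy, cx = size // 2, size // 3

    # Head: filled circle centred at (cx, cy).
    head = (xx - cx) ** 2 + (yy - cy) ** 2 <= head_radius ** 2

    # Tail: filled ellipse starting at the right edge of the head.
    a = tail_length / 2.0          # semi-axis along x
    b = 0.8 * head_radius          # semi-axis along y
    tx = cx + head_radius + a      # ellipse centre along x
    ellipse = ((xx - tx) / a) ** 2 + ((yy - cy) / b) ** 2 <= 1.0
    tail = ellipse & ~head         # keep the two regions disjoint

    # Spread the chosen intensity share uniformly over each region.
    img = np.zeros((size, size), dtype=float)
    img[head] = (1.0 - tail_fraction) / head.sum()
    img[tail] = tail_fraction / tail.sum()

    # Label: percentage of total intensity located in the tail.
    tail_percent = 100.0 * img[tail].sum() / img.sum()
    return img / img.max(), tail_percent


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


# Generate a small, evenly spread set of labels to counter class imbalance.
images, labels = zip(*(synthetic_comet(tail_fraction=f)
                       for f in np.linspace(0.0, 0.9, 10)))
```

Because the label is fixed by construction, such images need no manual annotation and can be generated at any damage level that is under-represented in the real data. A realistic pipeline would likely also add blur and noise, which this sketch omits.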

Keywords