Data in Brief (Dec 2021)
QR-DN1.0: A new distorted and noisy QRs dataset
Abstract
Barcodes have played a significant role in many industries in recent years, and of the two most popular 2D barcodes, the use of the QR code has grown exponentially. The QR-DN1.0 dataset includes 5 categories of QR codes covering low to high density levels. Each category has 15 QR codes: 5 for testing and 10 for training. The QRs are embedded into 30 color images using blind watermarking techniques and then extracted, with three different methods, from photographs of the watermarked images taken with a mobile phone camera. This yields three groups of 2250 extracted QR images each, for a total of 6750 distorted and noisy QR images. Within each of the three groups, the data is divided into two parts: testing, with 750 images, and training, with 1500 images. For every distorted QR in the dataset, a non-distorted instance is provided as ground truth. One advantage of this dataset is that it is real: no simulated noise has been added to the images, and the dataset is derived entirely from the real-world challenge of extracting QRs embedded in color images, where the watermarked image is captured from a screen. It also includes various types of QRs, such as single character, short sentence, long sentence, URL, and location.
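The image counts quoted above follow from a few multiplications. The short Python sketch below recomputes them from the per-category figures given in the abstract (5 categories, 15 codes per category with a 5/10 test/train split, 30 cover images, 3 extraction methods). The function and variable names are illustrative only and are not part of the dataset or its tooling.

```python
# Figures taken from the abstract; all names here are illustrative.
CATEGORIES = 5               # density categories of QR codes
TEST_QRS_PER_CATEGORY = 5    # codes reserved for testing in each category
TRAIN_QRS_PER_CATEGORY = 10  # codes reserved for training in each category
COVER_IMAGES = 30            # color images each QR code is embedded into
EXTRACTION_METHODS = 3       # each method yields its own group of extracted QRs


def dataset_counts():
    """Return the image counts implied by the figures above."""
    test_qrs = CATEGORIES * TEST_QRS_PER_CATEGORY
    train_qrs = CATEGORIES * TRAIN_QRS_PER_CATEGORY

    # Every source QR code is embedded once into every cover image.
    test_per_method = test_qrs * COVER_IMAGES
    train_per_method = train_qrs * COVER_IMAGES
    per_method = test_per_method + train_per_method

    return {
        "source_qrs": test_qrs + train_qrs,
        "test_per_method": test_per_method,
        "train_per_method": train_per_method,
        "per_method": per_method,
        "total": per_method * EXTRACTION_METHODS,
    }


if __name__ == "__main__":
    for name, value in dataset_counts().items():
        print(f"{name}: {value}")
```

Because each of the 75 source codes is paired with each of the 30 cover images, the test/train split within a group keeps the same 1:2 ratio as the split of the source codes.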