Data in Brief (Apr 2023)
FIRST radio galaxy data set containing curated labels of classes FRI, FRII, compact and bent
Abstract
Automated classification of astronomical sources is often challenging due to the scarcity of labelled training data. We present a data set of 2158 radio galaxy images with their corresponding morphological labels, taken from various catalogues [1,2]. The data set is curated by removing duplicates and ambiguous morphological labels and by harmonizing the different metadata formats. The image data were acquired by the VLA FIRST (Faint Images of the Radio Sky at Twenty-Centimeters) survey [3]. The morphological labels are collected, and each catalogue-specific classification definition is converted into a 4-class classification scheme: FRI, FRII, Compact and Bent sources. FRI and FRII correspond to the two classes of the widely used Fanaroff-Riley classification [4]; compact sources and bent-tail galaxies form the two additional classes. When duplicates carry different morphological labels, the galaxy is regarded as ambiguously labelled and both coordinate entries are removed. For the remaining list of coordinates, the radio galaxy images are collected from the SkyView virtual observatory (https://skyview.gsfc.nasa.gov/current/cgi/query.pl). The grayscale images are provided at a size of 300 × 300 pixels, and all pixels with a value below three times the local RMS of the noise are set to this threshold value. The data set is useful for the development of robust machine learning models that automate the classification of radio galaxy images.
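The two curation steps described above, removal of ambiguously labelled duplicates and clipping at three times the local noise RMS, can be illustrated with a minimal sketch. It is not the authors' pipeline. The function names `curate_labels` and `clip_to_noise_floor`, the matching of duplicates by rounding coordinates to `decimals` places, and the assumption that the local RMS is already known are all hypothetical choices made for illustration.

```python
from collections import defaultdict

import numpy as np


def curate_labels(entries, decimals=3):
    """Merge duplicate sources and drop those with conflicting labels.

    entries: iterable of (ra_deg, dec_deg, label) tuples.
    Assumption: two entries are duplicates if their coordinates agree
    after rounding to `decimals` places; the real matching criterion
    may differ.
    """
    labels_by_coord = defaultdict(set)
    for ra, dec, label in entries:
        key = (round(ra, decimals), round(dec, decimals))
        labels_by_coord[key].add(label)
    # Keep a source only if every duplicate agrees on a single label.
    return {
        coord: next(iter(labels))
        for coord, labels in labels_by_coord.items()
        if len(labels) == 1
    }


def clip_to_noise_floor(image, local_rms, n_sigma=3.0):
    """Set every pixel below n_sigma * local_rms to that threshold value.

    Assumption: local_rms is supplied by the caller, either as a scalar
    or as an array that broadcasts against the image.
    """
    threshold = n_sigma * np.asarray(local_rms, dtype=float)
    return np.maximum(np.asarray(image, dtype=float), threshold)


if __name__ == "__main__":
    catalogue = [
        (10.0, 20.0, "FRI"),
        (10.0, 20.0, "FRI"),    # consistent duplicate: kept once
        (30.0, 40.0, "FRII"),
        (30.0, 40.0, "Bent"),   # conflicting duplicate: both removed
        (50.0, 60.0, "Compact"),
    ]
    print(curate_labels(catalogue))

    image = np.array([[0.1, 0.5], [1.0, 0.2]])
    print(clip_to_noise_floor(image, local_rms=0.1))
```

Clipping with `np.maximum` raises faint pixels to the threshold rather than zeroing them, which matches the statement that such pixels "are set to this threshold value".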