Sensors (Aug 2024)
Specifics of Data Collection and Data Processing during Formation of RailVista Dataset for Machine Learning- and Deep Learning-Based Applications
Abstract
This paper presents the methodology and outcomes of creating the Rail Vista dataset, designed for detecting defects on railway tracks using machine and deep learning techniques. The dataset comprises 200,000 high-resolution images categorized into 19 distinct classes covering various railway infrastructure defects. The data collection involved a meticulous process including complex image capture methods, distortion techniques for data enrichment, and secure storage in a data warehouse using efficient binary file formats. This structured dataset facilitates effective training of machine/deep learning models, enhancing automated defect detection systems in railway safety and maintenance applications. The study underscores the critical role of high-quality datasets in advancing machine learning applications within the railway domain, highlighting future prospects for improving safety and reliability through automated recognition technologies.
Keywords