Investigating the optimisation of real-world and synthetic object detection training datasets through the consideration of environmental and simulation factors

Callum Newman; Jon Petzing; Yee Mey Goh; Laura Justham

doi:10.1016/j.iswa.2022.200079

Intelligent Systems with Applications (May 2022)

Affiliations

Callum Newman (corresponding author): Wolfson School of Mechanical, Electrical and Manufacturing Engineering, Loughborough University, Loughborough, United Kingdom
Jon Petzing: Wolfson School of Mechanical, Electrical and Manufacturing Engineering, Loughborough University, Loughborough, United Kingdom
Yee Mey Goh: Wolfson School of Mechanical, Electrical and Manufacturing Engineering, Loughborough University, Loughborough, United Kingdom
Laura Justham: Wolfson School of Mechanical, Electrical and Manufacturing Engineering, Loughborough University, Loughborough, United Kingdom

DOI: https://doi.org/10.1016/j.iswa.2022.200079
Journal volume & issue: Vol. 14, p. 200079

Abstract

Computer vision is used in many industrial automation applications, especially those concerned with efficiency and safety. Computer vision techniques that use machine learning, such as object detectors, need a dataset of images for training and testing. Either publicly available datasets or newly created datasets can be used. However, such datasets are rarely assessed for whether they lead to optimal performance. Environmental factors, such as lighting and occlusion, alter the appearance of images, so images taken under certain conditions may have different effects on training. A knowledge gap exists as to how the test performance of deep neural networks can be improved by considering the effects and interactions of these factors when either real or synthetic images are used. This research illustrates that the different factors can have a significant impact on test performance. It demonstrates a process that can be applied to real-world and synthetic images to identify the effect of each factor, and discusses how this information may be used to create an optimal training dataset.

Published in Intelligent Systems with Applications

ISSN: 2667-3053 (Online)
Publisher: Elsevier
Country of publisher: United Kingdom
LCC subjects: Science: Science (General): Cybernetics; Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://www.journals.elsevier.com/intelligent-systems-with-applications
