Auroral classification ergonomics and the implications for machine learning

D. McKay; A. Kvammen

doi:10.5194/gi-9-267-2020

Geoscientific Instrumentation, Methods and Data Systems (Jul 2020)

Affiliations

D. McKay: NORCE Norwegian Research Centre AS, Tromsø, Norway
A. Kvammen: Department of Physics and Technology, UiT – The Arctic University of Norway, Tromsø, Norway

DOI: https://doi.org/10.5194/gi-9-267-2020
Journal volume & issue: Vol. 9, pp. 267–273

Abstract

The machine-learning research community has focused greatly on bias in algorithms and has identified different manifestations of it. Bias in training samples is recognised as a potential source of prejudice in machine learning, and it can be introduced by the human experts who define the training sets. As machine-learning techniques are being applied to auroral classification, it is important to identify and address potential sources of expert-injected bias. In an ongoing study, 13 947 auroral images were manually classified, with significant differences between classifications. This large dataset allowed some of these biases to be identified, especially those arising from the ergonomics of the classification process. These findings are presented in this paper to serve as a checklist for improving training data integrity, not just for expert classifications but also for crowd-sourced, citizen science projects. As the application of machine-learning techniques to auroral research is relatively new, it is important that biases are identified and addressed before they become endemic in the corpus of training data.

Published in Geoscientific Instrumentation, Methods and Data Systems

ISSN: 2193-0856 (Print); 2193-0864 (Online)
Publisher: Copernicus Publications
Country of publisher: Germany
LCC subjects: Science: Physics: Geophysics. Cosmic physics
Website: http://www.geoscientific-instrumentation-methods-and-data-systems.net