Patterns (May 2020)
Avoid Oversimplifications in Machine Learning: Going beyond the Class-Prediction Accuracy
Abstract
Class-prediction accuracy provides a quick but superficial way of determining classifier performance. It does not inform on the reproducibility of the findings or on whether the selected or constructed features are meaningful and specific. Furthermore, the class-prediction accuracy oversummarizes and does not inform on how training and learning have been accomplished: two classifiers providing the same performance in one validation can disagree on many future validations. It provides no explanation of the classifier's decision-making process, and it is not objective, as its value is also affected by class proportions in the validation set. These issues do not mean we should omit the class-prediction accuracy. Instead, it needs to be enriched with accompanying evidence and tests that supplement and contextualize the reported accuracy. This additional evidence serves as augmentation and can help us perform machine learning better while avoiding naive reliance on oversimplified metrics.

The Bigger Picture
There is huge potential for machine learning, but blind reliance on oversimplified metrics can mislead. Class-prediction accuracy is a common metric used for determining classifier performance. This article provides examples showing how the class-prediction accuracy can be superficial and even misleading. We propose augmentative measures to supplement the class-prediction accuracy, which in turn help us better understand the quality of learning of the classifier.
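To make the point about class proportions concrete, the minimal sketch below (not taken from the paper; the class sizes are hypothetical) scores a degenerate classifier that always predicts the majority class on two validation sets that differ only in their class mix.

```python
import numpy as np

def always_predict_negative(n):
    """A degenerate 'classifier' that predicts class 0 for every sample."""
    return np.zeros(n, dtype=int)

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return float(np.mean(y_true == y_pred))

# Balanced validation set: 50 samples of class 0, 50 of class 1 (hypothetical sizes).
y_balanced = np.array([0] * 50 + [1] * 50)

# Imbalanced validation set: 90 samples of class 0, 10 of class 1 (hypothetical sizes).
y_imbalanced = np.array([0] * 90 + [1] * 10)

print(accuracy(y_balanced, always_predict_negative(len(y_balanced))))      # 0.50
print(accuracy(y_imbalanced, always_predict_negative(len(y_imbalanced))))  # 0.90
```

The same decision rule is reported as 50% or 90% accurate purely because of the class proportions in the validation set, which is one reason the class-prediction accuracy on its own is not an objective measure of how well a classifier has learned.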