Handling the Imbalanced Problem in Agri-Food Data Analysis

Adeyemi O. Adegbenjo; Michael O. Ngadi

doi:10.3390/foods13203300

Foods (Oct 2024)

Affiliations

Adeyemi O. Adegbenjo: Department of Bioresource Engineering, McGill University, 21111 Lakeshore Road, Ste-Anne-de-Bellevue, Montreal, QC H9X 3V9, Canada
Michael O. Ngadi: Department of Bioresource Engineering, McGill University, 21111 Lakeshore Road, Ste-Anne-de-Bellevue, Montreal, QC H9X 3V9, Canada

DOI: https://doi.org/10.3390/foods13203300
Journal volume & issue: Vol. 13, no. 20, p. 3300

Abstract

Imbalanced data occur in most fields of endeavor. The problem has been identified as a major bottleneck in machine learning and data mining, and it is becoming a serious concern in food processing applications. Inappropriate analysis of agricultural and food processing data has been identified as limiting the robustness of predictive models built for agri-food applications. Because rare cases occur infrequently, classification rules that detect small groups are scarce, so samples belonging to small classes are largely misclassified. Most existing machine learning algorithms, including K-means, decision trees, and support vector machines (SVMs), are not optimal for handling imbalanced data. Consequently, models developed from such data are prone to rejection and non-adoption in real industrial and commercial settings. This paper demonstrates the reality of the imbalanced data problem in agri-food applications and proposes state-of-the-art artificial intelligence approaches for handling it, including data resampling, one-class learning, ensemble methods, feature selection, and deep learning techniques. The paper further evaluates existing and newer metrics that are well suited to imbalanced data. Correctly analyzing imbalanced data in food processing research will improve the accuracy of results and model development, which in turn will enhance the acceptability and adoptability of innovations and inventions.
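The abstract's point that small classes are largely misclassified, and that suitable metrics are needed to reveal this, can be illustrated with a minimal sketch. The 95/5 class split, the function names, and the choice of metrics (recall, balanced accuracy, Matthews correlation coefficient) are assumptions made here for illustration; they are not taken from the paper and may differ from the metrics it evaluates.

```python
from math import sqrt


def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return tp, tn, fp, fn


def imbalance_metrics(y_true, y_pred, positive=1):
    """Return plain accuracy alongside metrics that account for class imbalance."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    # Recall on the minority (positive) class; 0.0 if the class is absent.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Recall on the majority (negative) class.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    balanced_accuracy = (recall + specificity) / 2
    # Matthews correlation coefficient; defined as 0.0 when the denominator is 0.
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "minority_recall": recall,
        "balanced_accuracy": balanced_accuracy,
        "mcc": mcc,
    }


# Hypothetical data: 5 rare samples (label 1) and 95 common samples (label 0).
y_true = [1] * 5 + [0] * 95
# A trivial model that always predicts the majority class.
y_pred = [0] * 100

metrics = imbalance_metrics(y_true, y_pred)
print(metrics)
```

In this hypothetical case the always-majority model reaches high plain accuracy while never detecting a single rare sample, which is the kind of failure that accuracy alone hides and that imbalance-aware metrics expose.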

Published in Foods

ISSN: 2304-8158 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Chemical technology
Website: http://www.mdpi.com/journal/foods
