Feature Ranking and Screening for Class-Imbalanced Metabolomics Data Based on Rank Aggregation Coupled with Re-Balance

Guang-Hui Fu; Jia-Bao Wang; Min-Jie Zong; Lun-Zhao Yi

doi:10.3390/metabo11060389

Metabolites (Jun 2021)

Affiliations

Guang-Hui Fu: School of Science, Kunming University of Science and Technology, Kunming 650500, China
Jia-Bao Wang: School of Science, Kunming University of Science and Technology, Kunming 650500, China
Min-Jie Zong: School of Science, Kunming University of Science and Technology, Kunming 650500, China
Lun-Zhao Yi: Faculty of Agriculture and Food, Kunming University of Science and Technology, Kunming 650500, China

Journal volume & issue: Vol. 11, no. 6
p. 389

Abstract

Feature screening is an important and challenging topic in class-imbalance learning. Most existing feature screening algorithms for class-imbalanced data are based on filtering techniques. However, different filtering techniques generally produce different variable rankings, and this inconsistency is usually ignored in practice. To address this problem, we propose a simple strategy called rank aggregation with re-balance (RAR) for finding key variables in class-imbalanced data. RAR fuses the individual rankings into a single synthetic ranking that takes every ranking into account. The class-imbalanced data are first re-balanced via different re-sampling procedures, and RAR is then performed on the balanced data. Five real class-imbalanced datasets and their re-balanced counterparts are used to test RAR’s performance, and RAR is compared with several popular feature screening methods. The results show that RAR is highly competitive and almost always better than screening with a single filter in terms of several assessment metrics. Re-balancing pretreatment is highly effective for rank aggregation when the data are class-imbalanced.
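
The general idea described in the abstract (re-balance the data, rank the features with several filters, then fuse the rankings) can be illustrated with a minimal sketch. The abstract does not state which filters, which re-sampling procedures, or which aggregation rule the authors use, so everything specific here is an assumption chosen for illustration: random oversampling as the re-balancing step, three simple filters (a Welch t-type statistic, a Fisher-type score, and absolute correlation with the label), and mean-rank aggregation. All function names are hypothetical, and the data are synthetic.

```python
import numpy as np

def random_oversample(X, y, rng):
    """Assumed re-balancing step: duplicate random minority rows until classes are equal."""
    classes, counts = np.unique(y, return_counts=True)
    minority = classes[np.argmin(counts)]
    deficit = counts.max() - counts.min()
    extra = rng.choice(np.flatnonzero(y == minority), size=deficit, replace=True)
    return np.vstack([X, X[extra]]), np.concatenate([y, y[extra]])

def welch_t_score(X, y):
    """Absolute two-sample t-type statistic per feature (labels assumed to be 0/1)."""
    a, b = X[y == 0], X[y == 1]
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    return np.abs(a.mean(axis=0) - b.mean(axis=0)) / se

def fisher_score(X, y):
    """Squared mean difference divided by the summed class variances."""
    a, b = X[y == 0], X[y == 1]
    return (a.mean(axis=0) - b.mean(axis=0)) ** 2 / (
        a.var(axis=0, ddof=1) + b.var(axis=0, ddof=1)
    )

def abs_correlation(X, y):
    """Absolute Pearson correlation between each feature and the 0/1 label."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    return np.abs(Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))

def to_ranks(scores):
    """Convert scores to ranks, where rank 1 is the highest score."""
    order = np.argsort(-scores)
    ranks = np.empty_like(order)
    ranks[order] = np.arange(1, len(scores) + 1)
    return ranks

def aggregate_ranks(X, y, filters):
    """Assumed aggregation rule: average each feature's rank over all filters."""
    rank_matrix = np.array([to_ranks(f(X, y)) for f in filters])
    return rank_matrix.mean(axis=0)

# Synthetic imbalanced data: 90 majority vs 10 minority samples, 20 features,
# of which only the first three carry a class signal.
rng = np.random.default_rng(0)
y = np.array([0] * 90 + [1] * 10)
X = rng.normal(size=(100, 20))
X[y == 1, :3] += 3.0

X_bal, y_bal = random_oversample(X, y, rng)
filters = [welch_t_score, fisher_score, abs_correlation]
mean_rank = aggregate_ranks(X_bal, y_bal, filters)
top_features = np.argsort(mean_rank)[:3]  # indices with the best aggregated rank
print(sorted(top_features.tolist()))
```

Mean-rank aggregation is used here only because it is the simplest way to combine rankings; other aggregation rules or re-sampling schemes could be substituted without changing the overall structure of the sketch.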

Published in Metabolites

ISSN: 2218-1989 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Science: Microbiology
Website: http://www.mdpi.com/journal/metabolites
