Algorithms (Dec 2017)

A Hierarchical Multi-Label Classification Algorithm for Gene Function Prediction

  • Shou Feng,
  • Ping Fu,
  • Wenbin Zheng

DOI
https://doi.org/10.3390/a10040138
Journal volume & issue
Vol. 10, no. 4
p. 138

Abstract

Gene function prediction is a complicated and challenging hierarchical multi-label classification (HMC) task, in which a gene may have many functions at the same time and these functions are organized in a hierarchy. This paper proposes a novel HMC algorithm for solving this problem based on the Gene Ontology (GO), whose hierarchy is a directed acyclic graph (DAG), which makes the problem more difficult to tackle. In the proposed algorithm, the HMC task is first transformed into a set of binary classification tasks. Two measures are then implemented to improve HMC performance by taking the hierarchy structure into account during learning. First, a negative-instance selection policy combined with the SMOTE approach is proposed to alleviate the imbalanced data set problem. Second, a node interaction method is introduced to combine the results of the binary classifiers, which guarantees that the predictions are consistent with the hierarchy constraint. Experiments on eight benchmark yeast data sets annotated by the Gene Ontology show the promising performance of the proposed algorithm compared with other state-of-the-art algorithms.
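
To make the hierarchy constraint concrete, the following is a minimal sketch of one generic way to make per-node binary scores consistent on a DAG: each node's score is capped at the lowest corrected score among its parents, so a term is never predicted more confidently than any of its ancestors. This is not the node interaction method of the paper, whose details the abstract does not give. The function name `enforce_hierarchy`, the toy term names, and the scores are all hypothetical, and the sketch assumes every node that appears as a parent also has a score.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def enforce_hierarchy(scores, parents):
    """Cap each node's score at the minimum corrected score of its parents.

    scores:  dict mapping node -> score in [0, 1] from its binary classifier
    parents: dict mapping node -> list of parent nodes (missing key = root)
    Assumes every parent node also appears as a key in `scores`.
    """
    # TopologicalSorter expects {node: predecessors}, so parents are
    # always visited before their children.
    graph = {node: parents.get(node, ()) for node in scores}
    corrected = {}
    for node in TopologicalSorter(graph).static_order():
        # A root has no parents, so its cap defaults to 1.0.
        cap = min((corrected[p] for p in parents.get(node, ())), default=1.0)
        corrected[node] = min(scores[node], cap)
    return corrected


# Hypothetical toy DAG: "C" has two parents, as GO terms may have.
toy_parents = {"A": ["root"], "B": ["root"], "C": ["A", "B"]}
toy_scores = {"root": 0.9, "A": 0.8, "B": 0.3, "C": 0.7}

corrected = enforce_hierarchy(toy_scores, toy_parents)
predicted = {node for node, s in corrected.items() if s >= 0.5}
```

In this toy case the raw score of `C` exceeds that of its parent `B`, so thresholding the raw scores would predict `C` without `B` and violate the hierarchy. After capping, any fixed threshold yields a label set in which every predicted term has all of its ancestors predicted as well.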

Keywords