Journal of Statistical Software (Jul 2018)

rmcfs: An R Package for Monte Carlo Feature Selection and Interdependency Discovery

  • Michał Dramiński
  • Jacek Koronacki

DOI
https://doi.org/10.18637/jss.v085.i12
Journal volume & issue
Vol. 85, no. 12
pp. 1 – 28

Abstract

We describe the R package rmcfs, which implements an algorithm for ranking features from high-dimensional data according to their importance for a given supervised classification task. The ranking is performed prior to addressing the classification task per se. The package implements a new and extended version of the MCFS (Monte Carlo feature selection) algorithm, an early version of which was published in 2005. It provides an easy-to-use R interface, a set of tools for reviewing results, and the new ID (interdependency discovery) component. The algorithm can be applied to continuous and/or categorical features (e.g., gene expression and phenotypic data) to produce an objective ranking of features with a statistically well-defined cutoff between informative and non-informative ones. In addition, the package provides a directed ID graph that presents interdependencies between informative features.

Keywords