Revista Ion (Dec 2014)

Novel feature selection method based on Stochastic Methods Coupled to Support Vector Machines using H-NMR data (data of olive and hazelnut oils)

  • Oscar Eduardo Gualdron,
  • Claudia Isaza,
  • Cristhian Manuel Duran

Journal volume & issue
Vol. 27, no. 2
pp. 17 – 28

Abstract

A central difficulty in data analysis and information processing is how to represent the dataset. A typical dataset contains a large number of samples, each described by thousands of variables, many of which carry irrelevant information or noise. To represent findings more clearly, the number of variables must therefore be reduced. This paper describes a novel variable selection technique for multivariate data analysis, inspired by stochastic methods and designed to work with support vector machines (SVM). The approach is demonstrated in a food application: detecting the adulteration of olive oil (more expensive) with hazelnut oil (cheaper). The samples were analyzed by ¹H NMR fingerprinting. The results show that the number of variables can be reduced without affecting classification results.
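The abstract does not say which stochastic method the authors use, so the following is only an illustrative sketch of the general idea: a random search over variable subsets in which each candidate subset is scored by an SVM. Everything specific here is an assumption made for the example, not the paper's procedure: the simulated-annealing-style acceptance rule, the small Pegasos-style linear SVM written in plain Python (without a bias term), the penalty on subset size, and the synthetic data standing in for NMR spectra. All function names are hypothetical.

```python
import math
import random

def train_linear_svm(X, y, epochs=20, lam=0.01, seed=0):
    """Pegasos-style stochastic subgradient descent on the hinge loss (no bias)."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    order = list(range(len(X)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            w = [(1.0 - eta * lam) * wj for wj in w]  # regularization shrink
            if margin < 1.0:  # sample violates the margin: move toward it
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w

def accuracy(w, X, y):
    hits = 0
    for xi, yi in zip(X, y):
        score = sum(wj * xj for wj, xj in zip(w, xi))
        hits += (1 if score >= 0 else -1) == yi
    return hits / len(X)

def project(X, mask):
    """Keep only the columns whose mask entry is True."""
    return [[v for v, keep in zip(row, mask) if keep] for row in X]

def objective(mask, Xtr, ytr, Xval, yval, penalty=0.1):
    """Validation accuracy minus a small penalty on the fraction of variables kept."""
    w = train_linear_svm(project(Xtr, mask), ytr)
    acc = accuracy(w, project(Xval, mask), yval)
    return acc - penalty * sum(mask) / len(mask), acc

def stochastic_select(Xtr, ytr, Xval, yval, iters=60, temp=0.05, cooling=0.95, seed=1):
    """Flip one variable at a time; accept worse subsets with decaying probability."""
    rng = random.Random(seed)
    n = len(Xtr[0])
    current = [True] * n  # start from the full variable set
    cur_obj, cur_acc = objective(current, Xtr, ytr, Xval, yval)
    best, best_obj, best_acc = current[:], cur_obj, cur_acc
    for _ in range(iters):
        cand = current[:]
        j = rng.randrange(n)
        cand[j] = not cand[j]
        if not any(cand):  # never evaluate an empty subset
            continue
        obj, acc = objective(cand, Xtr, ytr, Xval, yval)
        if obj >= cur_obj or rng.random() < math.exp((obj - cur_obj) / temp):
            current, cur_obj = cand, obj
            if obj > best_obj:
                best, best_obj, best_acc = cand[:], obj, acc
        temp *= cooling
    return best, best_obj, best_acc

def make_data(n_samples=80, n_informative=3, n_noise=12, seed=42):
    """Synthetic two-class data: a few informative columns, the rest pure noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = 1 if i % 2 == 0 else -1
        row = [rng.gauss(label * 1.0, 1.0) for _ in range(n_informative)]
        row += [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
        X.append(row)
        y.append(label)
    return X, y

X, y = make_data()
Xtr, ytr, Xval, yval = X[:60], y[:60], X[60:], y[60:]
mask, obj, acc = stochastic_select(Xtr, ytr, Xval, yval)
print("variables kept:", sum(mask), "of", len(mask))
print("validation accuracy:", acc)
```

Because the same validation split both guides the search and reports the accuracy, the figure printed here is optimistic. A real study would estimate performance on data held out from the selection loop, for example with nested cross-validation.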

Keywords