Revista CENIC Ciencias Químicas (Oct 2013)

Classification of kerosene using physicochemical data and multivariate techniques

  • Yumirka Comesaña-García,
  • Alberto Cavado-Osorio,
  • Ernesto Linchenat-Dennes,
  • Ángel Dago-Morales

Journal volume & issue
Vol. 44, no. 1
pp. 001 – 010

Abstract

In this paper we compare the abilities of two multivariate classification methods, soft independent modeling of class analogy (SIMCA) and support vector machines (SVM), to classify kerosene using physicochemical data. Two types of kerosene fractions (class A and class B) with different chemical compositions were collected from a refinery's production. The physicochemical data used as variables for calculating the models were: initial boiling point, 10 % recovery temperature, final boiling point, flash point, density, viscosity and sulfur percentage. A training set of 40 samples was used to calculate the supervised classification models, and an independent validation set of 25 samples was used to evaluate their performance. The SIMCA model did not have enough discrimination power to distinguish the two types of kerosene fractions. The SVM model, which was calculated using a linear kernel with a capacity parameter of C = 10 and had 7 support vectors, showed adequate sensitivity and selectivity in both the training and validation steps. The model complexity (number of support vectors) was 17% of the total training data, which indicates that a reasonable separation was achieved and that sufficient training data were available. The SVM model can therefore be considered acceptable in its ability to classify and discriminate the two types of kerosene fractions (A and B) using physicochemical data as variables. The generalization ability of SVMs makes this technique attractive for systems with a limited number of variables.
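
The figures of merit mentioned in the abstract can be illustrated with a minimal Python sketch. It assumes the usual chemometric definitions (sensitivity as the fraction of class members correctly accepted, selectivity as the fraction of non-members correctly rejected), which the abstract does not spell out. The function names are hypothetical, and the confusion counts are invented for illustration only; they are not results from the paper. Only the 7 support vectors and the 40 training samples come from the abstract.

```python
def support_vector_fraction(n_support_vectors: int, n_training: int) -> float:
    """Model complexity expressed as support vectors per training sample."""
    if n_training <= 0:
        raise ValueError("n_training must be positive")
    return n_support_vectors / n_training


def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of samples of the target class that the model accepts."""
    return true_pos / (true_pos + false_neg)


def selectivity(true_neg: int, false_pos: int) -> float:
    """Fraction of samples of the other class that the model rejects."""
    return true_neg / (true_neg + false_pos)


# Values stated in the abstract: 7 support vectors, 40 training samples.
fraction = support_vector_fraction(7, 40)
print(f"Support vectors / training samples: {fraction:.1%}")

# Hypothetical confusion counts (NOT from the paper), for illustration only.
example_sensitivity = sensitivity(true_pos=9, false_neg=1)
example_selectivity = selectivity(true_neg=8, false_pos=2)
print(f"Example sensitivity: {example_sensitivity:.2f}")
print(f"Example selectivity: {example_selectivity:.2f}")
```

With the abstract's numbers the ratio is 7/40 = 17.5%, in line with the roughly 17% reported.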