Engineering and Applied Science Research (Dec 2019)

Optimal data division for empowering artificial neural network models employing a modified M-SPXY algorithm

  • Wirote Apinantanakon,
  • Khamron Sunat,
  • Joel Alan Kinmond

DOI
https://doi.org/10.14456/easr.2019.31
Journal volume & issue
Vol. 46, no. 4
pp. 276 – 284

Abstract

Data splitting, the division of a pool of samples into training and testing subsets, is an important step in building artificial neural network (ANN) models. In general, random data splitting is favored, without considering the quality of the data used in the training step of the neural network. The drawback of poor data splitting is that it degrades the performance of the neural network when the data involve complex matrices or multivariate modeling. To overcome this drawback, this paper presents the proposed M-SPXY method, a modified version of the Sample set Partitioning based on joint X-y distances (SPXY) method. The proposed method performs better than the modified Kennard-Stone (KS) method that uses Mahalanobis distances (MDKS). In our experiments, the proposed approach was compared against other data splitting methods on data sets from the University of California, Irvine (UCI) repository, processed through an Extreme Learning Machine (ELM) neural network. Performance was measured in terms of classification accuracy. The results indicate that the classification accuracy of the proposed M-SPXY method is superior to that of the MDKS data splitting method.
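
The abstract does not describe the M-SPXY modification itself, so the following is only a minimal sketch of the standard SPXY selection rule that the method builds on: a Kennard-Stone style max-min selection applied to the sum of normalized X and y distances. The function names (`pairwise_dist`, `spxy_split`) and the optional `mahalanobis` flag are illustrative assumptions, not the authors' implementation, and treating class labels as numeric y values is likewise an assumption.

```python
import numpy as np

def pairwise_dist(A, VI=None):
    """Pairwise distances between the rows of A.
    Euclidean by default; Mahalanobis if an inverse covariance VI is given."""
    diff = A[:, None, :] - A[None, :, :]
    if VI is None:
        return np.sqrt((diff ** 2).sum(axis=-1))
    sq = np.einsum("ijk,kl,ijl->ij", diff, VI, diff)
    return np.sqrt(np.maximum(sq, 0.0))  # clip tiny negative rounding errors

def spxy_split(X, y, n_train, mahalanobis=False):
    """Hypothetical helper: select n_train (>= 2) training indices by
    max-min selection on joint X-y distances; the rest form the test set."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float).reshape(len(X), -1)
    VI = None
    if mahalanobis:  # assumption: Mahalanobis distance in X space only
        VI = np.linalg.pinv(np.atleast_2d(np.cov(X, rowvar=False)))
    dx = pairwise_dist(X, VI)
    dy = pairwise_dist(y)
    # Joint distance: each part is scaled by its maximum so X and y weigh equally.
    d = dx / max(dx.max(), 1e-12) + dy / max(dy.max(), 1e-12)
    # Start from the two samples that are farthest apart.
    i, j = np.unravel_index(np.argmax(d), d.shape)
    selected = [int(i), int(j)]
    remaining = [k for k in range(len(X)) if k not in selected]
    while len(selected) < n_train:
        # For each remaining sample, distance to its nearest selected sample.
        min_d = d[np.ix_(remaining, selected)].min(axis=1)
        pick = remaining[int(np.argmax(min_d))]
        selected.append(pick)
        remaining.remove(pick)
    return np.array(selected), np.array(remaining)

# Usage on synthetic data (not the UCI data sets used in the paper).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(20, 3))
y_demo = rng.integers(0, 2, size=20)
train_idx, test_idx = spxy_split(X_demo, y_demo, n_train=12)
```

The selected indices would then define the training subset passed to the classifier, with the remaining indices held out for testing.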

Keywords