A methodology for the design of experiments in computational intelligence with multiple regression models

Carlos Fernandez-Lozano; Marcos Gestal; Cristian R. Munteanu; Julian Dorado; Alejandro Pazos

doi:10.7717/peerj.2721

PeerJ (Dec 2016)

Affiliations

Carlos Fernandez-Lozano: Information and Communications Technologies Department, University of A Coruña, A Coruña, Spain
Marcos Gestal: Information and Communications Technologies Department, University of A Coruña, A Coruña, Spain
Cristian R. Munteanu: Information and Communications Technologies Department, University of A Coruña, A Coruña, Spain
Julian Dorado: Information and Communications Technologies Department, University of A Coruña, A Coruña, Spain
Alejandro Pazos: Information and Communications Technologies Department, University of A Coruña, A Coruña, Spain

DOI: https://doi.org/10.7717/peerj.2721
Journal volume & issue: Vol. 4, p. e2721

Abstract

The design of experiments and the validation of the results achieved with them are vital in any research study. This paper focuses on the use of different Machine Learning approaches for regression tasks in the field of Computational Intelligence, and especially on a correct comparison of the results provided by different methods, as those techniques are complex systems that require further study to be fully understood. A methodology commonly accepted in Computational Intelligence is implemented in an R package called RRegrs. This package includes ten simple and complex regression models to carry out predictive modeling using Machine Learning and well-known regression algorithms. The framework for experimental design presented herein is evaluated and validated against RRegrs. Our results differ for three out of five state-of-the-art simple datasets, and the selection of the best model according to our proposal is statistically significant and relevant. When using this kind of algorithm, a statistical approach is needed to indicate whether the differences between models are statistically significant. Furthermore, on three real complex datasets our results report different best models than the previously published methodology. Our final goal is to provide a complete methodology, organized in steps, for comparing the results obtained in Computational Intelligence problems as well as in other fields, such as bioinformatics and cheminformatics, given that our proposal is open and modifiable.
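The statistical comparison the abstract calls for can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not name a specific test, so the Friedman rank test is swapped in here as one common choice for comparing several models over the same runs. The RMSE values are made up, and the names `rank_row`, `friedman_statistic`, and `rmse` are hypothetical and are not part of RRegrs or the published methodology.

```python
from statistics import mean


def rank_row(values):
    """Rank values ascending (1 = lowest error); tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks


def friedman_statistic(errors):
    """errors[r][m] is the error of model m on run r.

    Returns (chi2, mean_ranks). No tie correction is applied.
    """
    n = len(errors)        # number of runs (blocks)
    k = len(errors[0])     # number of models (treatments)
    ranks = [rank_row(row) for row in errors]
    mean_ranks = [mean(r[m] for r in ranks) for m in range(k)]
    expected = (k + 1) / 2  # mean rank of every model under the null hypothesis
    chi2 = 12 * n / (k * (k + 1)) * sum((r - expected) ** 2 for r in mean_ranks)
    return chi2, mean_ranks


# Hypothetical RMSE values: 6 cross-validation runs (rows) x 3 models (columns).
rmse = [
    [0.41, 0.47, 0.55],
    [0.39, 0.45, 0.52],
    [0.43, 0.44, 0.58],
    [0.40, 0.49, 0.51],
    [0.38, 0.46, 0.57],
    [0.42, 0.48, 0.53],
]

chi2, mean_ranks = friedman_statistic(rmse)

# Critical value of the chi-square distribution, 2 degrees of freedom, alpha = 0.05.
CRITICAL_DF2_ALPHA05 = 5.991
significant = chi2 > CRITICAL_DF2_ALPHA05

print(f"mean ranks: {mean_ranks}, chi2 = {chi2:.2f}, significant: {significant}")
```

With so few runs the chi-square approximation is rough, so the result should be read as indicative only. A significant outcome says only that at least one model differs from the others; identifying which one would require a post-hoc pairwise procedure, which this sketch does not include.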

Published in PeerJ

ISSN: 2167-8359 (Online)
Publisher: PeerJ Inc.
Country of publisher: United States
LCC subjects: Medicine; Science: Biology (General)
Website: https://peerj.com/
