SAGE Open (Jun 2024)

Integrating Relative Efficiency Models with Machine Learning Algorithms for Performance Prediction

  • Marcos Gonçalves Perroni,
  • Claudimar Pereira da Veiga,
  • Elaine Forteski,
  • Diego Antonio Bittencourt Marconatto,
  • Wesley Vieira da Silva,
  • Carlos Otávio Senff,
  • Zhaohui Su

DOI
https://doi.org/10.1177/21582440241257800
Journal volume & issue
Vol. 14

Abstract


Predicting operational performance enables organizations to set operational effectiveness goals that account for different combinations of resources. Performance measurement is well established thanks to advances in relative efficiency analysis techniques, including data envelopment analysis (DEA) and stochastic frontier analysis (SFA), but these methods lack predictive capability. This paper proposes an approach to performance prediction that integrates relative efficiency measurement models with machine learning algorithms. The analysis used data from the energy assessment project offered to small and medium-sized manufacturing companies in the United States (n = 7,548), with sales as the output and six inputs: number of employees, hours of operation, electricity, natural gas, cost of electricity, and cost of natural gas. Performance was estimated with both parametric (SFA) and non-parametric (DEA) methods. Predictions were then benchmarked across six machine learning algorithms: regression (LM), support vector machine (SVM), K-nearest neighbor (KNN), linear discriminant analysis (LDA), random forest (RF), and decision tree (DT). The findings show that it is possible to identify the best prediction algorithm for a given performance model. However, the predicted performance may differ when different performance measurement strategies or machine learning model configurations are used. For regression, the combination of SFA-LOG and SVM performed best, and DEA-VRS/IRS excelled with random forest; the RF algorithm was the best fit across all performance approaches. The error rate depends on both the algorithm and the performance model, and the number of classes must be reduced to obtain a higher success rate.
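The two-stage idea described in the abstract (first score each firm's relative efficiency, then train a learner to predict that score or its class) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy data, the helper names `dea_crs_single`, `to_classes`, and `knn_predict`, the equal-width binning, and the use of a single input and single output. With one input and one output, constant-returns DEA efficiency reduces to each firm's output/input ratio divided by the best observed ratio, so no linear programming solver is needed here; the multi-input models in the paper would require one.

```python
from collections import Counter
from math import dist

# Toy data (hypothetical, not from the paper): one input and one output per firm.
employees = [2.0, 4.0, 5.0, 8.0, 10.0, 10.0]   # input
sales = [4.0, 4.0, 10.0, 4.0, 20.0, 5.0]       # output


def dea_crs_single(inputs, outputs):
    """Single-input, single-output CRS efficiency: each firm's
    output/input ratio relative to the best ratio in the sample."""
    ratios = [y / x for x, y in zip(inputs, outputs)]
    best = max(ratios)
    return [r / best for r in ratios]


def to_classes(scores, n_classes):
    """Bin efficiency scores in (0, 1] into equal-width classes,
    where 0 is the least efficient class (binning rule is an assumption)."""
    return [min(int(s * n_classes), n_classes - 1) for s in scores]


def knn_predict(train_x, train_y, query, k=3):
    """Predict a class by majority vote among the k nearest
    training points under Euclidean distance."""
    nearest = sorted(zip(train_x, train_y),
                     key=lambda pair: dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Stage 1: measure relative efficiency and turn it into class labels.
scores = dea_crs_single(employees, sales)
labels = to_classes(scores, n_classes=2)

# Stage 2: predict the efficiency class of an unseen firm from its features.
features = list(zip(employees, sales))
predicted = knn_predict(features, labels, query=(9.0, 18.0), k=3)
```

In this sketch, `n_classes` is the setting that corresponds to the number of classes discussed in the abstract; the toy data is too small to say anything about how it affects the success rate.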