Applied Sciences (Apr 2024)

Impacts of Feature Selection on Predicting Machine Failures by Machine Learning Algorithms

  • Francisco Elânio Bezerra,
  • Geraldo Cardoso de Oliveira Neto,
  • Gabriel Magalhães Cervi,
  • Rafaella Francesconi Mazetto,
  • Aline Mariane de Faria,
  • Marcos Vido,
  • Gustavo Araujo Lima,
  • Sidnei Alves de Araújo,
  • Mauro Sampaio,
  • Marlene Amorim

DOI
https://doi.org/10.3390/app14083337
Journal volume & issue
Vol. 14, no. 8
p. 3337

Abstract

In the context of Industry 4.0, managing large amounts of data is essential for informed decision-making in intelligent production environments. It enables, for example, predictive maintenance, which helps anticipate machine and equipment failures and identify their causes, optimize processes, and promote proactive management of human, financial, and material resources. However, generating accurate information for decision-making requires suitable data preprocessing and analysis techniques. This study explores the identification of machine failures based on synthetic industrial data. First, we applied four feature selection techniques to the data and compared their results: Principal Component Analysis (PCA), Minimum Redundancy Maximum Relevance (mRMR), Neighborhood Component Analysis (NCA), and Denoising Autoencoder (DAE). Next, we compared three widely known machine learning classifiers, namely Random Forest (RF), Support Vector Machine (SVM), and Multilayer Perceptron neural network (MLP), with and without feature selection. The results showed that PCA and RF outperformed the other techniques, classifying failures with accuracy, precision, and recall of 0.98, 0.97, and 0.98, respectively. Thus, this work contributes by addressing an industrial problem and by detailing techniques for identifying the most relevant variables and the machine learning algorithms best suited to predicting machine failures that negatively impact production planning. The findings can help industries prioritize the sensors and data collection that contribute most effectively to machine failure prediction.
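The comparison described in the abstract (a classifier evaluated with and without a preceding feature selection step, scored by accuracy, precision, and recall) can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' code: the data come from `make_classification` as a stand-in for the synthetic industrial dataset, and the class balance, number of PCA components, train/test split, and Random Forest settings are all assumptions chosen for the example.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def evaluate(model, X_train, X_test, y_train, y_test):
    """Fit the model and return accuracy, precision, and recall on the test split."""
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    return {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred, zero_division=0),
        "recall": recall_score(y_test, y_pred, zero_division=0),
    }


# Placeholder data (assumption): 1 = failure, 0 = normal operation.
# This is NOT the dataset used in the study.
X, y = make_classification(
    n_samples=1000,
    n_features=10,
    n_informative=5,
    weights=[0.8, 0.2],
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Baseline: Random Forest trained on all original features.
baseline = RandomForestClassifier(n_estimators=100, random_state=0)

# Variant: standardize, project onto 5 principal components (assumed value),
# then train the same kind of Random Forest on the reduced representation.
with_pca = make_pipeline(
    StandardScaler(),
    PCA(n_components=5),
    RandomForestClassifier(n_estimators=100, random_state=0),
)

results = {
    "RF": evaluate(baseline, X_train, X_test, y_train, y_test),
    "PCA + RF": evaluate(with_pca, X_train, X_test, y_train, y_test),
}

for name, scores in results.items():
    print(name, {metric: round(value, 3) for metric, value in scores.items()})
```

Wrapping the scaler, PCA, and classifier in one pipeline means the projection is fitted on the training split only, which avoids leaking test data into the feature reduction step. The scores printed by this sketch depend on the placeholder data and should not be compared with the figures reported in the abstract.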

Keywords