IEEE Access (Jan 2024)

Denoising Raman Spectra Using Autoencoder for Improved Analysis of Contamination in HDD

  • Sarun Gulyanon,
  • Somrudee Deepaisarn,
  • Sorawit Chokphantavee,
  • Sirawit Chokphantavee,
  • Phuriphan Prathipasen,
  • Seksan Laitrakun,
  • Pakorn Opaprakasit,
  • Waranrach Viriyavit,
  • Narisara Jaikaew,
  • Jirawan Jindakaew,
  • Pornchai Rakpongsiri,
  • Thawanpat Meechamnan,
  • Duangporn Sompongse

DOI
https://doi.org/10.1109/ACCESS.2024.3441824
Journal volume & issue
Vol. 12
pp. 113661 – 113676

Abstract


Small particles that contaminate hard disk drives (HDDs) can damage the device and lead to data loss. The hard disk industry therefore pays close attention to identifying the types and sources of these contaminants. However, precise identification requires expensive analytical procedures when test samples are small and scarce. Raman spectroscopy, a traditional tool, yields spectra with poor signal-to-noise ratios for sub-micron particles, so identifying noisy Raman spectra is a real burden for human experts. In this study, we propose a practically applicable pipeline consisting of a denoising autoencoder, classification by spectral gradient correlation, and a novel validation step based on an ensemble of CNN models that removes predictions with low certainty. In the experiments, three backbone models for the denoising autoencoder are studied: a multilayer perceptron (MLP), a convolutional neural network (CNN), and U-Net. The ensemble consists of eight different CNN models that act as independent machine experts, whose votes indicate agreement with the correlation approach. When agreement is low, the sample is labeled unidentified and rejected from the classification task. With this validation step, the pipeline achieves high classification accuracies of 0.965, 0.955, and 0.976 for spectra denoised with the MLP, CNN, and U-Net autoencoders, respectively. These results highlight the effectiveness of the proposed pipeline in practical applications.
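
The classification and validation steps described above can be illustrated with a minimal sketch. It is not the authors' implementation, and the abstract does not define these steps precisely. The sketch assumes that "spectral gradient correlation" means the Pearson correlation between the first derivatives of a denoised spectrum and each reference spectrum, and that validation is a simple vote count over the ensemble. The function names, the `min_agreement` threshold, and the synthetic spectra are all hypothetical, and the denoising autoencoder itself is omitted.

```python
import numpy as np


def gradient_correlation(spectrum, reference):
    """Pearson correlation between the first derivatives of two spectra.

    Assumption: this stands in for the paper's spectral gradient correlation.
    """
    return float(np.corrcoef(np.gradient(spectrum), np.gradient(reference))[0, 1])


def classify_by_gradient_correlation(spectrum, references):
    """Return the label of the reference with the highest gradient correlation."""
    scores = {
        label: gradient_correlation(spectrum, ref)
        for label, ref in references.items()
    }
    return max(scores, key=scores.get)


def validate_with_ensemble(predicted, ensemble_votes, min_agreement=5):
    """Keep the prediction only if enough ensemble members agree with it.

    ensemble_votes: labels predicted by each (hypothetical) CNN expert.
    min_agreement: hypothetical threshold; the abstract does not state one.
    Returns the label, or None when the sample is rejected as unidentified.
    """
    agreeing = sum(1 for vote in ensemble_votes if vote == predicted)
    return predicted if agreeing >= min_agreement else None


# Synthetic stand-ins for denoised spectra: one Gaussian peak per material.
x = np.linspace(0.0, 1.0, 200)
references = {
    "material_a": np.exp(-((x - 0.3) ** 2) / (2 * 0.05 ** 2)),
    "material_b": np.exp(-((x - 0.7) ** 2) / (2 * 0.05 ** 2)),
}
sample = references["material_a"] + 0.5  # same peak with a constant baseline

label = classify_by_gradient_correlation(sample, references)
accepted = validate_with_ensemble(label, ["material_a"] * 6 + ["material_b"] * 2)
rejected = validate_with_ensemble(label, ["material_a"] * 3 + ["material_b"] * 5)
```

Comparing gradients rather than raw intensities makes the score insensitive to a constant baseline offset, which is why the shifted sample still matches its reference. The rejection rule trades coverage for accuracy: samples on which the ensemble disagrees with the correlation result are left unidentified rather than forced into a class.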

Keywords