Sensors (Mar 2024)
Utilization of Synthetic Near-Infrared Spectra via Generative Adversarial Network to Improve Wood Stiffness Prediction
Abstract
Near-infrared (NIR) spectroscopy is widely used as a nondestructive evaluation (NDE) tool for predicting wood properties. When deploying NIR models, one faces challenges in ensuring representative training data, which large datasets can mitigate but often at a significant cost. Machine learning and deep learning NIR models are at an even greater disadvantage because they typically require higher sample sizes for training. In this study, NIR spectra were collected to predict the modulus of elasticity (MOE) of southern pine lumber (training set = 573 samples, testing set = 145 samples). To account for the limited size of the training data, this study employed a generative adversarial network (GAN) to generate synthetic NIR spectra. The training dataset was fed into a GAN to generate 313, 573, and 1000 synthetic spectra. The original and enhanced datasets were used to train artificial neural networks (ANNs), convolutional neural networks (CNNs), and light gradient boosting machines (LGBMs) for MOE prediction. Overall, results showed that data augmentation using GAN improved the coefficient of determination (R2) by up to 7.02% and reduced the error of predictions by up to 4.29%. ANNs and CNNs benefited more from synthetic spectra than LGBMs, which only yielded slight improvement. All models showed optimal performance when 313 synthetic spectra were added to the original training data; further additions did not improve model performance because the quality of the datapoints generated by GAN beyond a certain threshold is poor, and one of the main reasons for this can be the size of the initial training data fed into the GAN. LGBMs showed superior performances than ANNs and CNNs on both the original and enhanced training datasets, which highlights the significance of selecting an appropriate machine learning or deep learning model for NIR spectral-data analysis. The results highlighted the positive impact of GAN on the predictive performance of models utilizing NIR spectroscopy as an NDE technique and monitoring tool for wood mechanical-property evaluation. Further studies should investigate the impact of the initial size of training data, the optimal number of generated synthetic spectra, and machine learning or deep learning models that could benefit more from data augmentation using GANs.
Keywords