Heritage Science (Oct 2024)
Chemometric approaches for discriminating manufacturers of Korean handmade paper using infrared spectroscopy
Abstract
The objective of this study was to identify the manufacturer of Hanji, a Korean handmade paper widely used in conservation science. To this end, machine learning models based on attenuated total reflectance–infrared spectroscopy (ATR–IR) were developed, and their robustness and effectiveness were assessed. Principal component analysis (PCA), partial least squares–discriminant analysis (PLS–DA), decision tree (DT), and k-nearest neighbors (k-NN) models were constructed from the IR spectral data, with the spectral region between 1800 and 1500 cm⁻¹ identified as the critical input variable through Variable Importance in Projection (VIP) scores. Transforming the spectra into second-derivative spectra proved beneficial in this key region and led to significant improvements in model performance. In addition, DBSCAN-based outlier detection was effective in refining the dataset, which further improved the models. Specifically, the k-NN model, applied to the selected variables after second-derivative preprocessing, achieved an F1 score of 0.92. These findings underscore the importance of focusing on the 1800–1500 cm⁻¹ spectral range and of applying outlier detection techniques such as DBSCAN, which remove the influence of atypical data points and thereby improve the robustness and accuracy of Hanji classification models.
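The processing chain summarized above (second-derivative transformation, restriction to the 1800–1500 cm⁻¹ window, DBSCAN-based outlier removal, k-NN classification, F1 scoring) can be illustrated with a minimal sketch. It is not the study's implementation. Everything specific in it is an assumption made for illustration: the spectra are synthetic single-band curves with made-up band positions for two hypothetical manufacturers "A" and "B", the second derivative is a plain finite difference, the `eps`, `min_samples`, and `k` values are arbitrary, and the F1 score is macro-averaged. The code uses only the Python standard library so that each step is visible; in practice a library such as scikit-learn would likely be used.

```python
import math
import random
from collections import Counter


def second_derivative(spectrum):
    """Central finite-difference second derivative (drops the two end points)."""
    return [spectrum[i - 1] - 2.0 * spectrum[i] + spectrum[i + 1]
            for i in range(1, len(spectrum) - 1)]


def preprocess(wavenumbers, spectrum, low=1500.0, high=1800.0):
    """Take the second derivative, then keep only the 1800-1500 cm^-1 window."""
    d2 = second_derivative(spectrum)
    return [v for w, v in zip(wavenumbers[1:-1], d2) if low <= w <= high]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dbscan_noise_mask(points, eps, min_samples):
    """Return True for every point that DBSCAN would label as noise.

    A core point has at least min_samples points (itself included) within eps.
    Noise is any non-core point that has no core point within eps.
    """
    n = len(points)
    neighbours = [[j for j in range(n) if euclidean(points[i], points[j]) <= eps]
                  for i in range(n)]
    core = [len(nb) >= min_samples for nb in neighbours]
    return [not core[i] and not any(core[j] for j in neighbours[i])
            for i in range(n)]


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training points closest to the query."""
    nearest = sorted(zip(train_x, train_y),
                     key=lambda pair: euclidean(pair[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (averaging is an assumption)."""
    scores = []
    for cls in sorted(set(y_true)):
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def synthetic_spectrum(wavenumbers, centre, rng, noise=0.0005):
    """Hypothetical spectrum: one Gaussian band plus small random noise."""
    return [math.exp(-((w - centre) / 25.0) ** 2) + rng.gauss(0.0, noise)
            for w in wavenumbers]


def run_demo(seed=0):
    rng = random.Random(seed)
    wavenumbers = [1400.0 + 4.0 * i for i in range(126)]  # 1400-1900 cm^-1 grid
    centres = {"A": 1640.0, "B": 1600.0}  # made-up band positions

    def make(centre):
        return preprocess(wavenumbers, synthetic_spectrum(wavenumbers, centre, rng))

    train_x, train_y, test_x, test_y = [], [], [], []
    for label, centre in centres.items():
        for _ in range(6):
            train_x.append(make(centre))
            train_y.append(label)
        for _ in range(3):
            test_x.append(make(centre))
            test_y.append(label)
    # One atypical training spectrum (band at 1720 cm^-1) carrying label "A".
    train_x.append(make(1720.0))
    train_y.append("A")

    # Drop the training spectra that DBSCAN flags as noise, then classify.
    noise_mask = dbscan_noise_mask(train_x, eps=0.05, min_samples=3)
    clean_x = [x for x, bad in zip(train_x, noise_mask) if not bad]
    clean_y = [y for y, bad in zip(train_y, noise_mask) if not bad]
    predictions = [knn_predict(clean_x, clean_y, q, k=3) for q in test_x]
    return noise_mask, macro_f1(test_y, predictions)


if __name__ == "__main__":
    mask, score = run_demo()
    print("spectra flagged as outliers:", sum(mask))
    print("macro F1 on synthetic test spectra:", score)
```

The synthetic classes are far better separated than real Hanji spectra would be, so the score obtained here says nothing about the reported F1 of 0.92. The sketch only shows the order of operations, with outlier removal applied to the training spectra before the k-NN vote.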
Keywords