Journal of Wood Science (Jan 2023)

Feature importance measures from random forest regressor using near-infrared spectra for predicting carbonization characteristics of kraft lignin-derived hydrochar

  • Sung-Wook Hwang,
  • Hyunwoo Chung,
  • Taekyeong Lee,
  • Jungkyu Kim,
  • YunJin Kim,
  • Jong-Chan Kim,
  • Hyo Won Kwak,
  • In-Gyu Choi,
  • Hwanmyeong Yeo

DOI
https://doi.org/10.1186/s10086-022-02073-y
Journal volume & issue
Vol. 69, no. 1
pp. 1 – 12

Abstract

This study investigated the feature importance of near-infrared spectra from random forest regression models constructed to predict the carbonization characteristics of hydrochars produced by hydrothermal carbonization of kraft lignin. The models achieved high coefficients of determination of 0.989, 0.988, and 0.985, with root mean square errors of 0.254, 0.003, and 0.008, when predicting the carbon content, atomic O/C ratio, and atomic H/C ratio, respectively. The random forest models outperformed the multilayer perceptron models for all predictions. In the feature importance analysis, the spectral regions at 1600–1800 nm (the first overtone of C–H stretching vibrations) and 2000–2300 nm (the combination bands) were highly important for predicting the carbon content and O/C ratio, whereas the region at 1250–1711 nm contributed to predicting the H/C ratio. The random forest models trained on only the high-importance regions achieved better prediction performance than those trained on the entire spectral range, demonstrating the usefulness of the feature importance yielded by the random forest and the feasibility of applying the spectral data selectively.
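
For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how a coefficient of determination and a root mean square error are conventionally computed from measured and predicted values. It is not the authors' code; the function names and the carbon-content numbers are hypothetical and chosen only for illustration.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted values."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical carbon contents (%), not data from the study.
measured = [60.0, 62.0, 64.0, 66.0, 68.0]
predicted = [60.5, 61.5, 64.0, 66.5, 67.5]

print("R^2 :", r_squared(measured, predicted))
print("RMSE:", rmse(measured, predicted))
```

Because the root mean square error is expressed in the units of the predicted quantity, the values reported for carbon content and for the atomic ratios are not directly comparable with one another, whereas the coefficient of determination is unitless.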

Keywords