Chemical and Biological Technologies in Agriculture (Mar 2024)

Improving the performance of a spectral model to estimate total nitrogen content with small soil samples sizes

  • Weihao Tang,
  • Wenfeng Hu,
  • Chuang Li,
  • Jinjing Wu,
  • Hong Liu,
  • Chao Wang,
  • Xiaochuan Luo,
  • Rongnian Tang

DOI
https://doi.org/10.1186/s40538-024-00552-6
Journal volume & issue
Vol. 11, no. 1
pp. 1 – 14

Abstract

The application of near-infrared spectroscopy (NIRS) to the rapid quantitative analysis of soil total nitrogen (STN) is of great significance for nitrogen recycling in the ecosystem and for crop growth. However, collecting thousands of soil samples and analyzing them chemically is impracticable and, more importantly, runs counter to the advantages of NIRS as a rapid, inexpensive, and nondestructive technique. To improve estimation performance more efficiently and reduce model uncertainty when working with small sample sizes (fewer than 100), solutions based on soil particle size decomposition and model fusion were investigated. Specifically, 123 Latosol samples were collected and decomposed according to particle size to extend the limited data at multiple scales. Based on all decomposed soil groups, a hyperspectral data recapture and model decision fusion method was implemented. The results demonstrated that the proposed method increased the scale of the spectral data, extracted more STN-related spectral information, improved estimation accuracy, and reduced uncertainty. The fused model based on data from all decomposed groups yielded the best estimates on the validation set (root mean square error $$RMSE = 0.075\ \mathrm{g\,kg^{-1}}$$, $$R^2 = 0.784$$, and ratio of performance to inter-quartile distance $$RPIQ = 3.787$$). In tenfold cross-validation, the weighted fusion model with six groups of particle size data improved $$R^2_{cv}$$ by 0.307 and RPIQ by 1.015 compared with models built using conventional machine learning (ML) techniques and the limited pristine data ($$R^2_{cv} = 0.442$$, $$RMSE = 0.119$$). Therefore, when NIRS is used to build rapid and accurate STN predictive models, the proposed method shows great potential for improving the reliability of soil spectral models under small sample sizes.
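For readers unfamiliar with the three statistics reported in the abstract, the following is a minimal sketch of how RMSE, R^2, and RPIQ are commonly computed from reference and predicted values. It is not the authors' code: the function names are hypothetical, and RPIQ is assumed here to be the interquartile range of the reference values divided by the RMSE, which is the usual definition in soil spectroscopy.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean square error between reference and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


def rpiq(y_true, y_pred):
    """Ratio of performance to inter-quartile distance.

    Assumed definition: (Q3 - Q1) of the reference values divided by RMSE.
    """
    q1, q3 = np.percentile(np.asarray(y_true, dtype=float), [25, 75])
    return float((q3 - q1) / rmse(y_true, y_pred))
```

A higher RPIQ means the prediction error is small relative to the spread of the measured values, so unlike RMSE alone it can be compared across data sets with different STN ranges.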

Keywords