International Journal of Applied Earth Observations and Geoinformation (Oct 2020)
Retrieval of aboveground crop nitrogen content with a hybrid machine learning method
Abstract
Hyperspectral acquisitions have proven to be the most informative Earth observation data source for estimating the content of nitrogen (N), which is the main limiting nutrient for plant growth and thus for agricultural production. In the past, empirical algorithms have been widely employed to retrieve information on this biochemical plant component from canopy reflectance. However, these approaches do not seek a cause-effect relationship based on physical laws. Moreover, most studies relied solely on the correlation of chlorophyll content with nitrogen and thus neglected the fact that most N is bound in proteins. Our study presents a hybrid retrieval method that combines a physically based approach with machine learning regression to estimate crop N content. Within the workflow, the leaf optical properties model PROSPECT-PRO, which includes the newly calibrated specific absorption coefficients (SAC) of proteins, was coupled with the canopy reflectance model 4SAIL to form PROSAIL-PRO. The latter was then employed to generate a training database for advanced probabilistic machine learning methods: a standard homoscedastic Gaussian process (GP) and a heteroscedastic GP regression that accounts for signal-to-noise relations. Both GP models provide confidence intervals for their estimates, which sets them apart from other machine learners. Moreover, a GP-based sequential backward band removal algorithm was employed to analyze the band-specific information content of PROSAIL-PRO simulated spectra for the estimation of aboveground N. Data from multiple hyperspectral field campaigns, carried out in the framework of the future satellite mission Environmental Mapping and Analysis Program (EnMAP), were exploited for validation. In these campaigns, corn and winter wheat spectra were acquired to simulate spectral EnMAP data.
Moreover, destructive N measurements from leaves, stalks and fruits were collected separately to enable plant-organ-specific validation. The results showed that both GP models can provide accurate aboveground N estimates, with the heteroscedastic GP performing slightly better both in model testing and against in situ N measurements from leaves plus stalks, with a root mean square error (RMSE) of 2.1 g/m². However, including fruit N content in the validation degraded the results, which can be explained by the inability of the radiation to penetrate the thick tissues of stalks, corn cobs and wheat ears. GP-based band analysis identified optimal spectral settings with ten bands, mainly situated in the shortwave infrared (SWIR) spectral region. Using well-known protein absorption bands from the literature gave comparable results. Finally, the heteroscedastic GP model was successfully applied to airborne hyperspectral data for N mapping. We conclude that GP algorithms, and in particular the heteroscedastic GP, should be implemented for global agricultural monitoring of aboveground N from future imaging spectroscopy data.
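To illustrate how a GP regression yields an uncertainty alongside each estimate, the following is a minimal sketch of a standard zero-mean homoscedastic GP with a squared-exponential kernel, written in plain Python. It is not the implementation used in the study: the kernel choice, the hyperparameter values, the function names (`rbf`, `gp_predict`) and the toy data are all assumptions made for illustration, and a heteroscedastic GP would additionally model input-dependent noise.

```python
import math

def rbf(a, b, length=1.0, signal=1.0):
    """Squared-exponential kernel between two feature vectors (assumed kernel)."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return signal ** 2 * math.exp(-0.5 * d2 / length ** 2)

def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def solve_lower(L, b):
    """Forward substitution: solve L y = b."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y

def solve_upper_t(L, y):
    """Back substitution: solve L^T x = y."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_predict(X, y, x_new, noise=0.1, length=1.0, signal=1.0):
    """Posterior mean and standard deviation of the latent function at x_new.

    The same noise variance is added to every training point, which is what
    makes this GP homoscedastic.
    """
    n = len(X)
    K = [[rbf(X[i], X[j], length, signal) + (noise ** 2 if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    L = cholesky(K)
    alpha = solve_upper_t(L, solve_lower(L, y))       # (K + noise^2 I)^-1 y
    k_star = [rbf(xi, x_new, length, signal) for xi in X]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(L, k_star)
    var = rbf(x_new, x_new, length, signal) - sum(vi * vi for vi in v)
    return mean, math.sqrt(max(var, 0.0))

# Toy one-dimensional data, invented for illustration only.
X_train = [[0.0], [1.0], [2.0], [3.0]]
y_train = [0.0, 0.8, 0.9, 0.1]

mean_near, sd_near = gp_predict(X_train, y_train, [1.0], noise=0.01)
mean_far, sd_far = gp_predict(X_train, y_train, [100.0], noise=0.01)

# Approximate 95% interval under the Gaussian posterior.
interval_near = (mean_near - 1.96 * sd_near, mean_near + 1.96 * sd_near)
```

In this sketch the predictive standard deviation is small close to a training sample and returns to the prior value far away from all samples, which is the behaviour that lets a GP flag estimates made outside the range covered by its training database.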