Journal of Agricultural Machinery (Sep 2017)

The Effect of Sample Size and Data Numbering on Precision of Calibration Model to predict Soil Properties

  • H Mohamadi-Monavar

DOI
https://doi.org/10.22067/jam.v7i2.57501
Journal volume & issue
Vol. 7, no. 2
pp. 536 – 545

Abstract

Introduction: Precision agriculture (PA) is an approach that measures and manages within-field variability, including the physical and chemical properties of soil. Nondestructive, rapid visible and near-infrared (VIS-NIR) spectroscopy has shown significant correlations between reflectance spectra and these soil properties. Quantitative prediction of soil factors such as nitrogen, carbon, cation exchange capacity and clay content is very important in precision farming. This paper compares three techniques for choosing calibration samples: random selection, selection based on chemical data, and selection based on principal component analysis (PCA). Because increasing the number of samples is usually time-consuming and costly, the study sought the best of the available sampling methods for building calibration models. In addition, the effect of sample size on the accuracy of the calibration and validation models was analyzed.

Materials and Methods: Two hundred and ten soil samples were collected from a cultivated farm in Avarzaman, Hamedan province, Iran. The crop rotation was mostly potato and wheat. Samples were taken at a depth of 20 cm, passed through a 2 mm sieve, and air-dried at room temperature. Chemical analysis was performed in the soil science laboratory of the Faculty of Agricultural Engineering, Bu-Ali Sina University, Hamedan, Iran. Two spectrometers (AvaSpec-ULS 2048 UV-VIS and FT-NIR100N) were used to measure spectra covering the UV-Vis and NIR regions (220-2200 nm). Each soil sample was spread evenly in a Petri dish and scanned 20 times. Multiplicative scatter correction (MSC) and baseline correction (BC) were then applied to the raw signals using Unscrambler software. The samples were divided into two groups: a calibration group of 105 samples, and a validation group made up of the remaining samples.
In each round, 15 samples were selected at random and used to test model accuracy; another 15 randomly selected samples were then added to the previous set, and the procedure was repeated. This yielded seven groups (15, 30, ..., 105 samples) in each category.

Results and Discussion: All regression models were built on the whole pre-processed soil spectra in absorbance mode. As the number of samples in the calibration set of the random group increased, the root mean square error (RMSE) decreased nonlinearly from 0.2 to 0.13. RMSE in the chemical-data group also decreased, almost linearly, from 0.17 to 0.11, while R² and RPD increased from 0.46 to 0.72 and from 1.3 to 2.0, respectively. In the PCA-based groups, RMSE fell almost linearly from 0.19 to 0.13. Over the whole data set, the weakest models were those for potassium, which had the lowest R² (0.48), and phosphorus, which had the largest error (RMSE = 5.28). The other soil properties had a higher coefficient of determination (R² > 0.5), so the prediction models had acceptable accuracy. At least 77, 105 and 105 samples are required for a precise calibration model of OC, nitrogen and pH, respectively. Because farm conditions differ, comparing these results with previous findings is difficult. Furthermore, model accuracy did not improve when the calibration set was enlarged to the total number of samples, whereas previous studies obtained more precise models by using the entire data set. Among the soil properties studied, acidity (pH) depends little on the others. The pH model is consistent with Moros (2009), although a larger error was obtained here. No specific region of the NIR spectrum corresponds to pH, and it is usually distinguishable from the other soil properties (Kuang and Mouazen, 2011).

Conclusions: Spectroscopic methods showed good potential for detecting soil properties. MSC and BC can effectively remove irrelevant information and so improve prediction accuracy.
The different methods of selecting calibration samples gave similar results, but the PCA-based technique performed best. Moreover, continually increasing the number of samples does not always improve model accuracy. It is preferable to select samples according to the principal components (PCs) of a PCA to obtain acceptable results. Because every crop requires specific soil conditions and nutrients, models should also be developed for other soil properties.
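
The incremental random sampling scheme and the accuracy measures reported in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation (the study used Unscrambler software). The function names, the fixed random seed, and the formulas are assumptions: RMSE and R² follow their usual textbook definitions, and RPD is taken here as the sample standard deviation of the reference values divided by RMSE, which is a common convention that the abstract does not spell out.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean square error between reference and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rpd(y_true, y_pred):
    """Assumed definition: sample std. dev. of reference values / RMSE."""
    mean_t = sum(y_true) / len(y_true)
    sd = math.sqrt(sum((t - mean_t) ** 2 for t in y_true) / (len(y_true) - 1))
    return sd / rmse(y_true, y_pred)


def growing_calibration_sets(sample_ids, step=15, seed=0):
    """Return nested random subsets of size step, 2*step, ... (hypothetical helper).

    Each subset contains the previous one plus `step` newly drawn samples,
    mirroring the 15, 30, ..., 105 scheme described in the abstract.
    """
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)
    return [shuffled[:n] for n in range(step, len(shuffled) + 1, step)]


if __name__ == "__main__":
    # 105 calibration sample IDs give seven nested groups.
    groups = growing_calibration_sets(range(105))
    print([len(g) for g in groups])
```

In practice, a regression model would be fitted on each nested group and the three measures computed on the validation set, so that the change in accuracy with sample size can be compared across selection methods.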

Keywords