An appropriate data set size for digital soil mapping in Erechim, Rio Grande do Sul, Brazil

Alexandre ten Caten; Ricardo Simão Diniz Dalmolin; Fabrício de Araújo Pedron; Luis Fernando Chimelo Ruiz; Carlos Antônio da Silva

doi:10.1590/S0100-06832013000200007

Revista Brasileira de Ciência do Solo (Apr 2013)

Affiliations

Alexandre ten Caten: Universidade Federal de Santa Catarina
Ricardo Simão Diniz Dalmolin: Universidade Federal de Santa Maria
Fabrício de Araújo Pedron: Universidade Federal de Santa Maria
Luis Fernando Chimelo Ruiz: Universidade Federal de Santa Maria
Carlos Antônio da Silva: Universidade Regional Integrada

DOI: https://doi.org/10.1590/S0100-06832013000200007
Journal volume & issue: Vol. 37, no. 2
pp. 359 – 366

Abstract

Digital data sources can introduce a high degree of redundancy into the data available for fitting the predictive models used in Digital Soil Mapping (DSM). Among these models, the Decision Tree (DT) technique has been increasingly applied because of its capacity to deal with large data sets. The purpose of this study was to evaluate how the data volume used to generate the DT models affects the quality of soil maps. An area of 889.33 km² was chosen in the northern region of the state of Rio Grande do Sul. The soil-landscape relationship was established by field reconnaissance (reambulation) of the study area and by alignment of the units on the 1:50,000 scale topographic maps. Six predictive covariates linked to the soil-forming factors relief and organisms, together with data sets of 1, 3, 5, 10, 15, 20 and 25 % of the total data volume, were used to generate the predictive DT models in the data mining program Waikato Environment for Knowledge Analysis (WEKA). In this study, sample densities below 5 % resulted in models that were less able to capture the complexity of the spatial distribution of the soil in the study area. The trade-off between the data volume to be handled and the predictive capacity of the models was best for sample densities between 5 and 15 %. For the models based on these sample densities, the collected field data indicated a predictive mapping accuracy close to 70 %.
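
The subsetting step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure: the study used WEKA, and the abstract does not state how the subsets were drawn. Simple random sampling without replacement is assumed here, and the function name `draw_training_subsets`, the `seed` parameter and the record count in the usage lines are all hypothetical.

```python
import random

# Percentages of the total data volume listed in the abstract.
SAMPLE_PERCENTAGES = [1, 3, 5, 10, 15, 20, 25]


def draw_training_subsets(n_records, percentages=SAMPLE_PERCENTAGES, seed=0):
    """Return {percentage: sorted list of record indices}.

    Assumption: simple random sampling without replacement. The abstract
    does not say which sampling scheme the authors used.
    """
    rng = random.Random(seed)  # fixed seed so the subsets are reproducible
    subsets = {}
    for pct in percentages:
        # At least one record, even for very small data sets.
        size = max(1, round(n_records * pct / 100))
        subsets[pct] = sorted(rng.sample(range(n_records), size))
    return subsets


if __name__ == "__main__":
    # Hypothetical total of 10,000 records (e.g. raster cells with covariates).
    for pct, indices in draw_training_subsets(n_records=10_000).items():
        print(f"{pct:>2} % -> {len(indices)} records")
```

Each index list could then be used to select the rows passed to a decision tree learner, so that models trained on different data volumes can be compared against the same field validation data.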

Published in Revista Brasileira de Ciência do Solo

ISSN: 0100-0683 (Print); 1806-9657 (Online)
Publisher: Sociedade Brasileira de Ciência do Solo
Country of publisher: Brazil
LCC subjects: Agriculture: Agriculture (General)
Website: https://www.rbcsjournal.org/pt-br/
