Geoscience Data Journal (Jun 2022)
Thematic quality assessment of land surface geospatial data based on confusion matrices: A matrix set for research on measures and procedures
Abstract
The confusion matrix has long been adopted as the ‘de facto’ and ‘de jure’ standard method of reporting the thematic accuracy assessment of any land surface geospatial dataset. This type of data supports decision‐making in many different fields, so suitable quality is essential for making the best decisions. Nevertheless, the creation and exploitation of the confusion matrix remains an open topic, with issues related to sampling design, the quantitative indices derived from the matrix, the statistical hypotheses that could be applied, etc. A dataset of confusion matrices would be useful to researchers working on these issues. We have developed such a dataset by retrieving confusion matrices from the literature, mainly research articles published in scientific journals indexed in the Web of Science (WoS). We have collected almost 200 matrices in a database. The database gives access to the complete matrices and allows queries on properties of the matrices and of the projects in which they were developed, such as matrix size, sample size, year of data capture, class labels, quality indices used, and the extent and location of the project (where available).
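As an illustration of the kind of properties and quantitative indices mentioned above, the following minimal sketch computes the matrix size, the sample size, the overall accuracy and Cohen's kappa for a single confusion matrix. The function name `matrix_properties` and the example counts are hypothetical and are not taken from the dataset; the abstract does not state which indices the database records, so overall accuracy and kappa are used here only as two widely known examples.

```python
def matrix_properties(matrix):
    """Return basic properties and two common indices of a square confusion matrix.

    `matrix` is a list of rows of non-negative counts; matrix[i][j] is the number
    of samples assigned to class i by one source and to class j by the other.
    """
    k = len(matrix)
    if k == 0 or any(len(row) != k for row in matrix):
        raise ValueError("confusion matrix must be square and non-empty")

    n = sum(sum(row) for row in matrix)          # sample size
    if n == 0:
        raise ValueError("confusion matrix contains no samples")

    diagonal = sum(matrix[i][i] for i in range(k))
    overall_accuracy = diagonal / n              # proportion of agreement

    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance, from the marginal totals
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)

    if expected == 1.0:
        kappa = float("nan")                     # undefined when all samples fall in one class
    else:
        kappa = (overall_accuracy - expected) / (1.0 - expected)

    return {
        "matrix_size": k,
        "sample_size": n,
        "overall_accuracy": overall_accuracy,
        "kappa": kappa,
    }


# Hypothetical 3-class matrix, invented for illustration only
example = [
    [50, 3, 2],
    [4, 40, 6],
    [1, 5, 39],
]
props = matrix_properties(example)
print(props)
```

Both indices are invariant to transposing the matrix, so the sketch does not depend on whether the reference data are stored in rows or in columns.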
Keywords