Big Earth Data (Jul 2023)
Classification framework and semantic labeling for Big Earth Data
Abstract
ABSTRACTBig Earth Data refers to the multidimensional integration and association of scientific data, including geography, resources, environment, ecology, and biology. An effective data classification system and label management strategy are important foundations for long-term management of data resources. The objective of this study was to construct a classification system and realize multidimensional semantic data label management for the Big Earth Data Science Engineering Program (CASEarth). This study constructed two sets of classification and coding systems that realize classification by mapping each other; namely, the geosphere-level and Sustainable Development Goals (SDGs) indicator classifications. This technique was based on natural language processing technology and solved problems with subject-word segmentation, weight calculation, and dynamic matching. A prototype system for classification and label management was constructed based on existing CASEarth datasets of more than 1,100. Furthermore, we expect our study to provide the methodology and technical support for user-oriented classification and label management services for Big Earth Data.
Keywords