The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences (Jun 2022)
TOPIC MODELING AND ASSOCIATION RULE MINING TO DISCOVER GEOSPATIAL SEMANTIC INFORMATION FROM UNSTRUCTURED DATA SOURCES
Abstract
As the number of semi-structured and unstructured information sources grows exponentially, so does the demand for eliciting the semantic information implicit in them. Semantic information elicitation processes, such as semantic information extraction, linking, and annotation, aim to make this knowledge explicit and to reveal latent aspects of the sources in support of knowledge discovery, semantic analysis, and visualization. This paper describes the implementation of Latent Dirichlet Allocation (LDA) topic modeling and association rule mining with FP-Growth for knowledge discovery. RapidMiner, an open-source data mining software package, is used to carry out this work.
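To illustrate what association rule mining produces, the following is a minimal sketch in plain Python. It is not the RapidMiner workflow used in the paper, and it enumerates itemsets by brute force rather than building an FP-tree; FP-Growth would return the same frequent itemsets more efficiently. The toy documents, the function names `frequent_itemsets` and `association_rules`, and the thresholds are all hypothetical and chosen only for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: count} for itemsets whose support >= min_support.

    Brute-force enumeration (NOT FP-Growth); fine for tiny examples only.
    """
    n = len(transactions)
    items = sorted(set().union(*transactions))
    counts = {}
    for size in range(1, len(items) + 1):
        found = False
        for candidate in combinations(items, size):
            cset = frozenset(candidate)
            count = sum(cset <= t for t in transactions)
            if count / n >= min_support:
                counts[cset] = count
                found = True
        if not found:
            # No frequent itemset of this size, so no larger one can be frequent.
            break
    return counts


def association_rules(counts, min_confidence):
    """Derive rules (antecedent, consequent, confidence) from frequent itemsets."""
    rules = []
    for itemset, count in counts.items():
        for k in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), k):
                a = frozenset(antecedent)
                # Every subset of a frequent itemset is frequent, so counts[a] exists.
                confidence = count / counts[a]
                if confidence >= min_confidence:
                    rules.append((a, itemset - a, confidence))
    return rules


# Hypothetical "documents", each reduced to a set of extracted terms.
docs = [
    {"flood", "river", "rainfall"},
    {"flood", "river"},
    {"flood", "rainfall"},
    {"river", "bridge"},
    {"flood", "river", "bridge"},
]

counts = frequent_itemsets(docs, min_support=0.4)
rules = association_rules(counts, min_confidence=0.8)
for antecedent, consequent, confidence in rules:
    print(sorted(antecedent), "->", sorted(consequent), f"confidence={confidence:.2f}")
```

In this sketch, support is the fraction of documents containing an itemset, and confidence is the support of the full itemset divided by the support of the antecedent. In a pipeline like the one the abstract describes, the term sets would presumably come from the text sources or from the topic model output rather than being written by hand.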