IEEE Access (Jan 2020)
A Similarity Measure in Formal Concept Analysis Containing General Semantic Information and Domain Information
Abstract
Formal concept analysis (FCA) has gained favor among a growing number of big data scientists owing to its unique advantages. Concept similarity measurement is key to FCA-based applications. Most previous methods are based on set theory and pay little attention to semantic information, whereas methods that do focus on semantic information usually rely on ontologies or knowledge bases to obtain the relevant semantic knowledge. However, such knowledge-based methods have difficulty capturing the domain knowledge contained in formal contexts (datasets), which makes them poorly suited to domain text data. To tackle these problems, this paper proposes a novel formal concept similarity measure that synthesizes the Semantic information in knowledge bases and the Domain information in the formal context (the S&D measure). S&D uses word vectors as word representations to obtain the semantic information in general knowledge bases, and defines novel semantic relations among intent words to obtain the domain information contained in the data itself. It can measure the similarity of concepts more comprehensively and precisely, particularly in a domain textual formal context, and it can be implemented automatically and in an unsupervised manner without any knowledge base, ontology, or external corpus. Experiments show that the method correlates better with human judgment than related approaches.
Keywords