Egyptian Informatics Journal (Jul 2018)
Feature selection for document classification based on topology
Abstract
Feature selection is the process of selecting the best subset of document features occurring in the data core for use in data mining or other applications. In this paper, we introduce a new technique that uses topological spaces to develop information retrieval systems (IRS). First, we define topological information retrieval systems (TIRS) as a generalization of the information retrieval system. Second, we apply some topological near open sets to these systems for feature selection. The indiscernibility of keywords in these systems is discussed, and its applications are given. We propose and examine an order relation that represents the relationships among documents in the document space.
Keywords: Information retrieval system, Document classification, Topological space, Feature selection, Near open sets, Rough sets
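As a rough illustration of the indiscernibility idea mentioned in the abstract, the sketch below uses the classical rough-set notion: two documents are indiscernible with respect to a keyword set when they agree on every keyword in it. It is not the paper's near-open-set method, which the abstract does not spell out. The toy documents, the keyword list, and the function names `indiscernibility_classes` and `select_keywords` are all hypothetical and chosen only for this example.

```python
from collections import defaultdict

# Toy document space (an assumption for illustration): each document is
# represented by the set of keywords it contains.
docs = {
    "d1": {"topology", "rough"},
    "d2": {"topology", "rough"},
    "d3": {"retrieval"},
    "d4": {"topology", "retrieval", "rough"},
}
keywords = ["topology", "rough", "retrieval"]


def indiscernibility_classes(docs, keywords):
    """Partition documents into classes that agree on every given keyword."""
    groups = defaultdict(list)
    for name in sorted(docs):
        # The signature records presence/absence of each keyword.
        signature = tuple(kw in docs[name] for kw in keywords)
        groups[signature].append(name)
    return sorted(groups.values())


def select_keywords(docs, keywords):
    """Greedy rough-set style reduction (hypothetical helper).

    A keyword is dropped when removing it leaves the partition of the
    document space unchanged, i.e. it adds no discriminating power.
    """
    target = indiscernibility_classes(docs, keywords)
    kept = list(keywords)
    for kw in keywords:
        trial = [k for k in kept if k != kw]
        if indiscernibility_classes(docs, trial) == target:
            kept = trial
    return kept


classes = indiscernibility_classes(docs, keywords)
selected = select_keywords(docs, keywords)
print(classes)   # [['d1', 'd2'], ['d3'], ['d4']]
print(selected)  # ['rough', 'retrieval']
```

In this toy data, "topology" and "rough" always occur together, so one of them is redundant for distinguishing documents. The greedy pass depends on keyword order, so a different ordering could keep "topology" instead of "rough".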