Anale: Seria Informatică (Jan 2009)

Using Genetic Algorithms for Texts Classification Problems

  • A. A. Shumeyko,
  • S. L. Sotnik

Journal volume & issue
Vol. VII, no. 1
pp. 325 – 340

Abstract

The avalanche of information produced by mankind has led to the idea of automating knowledge extraction, known as Data Mining ([1]). This field covers a wide spectrum of problems, from recognition of fuzzy sets to the construction of search engines. An important component of Data Mining is the processing of text information. Such problems rely on the concepts of classification and clustering ([2]). Classification consists of determining to which of a set of predefined classes an element (text) belongs. Clustering means partitioning a set of elements (texts) into clusters, the number of which is determined by how the elements of the set concentrate around the natural centers of these clusters. Solving a classification problem must start from given postulates, the most basic of which are a priori information about the initial set of texts and a measure of affinity between elements and classes.
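
To make the two postulates concrete, the following minimal Python sketch classifies a text by assigning it to the predefined class with which it has the highest affinity. It is only an illustration under stated assumptions: the abstract does not specify the affinity measure or the text representation, so cosine similarity over word counts is assumed here, and the names `bag_of_words`, `cosine_affinity`, `classify`, and the toy training texts are hypothetical. It does not reproduce the genetic algorithm approach named in the title.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Represent a text as word counts (an assumed, very simple representation)."""
    return Counter(text.lower().split())


def cosine_affinity(a, b):
    """Assumed affinity measure: cosine similarity between two word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(text, class_profiles):
    """Assign the text to the predefined class with the highest affinity."""
    vec = bag_of_words(text)
    return max(class_profiles, key=lambda label: cosine_affinity(vec, class_profiles[label]))


# A priori information: a small labeled set of texts per class (toy data).
training = {
    "sports": ["the team won the match", "a goal in the final match"],
    "finance": ["the bank raised interest rates", "stock market prices fell"],
}

# Each class is summarized by the combined word counts of its training texts.
class_profiles = {
    label: bag_of_words(" ".join(texts)) for label, texts in training.items()
}

if __name__ == "__main__":
    print(classify("interest rates and the stock market", class_profiles))
```

Here the labeled training texts play the role of the a priori information, and `cosine_affinity` plays the role of the affinity measure between an element and a class; either could be swapped for another choice without changing the structure of `classify`.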