ITM Web of Conferences (Jan 2017)

Evaluation of Filtering Methods Applied to the Unstructured Datasets in the Predictive Learning Services

  • Ilin Dmitry,
  • Mateshuk Egor,
  • Gilaztdinov Rustam,
  • Bubnov Gregory

DOI
https://doi.org/10.1051/itmconf/20171002005
Journal volume & issue
Vol. 10
p. 02005

Abstract

Predictive learning services aggregate and homogenize open data from public sources, in particular from online recruitment agencies. However, a sample of vacancies may contain a varying percentage of noise due to the frequent occurrence of homonyms. This article considers two approaches to noise reduction: the first is based on cosine similarity, and the second is based on contextual words.
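
As a rough illustration of the two ideas named in the abstract, the Python sketch below filters vacancy texts for a homonymous job title. It is not the authors' implementation: the abstract does not say how texts are vectorized, which thresholds are used, or how contextual words are chosen. The bag-of-words vectors, the function names (filter_by_cosine, filter_by_context), the threshold of 0.5, and the sample texts are all assumptions made for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def filter_by_cosine(vacancies, reference, threshold=0.5):
    """Keep vacancies whose word counts are close to a reference text.

    The reference text and the threshold are hypothetical tuning inputs.
    """
    ref_vec = Counter(tokenize(reference))
    return [
        v for v in vacancies
        if cosine_similarity(Counter(tokenize(v)), ref_vec) >= threshold
    ]


def filter_by_context(vacancies, context_words, min_hits=1):
    """Keep vacancies that mention at least min_hits contextual words.

    The contextual word list is assumed to be supplied by hand.
    """
    context = set(context_words)
    return [
        v for v in vacancies
        if len(context & set(tokenize(v))) >= min_hits
    ]


# Invented sample data: "architect" is a homonym across two professions.
vacancies = [
    "Software architect: design distributed systems and review code",
    "Architect: design residential buildings and prepare construction drawings",
]
reference = "software architect design systems code services"

kept_by_cosine = filter_by_cosine(vacancies, reference, threshold=0.5)
kept_by_context = filter_by_context(vacancies, {"software", "code", "systems"})
```

With this sample data both filters keep only the software vacancy. In practice the threshold and the contextual word list would need tuning against labelled vacancies.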