Proceedings of the XXth Conference of Open Innovations Association FRUCT (Nov 2024)

Rapid Unsupervised Keyphrase Extraction from Single Document

  • Svetlana Popova,
  • Vera Danilova,
  • John Cardiff

DOI
https://doi.org/10.23919/FRUCT64283.2024.10749871
Journal volume & issue
Vol. 36, no. 1
pp. 609 – 616

Abstract

Keyphrases offer a concise representation of a document’s content. They are valuable for improving web search results and for enhancing tasks such as document tagging, text classification, and summarization. This makes keyphrase extraction an essential component of text mining. Among the constraints and features widely used in existing keyphrase extraction methods, we identified several effective techniques that have not yet been used together: Part-of-Speech (PoS) restrictions, extended stop-word lists, and position-based features. To address this gap, we propose an approach that combines automatically extracted extended stop-word lists with PoS restrictions on keyphrases and incorporates positional criteria. The main goal of this work was to develop a fast keyphrase extraction algorithm built on these three features. Experimental results on the INSPEC and SemEval 2010 datasets demonstrate the effectiveness of the proposed method.
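As a rough illustration of how the three features named in the abstract could interact, the sketch below builds candidates from runs of tokens that pass a PoS filter and a stop-word filter, then ranks them by the position of their first occurrence. This is not the authors' algorithm: the function name `extract_keyphrases`, the allowed tag set, the rule that a phrase must end in a noun, the `max_len` limit, and the "earlier is better" ranking are all assumptions made for the example. The input is assumed to be already tokenized and PoS-tagged.

```python
# Hypothetical sketch: PoS restriction + stop-word list + positional ranking.
# All names and thresholds here are illustrative assumptions, not the paper's.

ALLOWED_POS = {"NOUN", "ADJ"}  # assumed PoS restriction for phrase tokens


def extract_keyphrases(tagged, stop_words, top_k=5, max_len=4):
    """Return up to top_k candidate phrases, earliest first.

    tagged     -- list of (token, pos_tag) pairs in document order
    stop_words -- set of lowercase words that may not appear in a phrase
    """
    first_seen = {}          # phrase -> token index of its first occurrence
    run, run_start = [], 0   # current run of acceptable tokens
    sentinel = [("", "END")]  # forces the final run to be closed

    for i, (tok, pos) in enumerate(list(tagged) + sentinel):
        word = tok.lower()
        if pos in ALLOWED_POS and word not in stop_words:
            if not run:
                run_start = i
            run.append((word, pos))
            continue
        # The run has ended: drop trailing non-nouns so the phrase ends in a noun.
        while run and run[-1][1] != "NOUN":
            run.pop()
        if 0 < len(run) <= max_len:
            phrase = " ".join(w for w, _ in run)
            first_seen.setdefault(phrase, run_start)
        run = []

    # Positional criterion (assumed): phrases that appear earlier rank higher.
    return sorted(first_seen, key=first_seen.get)[:top_k]


tagged = [
    ("Fast", "ADJ"), ("keyphrase", "NOUN"), ("extraction", "NOUN"),
    ("is", "VERB"), ("useful", "ADJ"), ("for", "ADP"),
    ("web", "NOUN"), ("search", "NOUN"), ("results", "NOUN"),
]
print(extract_keyphrases(tagged, stop_words={"results"}))
```

In this toy input, treating "results" as an extended stop word splits the last noun run, which is the kind of effect an automatically built stop-word list is meant to have on candidate boundaries.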

Keywords