Comparison of Unigram, HMM, CRF and Brill's Part-of-Speech Taggers Available in NLTK Library

Michal Kvet; Miroslav Potočár

doi:10.23919/FRUCT58615.2023.10143061

Proceedings of the XXth Conference of Open Innovations Association FRUCT (May 2023)

Affiliations

Michal Kvet: Zilinska univerzita v Ziline
Miroslav Potočár: UNIZA

Journal volume & issue: Vol. 33, no. 1
pp. 226 – 235

Abstract

For many NLP researchers, part-of-speech tagging is the first task they encounter in natural language processing, and it is inseparable from the taggers that perform it. We describe in detail how the unigram, hidden Markov model (HMM), conditional random field (CRF) and Brill taggers work, and then compare them. We use the implementations available in the Natural Language Toolkit (NLTK) library and do not tune their parameters. Our aim is to find out which tagger produces the best results with default settings, in other words, which one works best in "take it as it is" mode. To determine this, we run an experiment in which we track metrics including prediction time, accuracy on unknown words and the number of correctly labeled sentences. The results show that the CRF tagger achieves the highest accuracy of all the taggers compared, and that it also tags previously unseen words more accurately than any of the others.

Keywords: hidden Markov model, unigram, conditional random fields, Brill's tagger, NLTK, part-of-speech tagger, tagging
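
As a rough illustration of the kind of measurement the abstract describes, the sketch below trains a toy unigram tagger and reports token accuracy, accuracy on unknown words, the number of fully correct sentences and prediction time. It is a pure-Python stand-in, not NLTK's unigram tagger and not the authors' experimental code; the function names, the `NOUN` fallback tag, the tag labels and the four example sentences are all assumptions made for the example.

```python
import time
from collections import Counter, defaultdict


def train_unigram(tagged_sents):
    """Map each word to the tag it carries most often in the training data."""
    counts = defaultdict(Counter)
    for sent in tagged_sents:
        for word, tag in sent:
            counts[word][tag] += 1
    return {word: tags.most_common(1)[0][0] for word, tags in counts.items()}


def tag_sentence(model, words, default_tag="NOUN"):
    """Tag each word independently; unseen words get the (assumed) default tag."""
    return [(w, model.get(w, default_tag)) for w in words]


def evaluate(model, gold_sents, default_tag="NOUN"):
    """Compute metrics similar to those named in the abstract."""
    start = time.perf_counter()
    predictions = [
        tag_sentence(model, [w for w, _ in sent], default_tag)
        for sent in gold_sents
    ]
    elapsed = time.perf_counter() - start

    total = correct = unk_total = unk_correct = sent_correct = 0
    for gold, pred in zip(gold_sents, predictions):
        all_ok = True
        for (word, gold_tag), (_, pred_tag) in zip(gold, pred):
            ok = gold_tag == pred_tag
            total += 1
            correct += ok
            if word not in model:  # word never seen during training
                unk_total += 1
                unk_correct += ok
            all_ok = all_ok and ok
        sent_correct += all_ok

    return {
        "accuracy": correct / total,
        "unknown_word_accuracy": unk_correct / unk_total if unk_total else None,
        "correct_sentences": sent_correct,
        "prediction_seconds": elapsed,
    }


# Hypothetical toy data; a real comparison would use a tagged corpus.
train = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
]
test = [
    [("the", "DET"), ("dog", "NOUN"), ("sleeps", "VERB")],
    [("the", "DET"), ("bird", "NOUN"), ("sings", "VERB")],
]

model = train_unigram(train)
metrics = evaluate(model, test)
print(metrics)
```

Because the same `evaluate` function only needs a model that tags a list of words, the tagging step could be swapped for any other tagger to compare them under identical conditions.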

Published in Proceedings of the XXth Conference of Open Innovations Association FRUCT

ISSN: 2305-7254 (Print); 2343-0737 (Online)
Publisher: FRUCT
Country of publisher: Finland
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering: Telecommunication
Website: http://fruct.org/publication
