Internetikeele automaatne süntaktiline analüüs kitsenduste grammatikaga

Eesti Rakenduslingvistika Ühingu Aastaraamat. 2016;12:253-267 DOI 10.5128/ERYa12.15

 

Journal Title: Eesti Rakenduslingvistika Ühingu Aastaraamat

ISSN: 1736-2563 (Print); 2228-0677 (Online)

Publisher: Eesti Rakenduslingvistika Ühing (Estonian Association for Applied Linguistics)

Society/Institution: Eesti Rakenduslingvistika Ühing

LCC Subject Category: Language and Literature: Philology. Linguistics | Language and Literature: Ural-Altaic languages: Finnic. Baltic-Finnic

Country of publisher: Estonia

Language of fulltext: Estonian, English

Full-text formats available: PDF

 

AUTHORS

Dage Särg

EDITORIAL INFORMATION

Double blind peer review

Time From Submission to Publication: 28 weeks

 

ABSTRACT

"Syntactic analysis of Estonian netspeak using Constraint Grammar"

The paper gives an overview of an attempt to adapt the Estonian Constraint Grammar rule set to netspeak. The rule set was developed by Kaili Müürisep and Tiina Puolakainen for shallow and dependency parsing of literary Estonian, and it has previously been adapted for shallow parsing of spoken Estonian by Kaili Müürisep and Heli Uibo. To adapt the rules, a chatroom corpus was first parsed with the existing rule set. The parsed corpus was then manually revised, and the rule set was changed based on the errors that were found. The changes concerned the detection of clause boundaries and particle verbs, as well as the assignment of syntactic tags and dependency relations. The most distinctive features of netspeak appeared to be the extensive use of discourse particles and direct address, short sentences, a small percentage of attributes among the syntactic functions used in the text, and a large number of elliptical sentences in which the predicate, among other syntactic functions, can be omitted. Adapting the rule set improved the results of both shallow and dependency parsing. The most error-prone syntactic functions were subjects, predicatives, and adverbials. In dependency parsing, the largest number of errors was made in determining the governors of adverbials.