Anuario del Seminario de Filología Vasca "Julio de Urquijo" (Feb 2003)

Learning argument/adjunct distinction for Basque

  • Izaskun Aldezabal,
  • M.ª Jesús Aranzabe,
  • Koldo Gojenola,
  • Kepa Sarasola,
  • Aitziber Atutxa

DOI
https://doi.org/10.1387/asju.9711

Abstract

This paper presents experiments on lexical knowledge acquisition in the form of verbal argument information. The system obtains its data from raw corpora by applying a partial parser followed by statistical filters. We used two different statistical filters to acquire the argument information: Mutual Information and Fisher's exact test. Because of the characteristics of agglutinative languages such as Basque, the usual classification of arguments by syntactic category (such as NP or PP) is not suitable. For that reason, arguments are classified into 48 different kinds of case markers, which makes the system more fine-grained than equivalent systems developed for other languages. This work addresses the problem of learning subcategorization frames by distinguishing arguments from adjuncts, the latter being the most significant source of noise in subcategorization frame acquisition.
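To make the two statistical filters concrete, the following is a minimal sketch of how the association between a verb and a case marker could be scored from a 2x2 contingency table of co-occurrence counts. It is not the authors' implementation: the abstract does not say which variant of Mutual Information or which tail of Fisher's exact test was used, so pointwise MI and a one-sided (right-tail) test are assumptions here, and the function names and all counts are hypothetical.

```python
from math import comb, log2


def pointwise_mi(a, b, c, d):
    """Pointwise mutual information (in bits) between a verb and a case marker.

    Cells of the 2x2 table (hypothetical layout):
      a = verb with case,     b = verb without case,
      c = other verbs with case, d = other verbs without case.
    """
    n = a + b + c + d
    # log2( P(verb, case) / (P(verb) * P(case)) )
    return log2(a * n / ((a + b) * (a + c)))


def fisher_right_tail(a, b, c, d):
    """One-sided Fisher's exact test: probability of seeing at least `a`
    verb+case co-occurrences if verb and case were independent."""
    n = a + b + c + d
    verb_total = a + b   # occurrences of the verb
    case_total = a + c   # occurrences of the case marker
    denominator = comb(n, verb_total)
    k_max = min(verb_total, case_total)
    # Sum hypergeometric probabilities for tables at least as extreme.
    numerator = sum(
        comb(case_total, k) * comb(n - case_total, verb_total - k)
        for k in range(a, k_max + 1)
    )
    return numerator / denominator


# Hypothetical counts: a verb seen 100 times, 30 of them with a given case
# marker, in a corpus of 10,000 verb instances where the case occurs 200 times.
mi = pointwise_mi(30, 70, 170, 9730)
p_value = fisher_right_tail(30, 70, 170, 9730)
print(f"MI = {mi:.3f} bits, Fisher p = {p_value:.3g}")
```

Under this sketch, a case marker with high MI or a low p-value for a given verb would be a candidate argument, while case markers that attach to all verbs at roughly the same rate (typical of adjuncts) would score near zero MI and a large p-value. Any cut-off thresholds would have to be tuned on held-out data.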