Electronic Proceedings in Theoretical Computer Science (Aug 2017)
On h-Lexicalized Restarting Automata
Abstract
Following some previous studies on restarting automata, we introduce a refined model: the h-lexicalized restarting automaton (h-RLWW). We argue that this model is useful for expressing lexicalized syntax in computational linguistics. We compare the input languages, which are the languages traditionally considered in automata theory, to the so-called basic and h-proper languages, which are (implicitly) used by categorial grammars, the original tool for the description of lexicalized syntax. The basic and h-proper languages allow us to stress several nice properties of h-lexicalized restarting automata, and they are suitable for modeling the analysis by reduction and, subsequently, for the development of categories of a lexicalized syntax. Based on the fact that a two-way deterministic monotone restarting automaton can be transformed into an equivalent deterministic monotone RL-automaton in (Marcus) contextual form, we obtain a transformation from monotone RLWW-automata that recognize the class CFL of context-free languages as their input languages to deterministic monotone h-RLWW-automata that recognize CFL through their h-proper languages. Through this transformation we obtain automata with the complete correctness preserving property and an infinite hierarchy within CFL, based on the size of the read/write window. Additionally, we consider h-RLWW-automata that are allowed to perform multiple rewrite steps per cycle, and we establish another infinite hierarchy above CFL that is based on the number of rewrite steps that may be executed within a cycle. The corresponding separation results and their proofs illustrate the transparency of h-RLWW-automata that work with the (complete or cyclic) correctness preserving property.