Naučno-tehničeskij Vestnik Informacionnyh Tehnologij, Mehaniki i Optiki (Sep 2015)

SEMSIN SEMANTIC AND SYNTACTIC PARSER

  • K. K. Boyarsky,
  • E. A. Kanevskiy

DOI
https://doi.org/10.17586/2226-1494-2015-15-5-869-876
Journal volume & issue
Vol. 15, no. 5
pp. 869 – 876

Abstract

The paper describes the operating principles of SemSin, a semantic and syntactic parser that builds a dependency tree for Russian sentences. The parser consists of four blocks: a dictionary, a morphological analyzer, production rules, and a lexical analyzer. An important logical part of the parser is the pre-syntactic module, which harmonizes and supplements the results of morphological analysis, splits text paragraphs into individual sentences, and performs preliminary disambiguation. A characteristic feature of the parser is its open control scheme: control is exercised through a set of production rules. A varied set of commands supports both morphological and semantic-syntactic analysis of a sentence. The paper presents the order in which the rules are applied and examples of their operation. A distinctive feature of the rules is that they decide on establishing syntactic links while simultaneously removing morphological and semantic ambiguity. The lexical analyzer executes the commands and rules and runs the parser in either manual or automatic text-analysis mode. In manual mode, the analysis is interactive, with step-by-step execution of the rules and inspection of the resulting parse tree. In automatic mode, the analysis results are written to an XML file. Active use of syntactic and semantic dictionary information significantly reduces parsing ambiguity. Besides text markup, the parser can also serve as a tool for extracting information from natural-language texts.
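
The idea of a rule that establishes a syntactic link and removes ambiguity in the same step can be illustrated with a minimal Python sketch. It is not SemSin's rule language or data model: the names `Reading`, `Token`, `adj_noun_rule`, and `parse`, the relation label `attribute`, and the agreement check on case alone are all assumptions made for illustration, and the readings listed for the two word forms are a hand-picked, truncated subset.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Reading:
    """One morphological reading of a word form (hypothetical, simplified)."""
    pos: str   # part of speech, e.g. "ADJ", "NOUN", "VERB"
    case: str  # grammatical case, "" when not applicable


@dataclass
class Token:
    form: str
    readings: List[Reading]
    head: Optional[int] = None      # index of the governing token
    relation: Optional[str] = None  # label of the dependency link


def adj_noun_rule(tokens: List[Token], i: int) -> bool:
    """Hypothetical rule: attach an adjective to the following noun.

    The link is created only if some adjective reading agrees in case with
    some noun reading; readings that take part in no agreeing pair are
    dropped, so linking and disambiguation happen together.
    """
    if i + 1 >= len(tokens):
        return False
    adj, noun = tokens[i], tokens[i + 1]
    pairs = [
        (a, n)
        for a in adj.readings
        for n in noun.readings
        if a.pos == "ADJ" and n.pos == "NOUN" and a.case == n.case
    ]
    if not pairs:
        return False
    # Keep only readings that survive the agreement check (order preserved).
    adj.readings = list(dict.fromkeys(a for a, _ in pairs))
    noun.readings = list(dict.fromkeys(n for _, n in pairs))
    adj.head, adj.relation = i + 1, "attribute"
    return True


Rule = Callable[[List[Token], int], bool]


def parse(tokens: List[Token], rules: List[Rule]) -> List[Token]:
    """Apply each rule in order at every position of the sentence."""
    for rule in rules:
        for i in range(len(tokens)):
            rule(tokens, i)
    return tokens


# Toy input: truncated reading lists, chosen only to show the pruning.
tokens = parse(
    [
        Token("красной", [Reading("ADJ", "gen"), Reading("ADJ", "ins")]),
        Token("стали", [Reading("NOUN", "gen"), Reading("NOUN", "nom"),
                        Reading("VERB", "")]),
    ],
    [adj_noun_rule],
)
```

In this toy input the rule attaches the first token to the second and leaves a single reading on each, which also discards the verb reading of the second word form. A real rule set would check more features (number, gender, semantic class) and would draw them from the dictionary.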

Keywords