Scientific Journal of Astana IT University (Jun 2020)
APPLICATION OF INFORMATION TECHNOLOGIES FOR SEMANTIC TEXT PROCESSING
Abstract
An expert system for text analysis based on the heuristic knowledge of an expert linguist is proposed. The work further develops methods of computer-aided linguistic analysis of text. The algorithm of the system's operation is given, and the sequence of actions in the text analysis process is described. The research belongs to the field of computational linguistics and helps to automate text analysis. Its main purpose is to improve the machine’s understanding of the semantic structure of a text by identifying the connections between the main members of a sentence, the connections between its secondary members, the best-fitting concept for the current word, and the function of that word in the sentence. The software solution uses semantic networks and was built in Java, using NetBeans IDE 8.1, together with the CLIPS expert system shell. The article describes the main logical connections and the structure of the program. The methods and relations are examined, and the data verified, on the example of the Germanic language group. The languages of this group are similar to one another, in particular because they share a direct word order: subject + predicate + subordinate clauses. Thus, to reflect the structure of the Germanic group, it is sufficient to consider one of its languages. English is chosen because it is the most widely spoken (1.5 billion people), is an international language, has the largest vocabulary in the group (500 thousand words) and is, in our opinion, the most complex.