Cybernetics and Information Technologies (Sep 2016)
The Distribution of Semantic Fields in Author’s Texts
Abstract
This paper analyzes the frequency distribution of semantic fields of nouns and verbs in texts of English fiction. We applied the Shapiro-Wilk test to these distributions. The null hypothesis that semantic field frequencies are normally distributed across the analyzed texts is rejected for some semantic fields. This makes it possible to treat the frequency distribution of semantic fields as a categorized mixture of normal distributions. We chose text authorship as the categorization factor. Author categories for which the normality hypothesis was rejected were divided into subcategories with normal distributions. A paired Student’s t-test on the distributions of semantic fields in texts by different authors yielded a measure of how strongly authorship is represented in the structure of semantic fields. The results show that an author’s idiolect is represented in the vector space of semantic fields. Such a space can be used to analyze the authorship and authorial idiolect of texts.
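As a rough illustration of the vector space of semantic fields mentioned above, the following minimal sketch maps a tokenized text to a vector of relative semantic field frequencies and compares one field between two groups of texts. The lookup table `FIELD_OF`, the field labels, and the function names are hypothetical and are not taken from the paper. The abstract does not say how samples are paired, so the comparison here uses Welch's two-sample t statistic as a stand-in assumption. The normality test is not shown.

```python
from collections import Counter
from math import sqrt
from statistics import mean, variance

# Hypothetical word -> semantic field lookup (illustrative only, not the paper's data).
FIELD_OF = {
    "river": "noun.object",
    "hill": "noun.object",
    "sister": "noun.person",
    "captain": "noun.person",
    "walk": "verb.motion",
    "run": "verb.motion",
    "say": "verb.communication",
    "ask": "verb.communication",
}
# Fixed field order, so every text maps to a vector in the same space.
FIELDS = sorted(set(FIELD_OF.values()))


def field_frequencies(tokens):
    """Return the relative frequency of each semantic field in one text."""
    counts = Counter(FIELD_OF[t] for t in tokens if t in FIELD_OF)
    total = sum(counts.values())
    return [counts[f] / total if total else 0.0 for f in FIELDS]


def welch_t(sample_a, sample_b):
    """Welch's two-sample t statistic (assumed stand-in for the paper's t-test)."""
    var_a, var_b = variance(sample_a), variance(sample_b)
    standard_error = sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return (mean(sample_a) - mean(sample_b)) / standard_error
```

Under these assumptions, each text by an author yields one vector, and `welch_t` would be applied field by field to the frequencies collected from two authors' texts. A large absolute value for a field would suggest that the field helps separate the two authors.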
Keywords