Topics in Linguistics (Dec 2019)
Aspectual coding asymmetries: Predicting aspectual verb lengths by the effects frequency and information content
Abstract
This paper examines the interaction of aspectual verb coding, information content and verb length, which Shannon's source coding theorem states in general terms as a relation between the coding and the length of a message. We hypothesize that, based on this interaction, the lengths of aspectual verb forms can be predicted from both their aspectual coding and their information content. The point of departure is the assumption that each verb has a default aspectual value and that this value can be estimated from frequency, which, according to Zipf's law, correlates negatively with length. Using a linear mixed-effects model fitted with a random effect for LEMMA, we compare the effects of three predictors on average aspectual verb form lengths: DEFAULT (i.e. the default aspect value of verbs), the Zipfian predictor FREQUENCY and the entropy-based predictor AVERAGE INFORMATION CONTENT. The data are drawn from 18 Universal Dependencies (UD) treebanks. The impact of the predictors on verb length differs significantly across the languages in our test set and, in addition, the hypothesis of coding asymmetry does not hold for all of the languages examined.
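To make the predictors named above more concrete, the following is a minimal sketch of how a frequency-based default aspect and a simple information measure could be derived from token counts. It is an illustration under stated assumptions, not the procedure used in the paper: the toy token list, the function names `default_aspects` and `information_content`, and the use of plain unigram surprisal (rather than the paper's entropy-based AVERAGE INFORMATION CONTENT, whose exact definition is not given here) are all hypothetical.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy sample of (lemma, aspect, form) tokens.
# Illustrative only; not drawn from the UD treebanks used in the paper.
tokens = [
    ("write", "Imp", "pisal"),
    ("write", "Imp", "pisal"),
    ("write", "Imp", "pisal"),
    ("write", "Perf", "napisal"),
    ("give", "Perf", "dal"),
    ("give", "Perf", "dal"),
    ("give", "Perf", "dal"),
    ("give", "Imp", "daval"),
]


def default_aspects(tokens):
    """Assumption: a lemma's default aspect is its most frequent aspect value."""
    counts = defaultdict(Counter)
    for lemma, aspect, _form in tokens:
        counts[lemma][aspect] += 1
    return {lemma: c.most_common(1)[0][0] for lemma, c in counts.items()}


def information_content(tokens):
    """Unigram surprisal -log2 p(form), a simplified stand-in for an
    entropy-based information measure."""
    form_counts = Counter(form for _lemma, _aspect, form in tokens)
    total = sum(form_counts.values())
    return {form: -math.log2(n / total) for form, n in form_counts.items()}


defaults = default_aspects(tokens)
info = information_content(tokens)

# Print each form with its length, surprisal and whether it carries
# the default aspect of its lemma.
for lemma, aspect, form in sorted(set(tokens)):
    is_default = defaults[lemma] == aspect
    print(f"{form:8s} len={len(form)} info={info[form]:.3f} default={is_default}")
```

In this toy sample the non-default forms are both rarer and longer than their default counterparts, which is the pattern the coding asymmetry hypothesis predicts; whether it holds in real data is the empirical question the paper addresses.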
Keywords