Advances in Human-Computer Interaction (Jan 2018)

Factor Analysis of Utterances in Japanese Fiction-Writing Based on BCCWJ Speaker Information Corpus

  • Hajime Murai

DOI
https://doi.org/10.1155/2018/5056268
Journal volume & issue
Vol. 2018

Abstract

Read online

To analyse the characteristics of utterances in Japanese novels, several attributes (e.g., the speaker, listener, relationship between the speaker and listener, and gender of the speaker) were added to a randomly extracted Japanese novel corpus. A total of 887 data sets, with 5632 annotated utterances, were prepared. Based on the attribute annotated utterance corpus, the characteristics of utterance styles were extracted quantitatively. A chi-square test was used for particles and auxiliary verbs to extract utterance characteristics which reflected the genders of and relationships between the speakers and listeners. Results revealed that the use of imperative words was higher among male characters than their female counterparts, who used more particle verbs, and that auxiliaries of politeness were used more frequently for ‘coworkers’ and ‘superior authorities’. In addition, utterances varied between close and intimate relationships between the speaker and listener. Moreover, repeated factor analyses for 7576 data sets in BCCWJ speaker information corpus revealed ten typical utterance styles (neutral, frank, dialect, polite, feminine, crude, aged, interrogative, approval, and dandy). The factor scores indicated relationships between various utterance styles and fundamental attributes of speakers. Thus, results of this study would be utilisable in speaker identification tasks, automatic speech generation tasks, and scientific interpretation of stories and characters.