Electronics (Feb 2022)

Key Information Extraction and Talk Pattern Analysis Based on Big Data Technology: A Case Study on YiXi Talks

  • Hao Xu,
  • Chengzhi Jiang,
  • Chuanfeng Huang,
  • Yiyang Chen,
  • Mengxue Yi,
  • Zhentao Zhu

DOI
https://doi.org/10.3390/electronics11040640
Journal volume & issue
Vol. 11, no. 4
p. 640

Abstract

This work applies text mining methods to YiXi talks in China, extracting key information and talk patterns to enable “strategic reading” for readers and for newcomers to the speaking field. Key information is extracted as keywords using the TF-IDF algorithm, which summarizes either a single talk or a category of talks. Talk patterns are recognized by manual labeling (100 transcripts) and by rule-based automatic programs (590 transcripts). The labeling accuracy is highest for “main narrative angle” recognition (70.34%), followed by “opening form” (65.25%) and “main narrative object”; the accuracy for “ending form” is around 50%, and the overall accuracy of the rule-based automatic recognition program for talk patterns is approximately 60%. The results show that the proposed keyword extraction technology for transcripts can provide “strategic reading” to a certain extent. The mature talk pattern can be summarized as follows: speakers tend to adopt a self-introducing opening format, tell stories and experiences from a first-person narrative angle, and express expectations and prospects for the future. This pattern is reasonable and can serve as a reference for new speakers.
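
As a rough illustration of the TF-IDF keyword extraction mentioned in the abstract, the sketch below scores each term by its relative frequency in a transcript times the log of the inverse document frequency. It is only a minimal sketch under stated assumptions: the abstract does not give the exact TF-IDF variant, the tokenizer, or any parameters, so the function name `tfidf_keywords`, the `top_k` parameter, and the toy English token lists are all hypothetical stand-ins for segmented transcripts.

```python
import math
from collections import Counter


def tfidf_keywords(docs, top_k=3):
    """Return the top_k highest-scoring TF-IDF terms for each tokenized document.

    Assumed variant (not taken from the paper):
    tf = count / document length, idf = ln(N / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    results = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        scores = {
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        }
        # Rank by score (descending); break ties alphabetically for stability.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        results.append(ranked[:top_k])
    return results


# Toy, made-up token lists standing in for segmented talk transcripts.
transcripts = [
    ["city", "design", "city", "people", "story"],
    ["ocean", "whale", "ocean", "people", "story"],
    ["design", "museum", "story", "people", "museum"],
]
keywords = tfidf_keywords(transcripts, top_k=2)
```

Terms that occur in every transcript (here "people" and "story") get a score of zero, so the keywords that surface are those distinctive to one talk. Concatenating the transcripts of a category into a single document before scoring would give category-level keywords in the same way.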

Keywords