IEEE Access (Jan 2022)

Extractive Summarization of Call Transcripts

  • Pratik K. Biswas
  • Aleksandr Iakubovich

DOI
https://doi.org/10.1109/ACCESS.2022.3221404
Journal volume & issue
Vol. 10
pp. 119826 – 119840

Abstract

Automatic text summarization is one of the most challenging and interesting problems in natural language processing (NLP). Text summarization is the process of extracting the most important information from a text and presenting it concisely in fewer sentences. A call transcript is a textual description of a phone conversation between a customer (caller) and one or more agents (customer representatives). Call transcripts pose unique challenges that are not adequately addressed by most open-source automatic text summarizers, which are developed to summarize continuous texts such as articles and stories. This paper presents an internally developed method that combines topic modeling and sentence selection with punctuation restoration to condense poorly punctuated or unpunctuated call transcripts into more readable summaries. This combination distinguishes the proposed summarizer from other text summarizers. Extensive testing, evaluation, and comparison against an open-source, state-of-the-art extractive summarizer using three different pre-trained language models have demonstrated the efficacy of the proposed summarizer for call transcript summarization. The summaries it generates are shown to be more compelling and useful based on multiple criteria.

Keywords