Applied Sciences (Feb 2022)

Active Correction for Incremental Speaker Diarization of a Collection with Human in the Loop

  • Yevhenii Prokopalo,
  • Meysam Shamsi,
  • Loïc Barrault,
  • Sylvain Meignier,
  • Anthony Larcher

DOI
https://doi.org/10.3390/app12041782
Journal volume & issue
Vol. 12, no. 4
p. 1782

Abstract

State-of-the-art diarization systems now achieve decent performance, but that performance is often not good enough to deploy them without human supervision. Additionally, most approaches focus on single audio files, whereas many use cases involve multiple recordings with recurrent speakers and require incremental processing of a collection. In this paper, we propose a framework that solicits a human in the loop to correct the clustering by answering simple questions. After defining the nature of the questions for both a single file and a collection of files, we propose two algorithms to list those questions, together with the associated stopping criteria needed to limit the workload on the human in the loop. Experiments performed on the ALLIES dataset show that limited interaction with a human expert can lead to considerable improvement, with a relative diarization error rate (DER) improvement of up to 36.5% for single files and 33.29% for a collection.

Keywords