IEEE Access (Jan 2024)

An Interpersonal Dynamics Analysis Procedure With Accurate Voice Activity Detection Using Low-Cost Recording Sensors

  • Dongcheol Lim,
  • Hyewon Kang,
  • Beomseok Choi,
  • Woonki Hong,
  • Junghye Lee

DOI
https://doi.org/10.1109/ACCESS.2024.3387279
Journal volume & issue
Vol. 12
pp. 68427 – 68440

Abstract

Voice data is of special interest to organizational behavior researchers because it is easy to collect and provides a wealth of information for understanding interpersonal dynamics. For these reasons, voice activity detection (VAD) using wearable sensor badges has received considerable attention as a more objective and effective data-driven alternative to self-report methods for analyzing interpersonal dynamics. Building on VAD results, several prior works extracted conversational features, such as speaking time, turn-taking, and overlapped speech, to obtain high-level organizational insights. However, the accuracy of existing VAD models leaves room for improvement, and little research has addressed reliable algorithms for extracting conversational features. In addition, most prior studies relied on sociometric badges, which are costly electronic devices that bundle voice recorders with redundant sensors. In this paper, we propose an interpersonal dynamics analysis procedure based on low-cost commercial recording sensors, consisting of data-driven VAD modeling and conversational feature extraction. To identify voice presence accurately, the VAD modeling comprises three steps: signal preprocessing, derived-variable generation, and VAD using machine learning-based classification with a smoothing technique. Our experiments ranged from collecting datasets of verbal interaction in 15 three-person groups with commercial recording sensors to applying our procedure to those datasets. The results show that the proposed procedure achieves higher VAD accuracy than prior studies, which in turn makes the extracted conversational features more reliable. Thus, the proposed procedure can help organizational behavior researchers acquire objective information about interpersonal dynamics and obtain high-level organizational insights cost-effectively.
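
To make the last two stages of the procedure concrete, the following is a minimal sketch of how per-frame VAD decisions might be smoothed and then turned into the conversational features named above (speaking time, turn-taking, overlapped speech). It is not the authors' algorithm: the majority-vote smoothing, the feature definitions, and the names `smooth_vad`, `conversational_features`, `window`, and `frame_sec` are all illustrative assumptions, since the abstract does not specify them.

```python
import numpy as np


def smooth_vad(frames, window=5):
    """Majority-vote smoothing of binary VAD frames (assumed technique).

    `window` must be odd; edges are padded by repeating the end values.
    """
    frames = np.asarray(frames, dtype=int)
    if window % 2 == 0:
        raise ValueError("window must be odd")
    half = window // 2
    padded = np.pad(frames, half, mode="edge")
    # Number of speech frames inside each sliding window.
    counts = np.convolve(padded, np.ones(window, dtype=int), mode="valid")
    # A frame is speech when more than half of its window is speech.
    return (counts > half).astype(int)


def conversational_features(vad, frame_sec=0.1):
    """Derive simple features from per-speaker binary VAD sequences.

    `vad` maps a speaker name to an equal-length 0/1 sequence.
    The definitions below are hypothetical, chosen for illustration.
    """
    names = list(vad)
    mat = np.vstack([np.asarray(vad[n], dtype=int) for n in names])

    # Speaking time: number of active frames times frame duration.
    speaking_time = {n: float(mat[i].sum() * frame_sec)
                     for i, n in enumerate(names)}

    # Overlapped speech: frames where two or more speakers are active.
    overlap_time = float((mat.sum(axis=0) >= 2).sum() * frame_sec)

    # Turn-taking: changes of the sole active speaker,
    # skipping silent and overlapping frames.
    turns, last = 0, None
    for t in range(mat.shape[1]):
        active = np.flatnonzero(mat[:, t])
        if len(active) == 1:
            current = names[active[0]]
            if last is not None and current != last:
                turns += 1
            last = current

    return {"speaking_time": speaking_time,
            "overlap_time": overlap_time,
            "turns": turns}


if __name__ == "__main__":
    raw = {"A": [1, 1, 0, 0, 0, 0],
           "B": [0, 1, 1, 1, 0, 0],
           "C": [0, 0, 0, 0, 1, 1]}
    print(conversational_features(raw, frame_sec=1.0))
```

Smoothing before feature extraction matters because isolated misclassified frames would otherwise be counted as spurious turns or overlaps; a real study would tune the window length and frame duration to the recording setup.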

Keywords