IEEE Access (Jan 2025)
Continuous Speech-Based Fatigue Detection and Transition State Prediction for Air Traffic Controllers
Abstract
Air traffic controllers (ATCs) play a critical role in ensuring aviation safety, but their demanding workload can lead to fatigue, potentially compromising their performance. This paper investigates the speech features most useful for detecting ATC fatigue and proposes an approach to predict the timestamp at which an ATC transitions into a fatigue state within a continuous speech sample. The main contributions of this work are the creation of a continuous speech ATC dataset and the identification of a lightweight, optimal feature set for fatigue classification from ATC speech. For the first task, raw speech signals were classified into fatigue and non-fatigue categories using the top 10 features selected from the openSMILE feature set. The evaluation was carried out using several learning algorithms: XGBoost, AdaBoost, Random Forest, HistogramGB, and 1D-CNN. The ensemble algorithms demonstrated the best performance, with XGBoost achieving a maximum accuracy of 100% on the test set. Further, interpretability was analyzed using the SHAP tool, which identified the prominent features for the task. The second task involved creating a continuous speech dataset comprising approximately 18,900 samples from the ATC corpus, with an average duration of 63-65 seconds per sample. The continuous speech samples were prepared by the randomized concatenation of fatigue and non-fatigue chunks, each with a duration of approximately 15 seconds. Automated sequence labeling was performed on uniformly segmented continuous speech samples. MFCC and statistical features were extracted from the labeled continuous speech and input into recurrent neural networks, such as Bi-LSTM and Bi-GRU, for the fatigue state prediction task. Combining these features with a Bi-LSTM model achieved maximum precision, recall, F-score, and average accuracy of 99% each.
Finally, sample-wise timestamp prediction was performed using the labels: fatigue, non-fatigue, and ambiguous (transition). To the best of the authors’ knowledge, this research is the first of its kind to address continuous speech-based fatigue state prediction for ATCs. All tasks were conducted using the Civil Aviation Administration of China ATC corpus.
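The dataset construction and labeling steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the 2-second segment length, the overlap rule used to mark a segment as ambiguous, and all function names are assumptions made for illustration. Only the 15-second chunk duration and the three labels come from the abstract.

```python
import random

CHUNK_SEC = 15.0  # approximate chunk duration stated in the abstract
SEG_SEC = 2.0     # assumed uniform segment length (not given in the abstract)


def random_chunk_labels(n_chunks, seed=None):
    """Draw a random sequence of chunk labels (hypothetical helper)."""
    rng = random.Random(seed)
    return [rng.choice(["fatigue", "non-fatigue"]) for _ in range(n_chunks)]


def concat_chunks(chunk_labels, chunk_sec=CHUNK_SEC):
    """Lay chunks end to end and return (start, end, label) spans in seconds."""
    return [
        (i * chunk_sec, (i + 1) * chunk_sec, label)
        for i, label in enumerate(chunk_labels)
    ]


def label_segments(spans, seg_sec=SEG_SEC):
    """Cut the sample into uniform segments and label each one.

    Assumed rule: a segment overlapping chunks with different labels is
    'ambiguous' (a transition); otherwise it inherits the single label.
    """
    total = spans[-1][1]
    segments = []
    t = 0.0
    while t < total:
        end = min(t + seg_sec, total)
        covered = {lab for s, e, lab in spans if s < end and e > t}
        label = covered.pop() if len(covered) == 1 else "ambiguous"
        segments.append((t, end, label))
        t = end
    return segments


def transition_windows(segments):
    """Return the (start, end) time windows of all ambiguous segments."""
    return [(s, e) for s, e, lab in segments if lab == "ambiguous"]


if __name__ == "__main__":
    spans = concat_chunks(random_chunk_labels(4, seed=0))
    for window in transition_windows(label_segments(spans)):
        print("transition between %.1f s and %.1f s" % window)
```

With 15-second chunks and 2-second segments, chunk boundaries at odd multiples of 15 seconds fall inside a segment, so a label change there yields an ambiguous segment whose time window serves as the transition timestamp.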
Keywords