Naučno-tehničeskij Vestnik Informacionnyh Tehnologij, Mehaniki i Optiki (Oct 2024)

Automatic sign language translation: a review of neural network methods for recognition and synthesis of spoken and signed language

  • Denis V. Ivanko
  • Dmitry A. Ryumin

DOI
https://doi.org/10.17586/2226-1494-2024-24-5-669-686
Journal volume & issue
Vol. 24, no. 5
pp. 669–686

Abstract

This review covers modern methods and technologies for automatic machine translation for deaf and hard-of-hearing people, including recognition and synthesis of both spoken and sign languages. These methods aim to support effective communication between deaf or hard-of-hearing and hearing individuals, and the proposed solutions have potential applications in contemporary human-machine interaction interfaces. Key aspects of the new technologies are examined: methods for sign language recognition and synthesis, audiovisual speech recognition and synthesis, existing corpora for training neural network models, and current systems for automatic machine translation. Current neural network approaches are presented, including deep learning methods such as convolutional and recurrent neural networks as well as transformers. Existing corpora for training recognition and synthesis systems are analyzed, and the challenges and limitations of existing machine translation systems are evaluated. The main shortcomings and specific problems of current automatic machine translation technologies are identified, and promising solutions are proposed. Special attention is given to the applicability of automatic machine translation systems in real-world scenarios. The review highlights the need for further research in data collection and annotation, the development of new methods and neural network models, and the creation of innovative technologies for processing audio and video data, all aimed at improving the quality and efficiency of existing automatic machine translation systems.

Keywords