Intent Detection Problem Solving via Automatic DNN Hyperparameter Optimization

Jurgita Kapočiūtė-Dzikienė; Kaspars Balodis; Raivis Skadiņš

doi:10.3390/app10217426

Applied Sciences (Oct 2020)

Affiliations

Jurgita Kapočiūtė-Dzikienė: JSC Tilde Information Technology, Naugarduko Str. 100, LT-03160 Vilnius, Lithuania
Kaspars Balodis: Tilde SIA, Vienības Str. 75A, LV-1004 Riga, Latvia
Raivis Skadiņš: Tilde SIA, Vienības Str. 75A, LV-1004 Riga, Latvia

DOI: https://doi.org/10.3390/app10217426
Journal volume & issue: Vol. 10, no. 21, p. 7426

Abstract

Accurate intent-detection chatbots are usually trained on large datasets, which are not available for some languages. In search of the most accurate models, this work used three English benchmark datasets that were human-translated into four morphologically complex languages (Estonian, Latvian, Lithuanian, and Russian). It investigated two types of word embeddings (fastText and BERT), three types of deep neural network (DNN) classifiers (convolutional neural network (CNN), long short-term memory (LSTM), and bidirectional LSTM (BiLSTM)), shallower and deeper DNN architectures, and various DNN hyperparameter values. The DNN architecture and hyperparameter values were optimized automatically using the Bayesian method and random search. On the three datasets of 2/5/8 intents, the accuracies achieved were 0.991/0.890/0.712 for English, 0.972/0.890/0.644 for Estonian, 1.000/0.890/0.644 for Latvian, 0.981/0.872/0.712 for Lithuanian, and 0.972/0.881/0.661 for Russian. BERT multilingual vectorization with the CNN classifier proved to be a good choice for all datasets and all languages. Moreover, the same set of optimal hyperparameter values was determined for the majority of models. The results were also compared with previously reported values, for which the hyperparameter values of the DNN models were selected by an expert. This comparison revealed that automatically optimized models are competitive, or even more accurate, when created with larger training datasets.
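
To illustrate what automatic hyperparameter optimization by random search can look like, the following is a minimal Python sketch. It is not the authors' implementation: the search space, its value ranges, and the function names (sample_config, toy_validation_accuracy, random_search) are all hypothetical, and the objective is an arbitrary toy function standing in for training a DNN and measuring its validation accuracy. The Bayesian method mentioned in the abstract is not shown here.

```python
import random

# Hypothetical search space: names and ranges are illustrative assumptions,
# not the values used in the paper.
SEARCH_SPACE = {
    "classifier": ["cnn", "lstm", "bilstm"],
    "num_layers": [1, 2, 3],
    "units": [32, 64, 128, 256],
    "dropout": [0.0, 0.2, 0.4, 0.6],
    "learning_rate": [1e-4, 1e-3, 1e-2],
}


def sample_config(rng):
    """Draw one configuration uniformly at random from the search space."""
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}


def toy_validation_accuracy(config):
    """Arbitrary stand-in for 'train a DNN, return validation accuracy'.

    In a real setting this would build and train the model described by
    `config`; the formula below is made up purely so the sketch runs.
    """
    score = 0.6
    score += 0.1 - 0.05 * abs(config["num_layers"] - 2)
    score += 0.1 - 0.25 * abs(config["dropout"] - 0.2)
    score += 0.1 if config["learning_rate"] == 1e-3 else 0.0
    return score


def random_search(objective, n_trials, seed=0):
    """Evaluate n_trials random configurations and keep the best one."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(n_trials):
        config = sample_config(rng)
        score = objective(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


best_config, best_score = random_search(toy_validation_accuracy, n_trials=50)
print(best_config, round(best_score, 3))
```

Random search evaluates each sampled configuration independently, so it is simple to run in parallel. A Bayesian method would instead use the scores of earlier trials to decide which configuration to try next.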

Published in Applied Sciences

ISSN: 2076-3417 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Engineering (General). Civil engineering (General); Science: Biology (General); Science: Physics; Science: Chemistry
Website: http://www.mdpi.com/journal/applsci
