Journal of Library and Information Studies (Jun 2018)

Identifying Food-related Word Association and Topic Model Processing using LDA

  • Yu-Chin Li,
  • Tsung-Chih Hu,
  • Kuo-En Chang

DOI
https://doi.org/10.6182/jlis.201806_16(1).023
Journal volume & issue
Vol. 16, no. 1
pp. 23 – 43

Abstract

This paper presents an interdisciplinary study that combines natural language processing and psycholinguistics research. The latent Dirichlet allocation (LDA) model was used to compute semantic relatedness, with the aim of understanding the mechanisms and processes through which humans encode and retrieve lexical units. To test the similarity between the output of the topic model and human word association, the “Time-limited Multiple Divergent Thinking Test of Word Associative Strategy” (TLM-DTTWAS) was used to collect data and conduct tests with three food-related stimulus words. A total of 101 subjects took the tests, producing 4,251 words. The empirical results were analyzed on two levels: (1) according to the expert word association classification into taxonomic and script categories proposed by Ross and Murphy (1999); and (2) according to the associative hierarchy theory of Mednick (1962), sorting the vocabulary test results into two associative hierarchies, “steep” and “flat.” The analysis indicated that human word association displays randomness, as well as generalization and continuity. When the experimental text was passed through the LDA latent semantic model, the model output showed a highly significant correlation with the human word associations. This study is a novel attempt to train a data science model to infer and predict human concept association, which could be useful in teaching as well as in commercial applications.

Keywords