Applied Sciences (Mar 2023)

Thematic Analysis: A Corpus-Based Method for Understanding Themes/Topics of a Corpus through a Classification Process Using Long Short-Term Memory (LSTM)

  • Yaser Altameemi,
  • Mohammed Altamimi

DOI
https://doi.org/10.3390/app13053308
Journal volume & issue
Vol. 13, no. 5
p. 3308

Abstract

Using advanced algorithms to conduct a thematic analysis reduces the time required and increases the efficiency of the analysis. Long short-term memory (LSTM) is effective in text classification and natural language processing (NLP). In this study, we adopt LSTM for text classification to perform a thematic analysis using concordance lines taken from a corpus of news articles. However, the statistical and quantitative analyses of corpus linguistics are not sufficient on their own to fully identify the semantic shift of terms and concepts. We therefore suggest that a corpus should be classified from a linguistic theoretical perspective, as this helps to determine the level of linguistic patterns to be applied in the classification experiment. We also suggest investigating the concordance lines of the articles rather than only the relationships between collocates, a limitation of many previous studies. The findings highlight the effectiveness of the proposed methodology for the thematic analysis of media coverage, reaching 84% accuracy. This method provides a deeper thematic analysis than applying the classification process through collocational analysis alone.
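
To make the notion of a concordance line concrete, the following is a minimal sketch of keyword-in-context extraction. It is not the authors' pipeline: the function name `concordance_lines`, the `window` parameter, the regular-expression tokenizer, and the sample sentence are all assumptions made for illustration. In a setup like the one described above, lines of this kind would presumably be labeled with themes and passed to a classifier such as an LSTM.

```python
import re

def concordance_lines(text, keyword, window=5):
    """Return one context line per occurrence of `keyword`.

    Hypothetical helper: each line holds up to `window` tokens on either
    side of the keyword. Tokenization is a simple lowercase word split,
    which is an assumption and not the method used in the paper.
    """
    tokens = re.findall(r"\w+", text.lower())
    target = keyword.lower()
    lines = []
    for i, tok in enumerate(tokens):
        if tok == target:
            left = tokens[max(0, i - window):i]    # tokens before the keyword
            right = tokens[i + 1:i + 1 + window]   # tokens after the keyword
            lines.append(" ".join(left + [tok] + right))
    return lines

# Illustrative input only; not taken from the study's news corpus.
sample = "The reform was praised. Critics said the reform would fail quickly."
for line in concordance_lines(sample, "reform", window=2):
    print(line)
```

Unlike a collocate pair, which records only that two words co-occur, each concordance line keeps the surrounding word order, which is the extra context the abstract argues for.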

Keywords