Journal of Systemics, Cybernetics and Informatics (Feb 2019)

Current State and Modeling of Research Topics in Cybersecurity and Data Science

  • Tamir Bechor
  • Bill Jung

Journal volume & issue
Vol. 17, no. 1
pp. 129 – 156

Abstract

Arguably, the two domains closely related to information technology that have recently gained the most attention are 'cybersecurity' and 'data science'. Yet discussions at the intersection of the two domains are often muddled by ill-understood concepts and terminologies. A topic model is needed to illuminate the significant concepts and terminologies that straddle cybersecurity and data science. There is also hope of discovering under-researched topics and concepts at this intersection that deserve more attention. Motivated by these needs, this study retains most of the content of the paper already accepted at IMCIC (the International Multi-Conference on Complexity, Informatics, and Cybernetics) 2019 and supplements it with the design activities that were implicit in conducting the research. It attempts to model cybersecurity and data science topics as clusters of significant concepts and terminologies, using a text-mining approach applied to recent scholarly articles published between 2012 and 2018. To model the topic clusters, the research uses a text-mining technique composed of key-phrase extraction, topic modeling, and visualization. The LDA (latent Dirichlet allocation) model trained in the research analyzed a text corpus of 48 articles, generated significant terms from it, and found that the key terms formed six latent topic clusters. The researchers then labeled the six topic clusters for future cybersecurity and data science researchers as follows: Advanced/Unseen Attack Detection, Contextual Cybersecurity, Cybersecurity Applied Domain, Data-Driven Adversary, Power System in Cybersecurity, and Vulnerability Management. The subsequent qualitative evaluation of the articles found that the six topic clusters supplied by the LDA model unveiled latent concepts and terminologies in cybersecurity and data science, enlightening both domains.
The main contribution of this research is the identification of key concepts in the topic clusters and of text-mined key phrases from recent scholarly articles focusing on cybersecurity and data science. In doing so, the study aims to advance both fields. The research makes two additional contributions. First, topic modeling through text mining unearths terminologies in the cybersecurity domain that IST (Information Systems and Technology) researchers can investigate further. Second, using the results of the study's analysis, IST researchers can select terms of interest and further investigate the articles that supplied those terms.
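
The topic-modeling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the tooling, so this example assumes a small pure-Python collapsed Gibbs sampler for LDA, a hypothetical four-document toy corpus, a hypothetical stop-word list, and two topics instead of the six reported for the 48-article corpus. The function names (`tokenize`, `lda_gibbs`, `top_terms`) and the hyperparameters `alpha` and `beta` are illustrative choices.

```python
import random
from collections import Counter

# Hypothetical stop-word list and toy corpus (not taken from the study).
STOPWORDS = {"the", "and", "for", "with", "from", "using"}
RAW_DOCS = [
    "Anomaly detection for unseen attack traffic using machine learning",
    "Machine learning detection of advanced attack patterns",
    "Vulnerability management and patch scoring for the power grid",
    "Power grid vulnerability assessment with patch management data",
]

def tokenize(text, stopwords=STOPWORDS):
    """Lowercase, strip punctuation, drop short words and stop words."""
    words = [w.strip(".,;:()'\"").lower() for w in text.split()]
    return [w for w in words if len(w) > 2 and w not in stopwords]

def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA over tokenized documents."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * n_topics for _ in docs]          # topic counts per document
    topic_word = [Counter() for _ in range(n_topics)]   # word counts per topic
    topic_total = [0] * n_topics                        # token counts per topic
    assign = []
    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(n_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(zs)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return doc_topic, topic_word

def top_terms(topic_word, n=5):
    """Return the n most frequent terms assigned to each topic."""
    return [[w for w, c in tw.most_common(n) if c > 0] for tw in topic_word]

docs = [tokenize(text) for text in RAW_DOCS]
doc_topic, topic_word = lda_gibbs(docs, n_topics=2)
for k, terms in enumerate(top_terms(topic_word)):
    print(f"Topic {k}: {', '.join(terms)}")
```

As in the study, the model only returns clusters of terms. Assigning a label such as "Vulnerability Management" to a cluster remains a manual, qualitative step performed by the researchers.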

Keywords