Economy Informatics (Sep 2019)
Machine Learning Based System for Semantic Indexing Documents Related to Cybersecurity
Abstract
This article presents a semantic indexing software system that uses natural language processing (NLP) techniques to understand documents related to cybersecurity. The purpose of this solution is to facilitate the cybersecurity documentation process and to increase cybersecurity awareness. The solution automatically collects cybersecurity-related documents available on the internet, keeps the relevant data, performs a cognitive analysis and enriches the documents, stores the annotated documents, and offers access to them according to users' choices. The paper describes the components of the system and the methods, technologies, and tools proposed to implement it. The solution includes a domain ontology and a machine learning (ML) model specialized in cybersecurity, as well as a scraper that automatically downloads relevant data.
Keywords