IEEE Access (Jan 2019)
Technical Mapping of the Grooming Anatomy Using Machine Learning Paradigms: An Information Security Approach
Abstract
Information security includes several areas of study that are still under development. Social engineering is one of them, and it addresses the multidisciplinary challenges of cyber security. Today, the attacks associated with social engineering are diverse and include the so-called Advanced Persistent Threats (APTs). APTs have been the subject of numerous investigations; however, cyberattacks of a similar nature, such as grooming, have been excluded from these studies. In the last decade, various efforts have been made in computer science to understand the structure and approach of grooming using computational learning algorithms. Nevertheless, these studies are not aligned with information security. In this work, the study of grooming is formalized as a social engineering attack, and its stages or phases are contrasted with the life cycles associated with APTs. To achieve this goal, we used a database of real cyber-pedophile chats; this information was refined, and Latent Dirichlet Allocation (LDA) topic modeling was applied to determine the stages of the attack. Once the number of stages was determined, we gave them a linguistic context and used machine learning to train a linear model, which reached a training accuracy of 97.6%. These results indicate that the study of grooming could support research associated with social engineering and contribute to new fields of information security.
Keywords