IEEE Access (Jan 2016)
You Are Probably Not the Weakest Link: Towards Practical Prediction of Susceptibility to Semantic Social Engineering Attacks
Abstract
Semantic social engineering attacks are a pervasive threat to computer and communication systems. By employing deception rather than by exploiting technical vulnerabilities, spear-phishing, obfuscated URLs, drive-by downloads, spoofed websites, scareware, and other attacks are able to circumvent traditional technical security controls and target the user directly. Our aim is to explore the feasibility of predicting user susceptibility to deception-based attacks through attributes that can be measured, preferably in real-time and in an automated manner. Toward this goal, we have conducted two experiments, the first on 4333 users recruited on the Internet, allowing us to identify useful high-level features through association rule mining, and the second on a smaller group of 315 users, allowing us to study these features in more detail. In both experiments, participants were presented with attack and non-attack exhibits and were tested in terms of their ability to distinguish between the two. Using the data collected, we have determined practical predictors of users' susceptibility against semantic attacks to produce and evaluate a logistic regression and a random forest prediction model, with the accuracy rates of. 68 and. 71, respectively. We have observed that security training makes a noticeable difference in a user's ability to detect deception attempts, with one of the most important features being the time since last self-study, while formal security education through lectures appears to be much less useful as a predictor. Other important features were computer literacy, familiarity, and frequency of access to a specific platform. Depending on an organisation's preferences, the models learned can be configured to minimise false positives or false negatives or maximise accuracy, based on a probability threshold. For both models, a threshold choice of 0.55 would keep both false positives and false negatives below 0.2.
Keywords