Applied Sciences (Oct 2020)

Generating Optimized Guessing Candidates toward Better Password Cracking from Multi-Dictionaries Using Relativistic GAN

  • Sungyup Nam,
  • Seungho Jeon,
  • Jongsub Moon

DOI
https://doi.org/10.3390/app10207306
Journal volume & issue
Vol. 10, no. 20
p. 7306

Abstract

Read online

Despite their well-known weaknesses, passwords are still the de-facto authentication method for most online systems. Due to its importance, password cracking has been vibrantly researched both for offensive and defensive purposes. Hashcat and John the Ripper are the most popular cracking tools, allowing users to crack millions of passwords in a short time. However, their rule-based cracking has an explicit limitation of depending on password-cracking experts to come up with creative rules. To overcome this limitation, a recent trend has been to apply machine learning techniques to research on password cracking. For instance, state-of-the-art password guessing studies such as PassGAN and rPassGAN adopted a Generative Adversarial Network (GAN) and used it to generate high-quality password guesses without knowledge of password structures. However, compared with the probabilistic context-free grammar (PCFG), rPassGAN shows inferior password cracking performance in some cases. It was also observed that each password cracker has its own cracking space that does not overlap with other models. This observation led us to realize that an optimized candidate dictionary can be made by combining the password candidates generated by multiple password generation models. In this paper, we suggest a deep learning-based approach called REDPACK that addresses the weakness of the cutting-edge cracking tools based on GAN. To this end, REDPACK combines multiple password candidate generator models in an effective way. Our approach uses the discriminator of rPassGAN as the password selector. Then, by collecting passwords selectively, our model achieves a more realistic password candidate dictionary. Also, REDPACK improves password cracking performance by incorporating both the generator and the discriminator of GAN. We evaluated our system on various datasets with password candidates composed of symbols, digits, upper and lowercase letters. The results clearly show that our approach outperforms all existing approaches, including rule-based Hashcat, GAN-based PassGAN, and probability-based PCFG. The proposed model was also able to reduce the number of password candidates by up to 65%, with only 20% cracking performance loss compared to the union set of passwords cracked by multiple-generation models.

Keywords