SePass: Semantic Password Guessing Using k-nn Similarity Search in Word Embeddings

  • Conference paper
Advanced Data Mining and Applications (ADMA 2022)

Abstract

Password guessing is the process of finding a password for a secured system. Use cases include password recovery, IT forensics, and password strength measurement. Commonly used password guessing tools work with password leaks and generate candidates from these lists using handcrafted or inferred rules. These methods are often limited in their ability to produce entirely novel passwords based on vocabulary that is not included in the given password lists. However, words and phrases in these lists often share semantic similarities that are highly relevant for guessing the passwords actually in use. In this paper, we propose SePass, a novel method that uses word embeddings to discover and exploit these semantic similarities. We compare SePass to a number of competitors and show that our method is not only on par with these competitors but also generates a significantly higher number of entirely novel password candidates. Using SePass in combination with existing methods, such as PCFG, considerably improves the number of correctly guessed passwords.
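The core idea named in the title, a k-nn similarity search in a word embedding space, can be illustrated with a minimal sketch. This is not the authors' implementation: the three-dimensional vectors, the vocabulary, and the function names `cosine`, `k_nearest`, and `novel_candidates` are all hypothetical and chosen only for illustration. A realistic setup would presumably load pre-trained word vectors instead of a hand-made table.

```python
import math

# Hypothetical toy embeddings (3 dimensions, invented values).
# Assumption: a real system would load pre-trained word vectors instead.
TOY_EMBEDDINGS = {
    "dog": (0.9, 0.1, 0.0),
    "puppy": (0.85, 0.2, 0.05),
    "cat": (0.8, 0.3, 0.1),
    "soccer": (0.1, 0.9, 0.2),
    "football": (0.15, 0.85, 0.25),
    "summer": (0.0, 0.2, 0.95),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def k_nearest(word, embeddings, k):
    """Return the k words most similar to `word`, excluding the word itself."""
    if word not in embeddings:
        return []
    query = embeddings[word]
    scored = [
        (cosine(query, vec), other)
        for other, vec in embeddings.items()
        if other != word
    ]
    # Highest similarity first; ties are broken alphabetically.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for _, other in scored[:k]]


def novel_candidates(base_words, embeddings, k):
    """Collect neighbours of the base words that are not base words themselves."""
    known = set(base_words)
    candidates = []
    for word in base_words:
        for neighbour in k_nearest(word, embeddings, k):
            if neighbour not in known and neighbour not in candidates:
                candidates.append(neighbour)
    return candidates


if __name__ == "__main__":
    # Base words as they might be extracted from a leaked password list.
    print(novel_candidates(["dog", "soccer"], TOY_EMBEDDINGS, k=1))
```

In this toy setting, the base words play the role of vocabulary extracted from a password list, and the returned neighbours are words that the list itself does not contain. How such neighbours are turned into full password candidates (for example with digits or special characters) is outside the scope of this sketch.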

M. Hünemörder and L. Schäfer—Contributed equally to this research.

Notes

  1. https://fasttext.cc/docs/en/crawl-vectors.html

  2. https://hashcat.net/wiki/doku.php?id=rule_based_attack

  3. https://github.com/iphelix/pack

  4. https://github.com/hashcat/hashcat/blob/master/rules/unix-ninja-leetspeak.rule

  5. https://github.com/hashcat/hashcat/blob/master/rules/best64.rule

  6. https://github.com/RUB-SysSec/OMEN/blob/master/README.md

  7. https://github.com/lakiw/pcfg_cracker

  8. https://github.com/vialab/semantic-guesser

  9. https://github.com/brannondorsey/PassGAN

  10. https://github.com/Knuust/SePass

Correspondence to Maximilian Hünemörder.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Hünemörder, M., Schäfer, L., Schüler, NS., Eichberg, M., Kröger, P. (2022). SePass: Semantic Password Guessing Using k-nn Similarity Search in Word Embeddings. In: Chen, W., Yao, L., Cai, T., Pan, S., Shen, T., Li, X. (eds) Advanced Data Mining and Applications. ADMA 2022. Lecture Notes in Computer Science, vol 13726. Springer, Cham. https://doi.org/10.1007/978-3-031-22137-8_3

  • DOI: https://doi.org/10.1007/978-3-031-22137-8_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-22136-1

  • Online ISBN: 978-3-031-22137-8

  • eBook Packages: Computer Science, Computer Science (R0)
