Exploring the Network of Real-World Passwords: Visualization and Estimation

  • Conference paper

Abstract

The distribution of passwords has long been a focus of research on security and privacy. In this paper, we reveal the spatial structure of empirical password sets by visualizing disclosed password sets from four websites: hotmail, 12306, phpbb and yahoo. Although passwords are, in most cases, chosen independently and privately, closer scrutiny shows, surprisingly, that the networks formed by the password sets of large user populations have similar topological structure and identical properties, regardless of demographic factors and site usage characteristics. The visualized password graph is considered to be a scale-free network, with the power law a good candidate fit for its degree distribution. Furthermore, based on the proposed network graph of a password set, we show that the optimal dictionary problem in dictionary-based password cracking is equivalent in computational complexity to the dominating set problem, a well-known NP-complete problem in graph theory. Hence the optimal dictionary problem is also NP-complete.
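The link between a password graph and dictionary selection can be illustrated with a minimal sketch. It rests on assumptions the abstract does not state: two passwords are joined by an edge when their Levenshtein distance is at most 1, and a dictionary entry is taken to cover itself and its neighbours. The function names and the `max_distance` parameter are hypothetical. Because the exact problem is NP-complete, the sketch uses the standard greedy heuristic for dominating sets, which does not guarantee an optimal dictionary.

```python
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the Wagner-Fischer dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free on a match)
            ))
        prev = curr
    return prev[-1]


def build_password_graph(passwords, max_distance=1):
    """Adjacency sets; the edge rule (distance <= max_distance) is an assumption."""
    unique = sorted(set(passwords))
    graph = {p: set() for p in unique}
    for p, q in combinations(unique, 2):
        if edit_distance(p, q) <= max_distance:
            graph[p].add(q)
            graph[q].add(p)
    return graph


def greedy_dominating_set(graph):
    """Greedy heuristic: repeatedly pick the node that covers the most
    still-uncovered nodes (itself plus its neighbours)."""
    uncovered = set(graph)
    chosen = []
    while uncovered:
        best = max(sorted(graph),
                   key=lambda v: len((graph[v] | {v}) & uncovered))
        chosen.append(best)
        uncovered -= graph[best] | {best}
    return chosen


if __name__ == "__main__":
    sample = ["password", "passw0rd", "password1", "123456", "1234567", "qwerty"]
    g = build_password_graph(sample)
    print({p: len(nbrs) for p, nbrs in g.items()})  # degree of each password
    print(greedy_dominating_set(g))                 # heuristic dictionary
```

The all-pairs comparison is quadratic in the number of distinct passwords, so this form only suits small samples; a leaked data set of realistic size would need an indexing scheme for near-duplicate strings.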

Notes

  1. These data sets were disclosed after a series of severe leakages and were collected subsequently. Each of the data sets has been mentioned at least once in previous literature: for instance, hotmail in [4], 12306 in [5], phpbb in [6], and yahoo in [7]. Details are omitted to conserve space.

  2. http://www.wordle.net/.

  3. The 12306 data set is one of the data sets used in this paper. Refer to the subsequent sections for more details about the data sets.

  4. https://arxiv.org/.

  5. https://github.com/googlr/.

  6. http://tuvalu.santafe.edu/~aaronc/powerlaws/.

References

  1. Schechter, S., Herley, C., Mitzenmacher, M.: Popularity is everything: a new approach to protecting passwords from statistical-guessing attacks. In: Proceedings of the 5th USENIX Conference on Hot Topics in Security, pp. 1–8. USENIX Association (2010)

  2. Bonneau, J.: The science of guessing: analyzing an anonymized corpus of 70 million passwords. In: 2012 IEEE Symposium on Security and Privacy, pp. 538–552. IEEE (2012)

  3. Morris, R., Thompson, K.: Password security: a case history. Commun. ACM 22(11), 594–597 (1979)

  4. Malone, D., Maher, K.: Investigating the distribution of password choices. In: Proceedings of the 21st International Conference on World Wide Web, pp. 301–310. ACM (2012)

  5. Carnavalet, X.D.C.D., Mannan, M.: A large-scale evaluation of high-impact password strength meters. ACM Trans. Inf. Syst. Secur. (TISSEC) 18(1), 1 (2015)

  6. Weir, M., Aggarwal, S., Collins, M., Stern, H.: Testing metrics for password creation policies by attacking large sets of revealed passwords. In: Proceedings of the 17th ACM Conference on Computer and Communications Security, pp. 162–175. ACM (2010)

  7. Das, A., Bonneau, J., Caesar, M., Borisov, N., Wang, X.: The tangled web of password reuse. In: NDSS, vol. 14, pp. 23–26 (2014)

  8. Dell’Amico, M., Michiardi, P., Roudier, Y.: Password strength: an empirical analysis. In: INFOCOM, vol. 10, pp. 983–991 (2010)

  9. Mazurek, M.L., Komanduri, S., Vidas, T., Bauer, L., Christin, N., Cranor, L.F., Kelley, P.G., Shay, R., Ur, B.: Measuring password guessability for an entire university. In: Proceedings of the 2013 ACM SIGSAC Conference on Computer & Communications Security, pp. 173–186. ACM (2013)

  10. Voyiatzis, A.G., Fidas, C.A., Serpanos, D.N., Avouris, N.M.: An empirical study on the web password strength in Greece. In: 2011 15th Panhellenic Conference on Informatics (PCI), pp. 212–216. IEEE (2011)

  11. Li, Z., Han, W., Xu, W.: A large-scale empirical analysis of Chinese web passwords. In: USENIX Security Symposium, pp. 559–574 (2014)

  12. Wang, D., Cheng, H., Wang, P., Huang, X., Jian, G.: Zipf’s law in passwords. IEEE Trans. Inf. Forensics Secur. 12(11), 2776–2791 (2017)

  13. Narayanan, A., Shmatikov, V.: Fast dictionary attacks on passwords using time-space tradeoff. In: Proceedings of the 12th ACM Conference on Computer and Communications Security, pp. 364–372. ACM (2005)

  14. Weir, M., Aggarwal, S., De Medeiros, B., Glodek, B.: Password cracking using probabilistic context-free grammars. In: 2009 30th IEEE Symposium on Security and Privacy, pp. 391–405. IEEE (2009)

  15. Veras, R., Thorpe, J., Collins, C.: Visualizing semantics in passwords: the role of dates. In: Proceedings of the Ninth International Symposium on Visualization for Cyber Security, pp. 88–95. ACM (2012)

  16. Levenshtein, V.I.: Binary codes capable of correcting deletions, insertions and reversals. In: Soviet Physics Doklady, vol. 10, p. 707 (1966)

  17. Navarro, G.: A guided tour to approximate string matching. ACM Comput. Surv. (CSUR) 33(1), 31–88 (2001)

  18. Wagner, R.A., Fischer, M.J.: The string-to-string correction problem. J. ACM (JACM) 21(1), 168–173 (1974)

  19. Ur, B., Segreti, S.M., Bauer, L., Christin, N., Cranor, L.F., Komanduri, S., Kurilova, D., Mazurek, M.L., Melicher, W., Shay, R.: Measuring real-world accuracies and biases in modeling password guessability. In: USENIX Security Symposium, pp. 463–481 (2015)

  20. Bastian, M., Heymann, S., Jacomy, M.: Gephi: An Open Source Software for Exploring and Manipulating Networks (2009)

  21. Blondel, V.D., Guillaume, J.-L., Lambiotte, R., Lefebvre, E.: Fast unfolding of communities in large networks. J. Stat. Mech: Theory Exp. 2008(10), P10008 (2008)

  22. Barabási, A.-L., Albert, R., Jeong, H.: Scale-free characteristics of random networks: the topology of the world-wide web. Phys. A: Stat. Mech. Appl. 281(1), 69–77 (2000)

  23. R Core Team: R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria (2016)

  24. Wickham, H.: ggplot2: Elegant Graphics for Data Analysis. Springer, New York (2009). https://doi.org/10.1007/978-0-387-98141-3

  25. Alstott, J., Bullmore, E., Plenz, D.: powerlaw: a python package for analysis of heavy-tailed distributions. PLoS ONE 9(1), e85777 (2014)

  26. Clauset, A., Shalizi, C.R., Newman, M.E.: Power-law distributions in empirical data. SIAM Rev. 51(4), 661–703 (2009)

  27. Hell, P.: Graphs with given neighborhoods I. In: Proc. Colloque, Inter. CNRS, Orsay, pp. 219–223 (1976)

  28. Klein, D.V.: Foiling the cracker: a survey of, and improvements to, password security. In: Proceedings of the 2nd USENIX Security Workshop, pp. 5–14 (1990)

  29. Mónica, D., Ribeiro, C.: Local password validation using self-organizing maps. In: Kutyłowski, M., Vaidya, J. (eds.) ESORICS 2014. LNCS, vol. 8712, pp. 94–111. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-11203-9_6

  30. Hedetniemi, S.T., Laskar, R.C.: Bibliography on domination in graphs and some basic definitions of domination parameters. Discret. Math. 86(1), 257–277 (1990)

  31. Garey, M., Johnson, D.: Computers and Intractability-A Guide to NP-Completeness (1979)

Acknowledgement

The authors would like to thank Ping Wang, Tian Liu, Yongzhi Cao, Wenxin Li, Eric Liang, Kaigui Bian, Haibo Cheng, Ding Wang, Gaopeng Jian, Chen Zhu, Xin Huang, Qiancheng Gu, Hang Li, Jun Yang, Junfeng Zhang, Xuqing Liu, Xiangyu Xu, Xiang Yin, Wenying Teng, Meredith Mante, Justin Edwin Marquez, Alex Wilke and Niall Pereira for helpful conversations and the anonymous reviewers for their insightful comments. This work was sponsored by the National Science Foundation of China under grant No. 61371131.

Author information

Corresponding author

Correspondence to Xiujia Guo.

Copyright information

© 2018 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering

About this paper

Cite this paper

Guo, X., Wang, Z., Chen, Z. (2018). Exploring the Network of Real-World Passwords: Visualization and Estimation. In: Lin, X., Ghorbani, A., Ren, K., Zhu, S., Zhang, A. (eds) Security and Privacy in Communication Networks. SecureComm 2017. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 238. Springer, Cham. https://doi.org/10.1007/978-3-319-78813-5_8

  • DOI: https://doi.org/10.1007/978-3-319-78813-5_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-78812-8

  • Online ISBN: 978-3-319-78813-5

  • eBook Packages: Computer Science (R0)
