Zipf’s law analysis on the leaked Iranian users’ passwords

  • Original Paper
Journal of Computer Virology and Hacking Techniques

Abstract

Textual passwords are one of the most common methods of authentication and an important factor in system security. Knowing the actual distribution of users’ passwords can play an important role in defining password policies and preventing various attacks. Culture and language can affect the way users select passwords and, consequently, the vulnerability of those passwords to guessing attacks. The distribution of English-speaking users’ passwords may therefore not be an appropriate basis for the security analysis of non-English-speaking users’ passwords. The main purpose of this paper is to analyze the passwords of Iranian users and to investigate how they differ from those of English-speaking users. The paper also examines whether Zipf’s law, the best-known distribution for passwords, holds for Iranian passwords. The analysis shows that the popular password lengths of Iranian users do not differ much from those of users in other countries. In terms of character composition, however, Iranian users are more inclined to use numeric passwords, while English-speaking users are more inclined to use passwords made up of letters. Zipf’s law is examined on five datasets of Iranian users’ passwords using three different approaches: PDF, PDF with unpopular passwords removed, and CDF. Among these, the CDF method gave the best match, with the passwords matching a Zipf’s law distribution between 0.02 and 0.07. Finally, the robustness of Iranian users’ passwords to statistical guessing attacks is measured, and it is concluded that they are more vulnerable to guessing attacks than the passwords of English-speaking users.
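As an illustration of what a PDF-style Zipf fit involves, the following is a minimal sketch and not the authors' implementation. It assumes the common form f_r = C / r^s, where f_r is the frequency of the password at rank r, and estimates s and C by ordinary least squares on the log-log rank-frequency data. The function name `fit_zipf_pdf` and the `min_count` threshold (one possible way to drop unpopular passwords before fitting) are hypothetical choices made for this example.

```python
import math
from collections import Counter


def fit_zipf_pdf(passwords, min_count=1):
    """Fit f_r = C / r**s to a password list by log-log least squares.

    min_count is an assumed cutoff: passwords seen fewer than
    min_count times are discarded before ranks are assigned.
    Returns the pair (s, C).
    """
    counts = Counter(passwords)
    # Frequencies of the retained passwords, most popular first.
    freqs = sorted((c for c in counts.values() if c >= min_count), reverse=True)
    if len(freqs) < 2:
        raise ValueError("need at least two distinct passwords to fit a line")

    # Regress log10(frequency) on log10(rank).
    xs = [math.log10(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log10(freq) for freq in freqs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # log10 f = log10 C - s * log10 r, so s = -slope and C = 10**intercept.
    return -slope, 10 ** intercept


if __name__ == "__main__":
    # Synthetic corpus whose frequencies follow 2520 / r exactly for r = 1..10.
    corpus = []
    for r in range(1, 11):
        corpus.extend([f"pw{r}"] * (2520 // r))
    s, c = fit_zipf_pdf(corpus)
    print(f"s = {s:.3f}, C = {c:.1f}")
```

On real leaked data the long tail of passwords that occur only once tends to dominate such a regression, which is one motivation for either removing unpopular passwords or fitting the cumulative distribution instead.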

Notes

  1. https://www.radware.com/ert-report-2018/.

  2. https://diligent.com/en-gb/blog/cost-of-a-data-breach-ponemon-institute-report.

  3. These datasets are available at https://github.com/amirjalaly/Iranian-users-passwords.

  4. http://www.7k7k.com/html/about.htm.

  5. https://www.000webhost.com/.

  6. Pinglish a.k.a Finglish stands for Persian text written with English letters.

  7. http://www.physics.csbsju.edu/stats/KS-test.html.

Author information

Corresponding author

Correspondence to Amir Jalaly Bidgoly.

About this article

Cite this article

Alebouyeh, Z., Bidgoly, A.J. Zipf’s law analysis on the leaked Iranian users’ passwords. J Comput Virol Hack Tech 18, 101–116 (2022). https://doi.org/10.1007/s11416-021-00397-9

  • DOI: https://doi.org/10.1007/s11416-021-00397-9
