On the Privacy Impacts of Publicly Leaked Password Databases

  • Conference paper
Detection of Intrusions and Malware, and Vulnerability Assessment (DIMVA 2017)

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 10327)

Abstract

Hackers regularly steal data sets containing user identifiers and passwords, and these data sets often become publicly available. The most prominent and important leaks involve poor password protection mechanisms, e.g. unsalted password hashes, despite long-known recommendations. The accumulation of leaked password data sets allows the research community to study password strength estimation and password breaking, and to conduct usability and usage studies. The privacy impact of these leaks, however, has not been studied.

In this paper, we consider attackers who try to break the privacy of users without breaking a single password. We consider attacks revealing that distinct identifiers are in fact used by the same physical person. We evaluate large-scale linkability attacks based on properties of, and relations between, identifiers and password information. With these attacks, stronger passwords lead to better predictions. Using a leaked and publicly available data set containing \(130 \times 10^{6}\) encrypted passwords, we show that a privacy attacker is able to build a database containing the multiple identifiers of people, including their secret identifiers. We illustrate potential consequences by showing that a privacy attacker is capable of deanonymizing (potentially embarrassing) secret identifiers by intersecting several leaked password databases.
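The linkability idea can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that the leaked protection is deterministic and unsalted, so that equal passwords produce equal stored values, and it links identifiers purely by grouping on that stored value. The function name `link_by_ciphertext`, the `max_group_size` threshold, and the sample records are hypothetical.

```python
from collections import defaultdict


def link_by_ciphertext(records, max_group_size=3):
    """Group identifiers that share the same stored password value.

    records: iterable of (identifier, ciphertext) pairs.
    No password is ever recovered; only equality of stored values is used.
    Only small groups are kept (assumed threshold): a value shared by very
    few identifiers suggests an uncommon password, so those identifiers are
    more likely to belong to one person.
    """
    groups = defaultdict(set)
    for identifier, ciphertext in records:
        groups[ciphertext].add(identifier)
    return {
        ciphertext: sorted(identifiers)
        for ciphertext, identifiers in groups.items()
        if 2 <= len(identifiers) <= max_group_size
    }


# Hypothetical sample data: "c1" is rare, "c3" is a popular password.
records = [
    ("alice@example.org", "c1"),
    ("a.secret@example.net", "c1"),
    ("bob@example.org", "c2"),
    ("u1@example.org", "c3"),
    ("u2@example.org", "c3"),
    ("u3@example.org", "c3"),
    ("u4@example.org", "c3"),
]
linked = link_by_ciphertext(records)
```

The size threshold reflects the abstract's observation that stronger passwords lead to better predictions: a popular password shared by many identifiers says little about who owns them, while a rare one is a much stronger link.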

Notes

  1. As recalled, for example, in the OWASP Password Storage Cheat Sheet.

  2. See game http://zed0.co.uk/crossword and picture http://xkcd.com/1286.

  3. See https://www.census.gov/genealogy/www/data/2000surnames.

  4. The \(uid\) of D increases monotonically with the time of creation of the identifier. This allows a timeline to be reconstructed, e.g. by using known creation dates of some identifiers or by searching the \(name\) and \(hint\) fields for events of worldwide notoriety.

Acknowledgements

We thank the Program Committee and reviewers for the many valuable comments that significantly improved the final version of this paper.

Author information

Corresponding author

Correspondence to Olivier Heen.

A Appendix

A.1 Terms for ‘as usual’

always, usual, the rest, for all, normal, same as, standard, regular, costumbres, siempre, sempre, wie immer, toujours, habit, d’hab, comme dab, altijd.
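How this list is used is not stated in this excerpt. A plausible reading, given that the data set has a \(hint\) field, is that a hint containing one of these terms signals a password reused from elsewhere. Under that assumption, a minimal sketch of such a check could look as follows; the function name `hint_signals_reuse` and the substring-matching rule are hypothetical.

```python
# Terms copied from the list above.
AS_USUAL_TERMS = [
    "always", "usual", "the rest", "for all", "normal", "same as",
    "standard", "regular", "costumbres", "siempre", "sempre", "wie immer",
    "toujours", "habit", "d’hab", "comme dab", "altijd",
]


def hint_signals_reuse(hint):
    """Return True if the hint contains one of the 'as usual' terms.

    Matching is a case-insensitive substring test (an assumption), so it
    also fires on longer words such as "habitat". ASCII apostrophes are
    normalized to the typographic one used in the term list.
    """
    text = hint.casefold().replace("'", "’")
    return any(term in text for term in AS_USUAL_TERMS)
```

For example, `hint_signals_reuse("Same as my email")` is true, while a hint such as "name of my first dog" does not match any term.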

A.2 List of generic email addresses

abuse admin administrator contact design email info intern it legal kontakt mail marketing no-reply office post press print printer sales security service spam support sysadmin test web webmaster webmestre.
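One plausible use of this list, assumed here rather than taken from the excerpt, is to set aside shared role mailboxes, which do not identify a single person, before linking identifiers. A minimal sketch of such a filter follows; the function name `is_generic_address` and the rule of comparing only the local part are hypothetical.

```python
# Local parts copied from the list above.
GENERIC_LOCAL_PARTS = frozenset(
    "abuse admin administrator contact design email info intern it legal "
    "kontakt mail marketing no-reply office post press print printer sales "
    "security service spam support sysadmin test web webmaster "
    "webmestre".split()
)


def is_generic_address(email):
    """Return True if the part before '@' is a generic role name.

    The comparison is exact and case-insensitive (an assumption), so
    "info@example.org" matches but "info.team@example.org" does not.
    """
    local_part = email.split("@", 1)[0].strip().lower()
    return local_part in GENERIC_LOCAL_PARTS
```

An exact match on the local part keeps the filter conservative: personal addresses that merely contain a generic word are left in the data.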

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Heen, O., Neumann, C. (2017). On the Privacy Impacts of Publicly Leaked Password Databases. In: Polychronakis, M., Meier, M. (eds) Detection of Intrusions and Malware, and Vulnerability Assessment. DIMVA 2017. Lecture Notes in Computer Science, vol 10327. Springer, Cham. https://doi.org/10.1007/978-3-319-60876-1_16

  • DOI: https://doi.org/10.1007/978-3-319-60876-1_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-60875-4

  • Online ISBN: 978-3-319-60876-1

  • eBook Packages: Computer Science, Computer Science (R0)
