Disclosing user relationships in email networks


To reveal the communication patterns of users in a network, an attacker can repeatedly collect partial information about their behavior and then model this statistical information to infer relationships between pairs of users. This work enhances a previously presented statistical disclosure attack. The enhancement uses the expectation-maximization (EM) algorithm to refine the estimates of the messages sent by users and to determine which pairs of users actually communicate. Two EM-based methods are presented, and the better of the two is applied to real email data from 32 different network domains. The results are encouraging, with high classification rates and high positive predictive values.
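As a rough illustration of the two rates named in the abstract, the sketch below scores a predicted sender-receiver relationship matrix against a known one. It is not the authors' evaluation code. It assumes that the classification rate is the fraction of user pairs labeled correctly and that the positive predictive value is the fraction of predicted relationships that are real. The function name `pair_metrics` and the toy matrices are hypothetical.

```python
def pair_metrics(true_links, predicted_links):
    """Return (classification_rate, ppv) for two equally shaped 0/1 matrices.

    Rows are senders, columns are receivers; 1 means "these users communicate".
    Both definitions are assumptions made for this sketch.
    """
    tp = fp = tn = fn = 0
    for true_row, pred_row in zip(true_links, predicted_links):
        for truth, guess in zip(true_row, pred_row):
            if guess and truth:
                tp += 1          # predicted link that really exists
            elif guess and not truth:
                fp += 1          # predicted link that does not exist
            elif truth:
                fn += 1          # real link that was missed
            else:
                tn += 1          # correctly predicted absence of a link
    total = tp + fp + tn + fn
    classification_rate = (tp + tn) / total
    # PPV is undefined when no link is predicted at all.
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    return classification_rate, ppv


# Hypothetical toy data: 3 senders x 3 receivers, not taken from the paper.
true_links = [[1, 0, 0],
              [0, 1, 1],
              [0, 0, 0]]
predicted_links = [[1, 0, 0],
                   [0, 1, 0],
                   [1, 0, 0]]

rate, ppv = pair_metrics(true_links, predicted_links)
print(f"classification rate = {rate:.3f}, PPV = {ppv:.3f}")
```

In sparse communication networks most pairs never exchange messages, so a high classification rate alone can be reached by predicting few links. Reporting the positive predictive value alongside it shows how many of the predicted relationships are genuine.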





Part of the computations of this work was performed in EOLO, the HPC of Climate Change of the International Campus of Excellence of Moncloa, funded by MECD and MICINN. This work was supported by the “Programa de Financiación de Grupos de Investigación UCM validados de la Universidad Complutense de Madrid - Banco Santander”.

Author information



Corresponding author

Correspondence to Tai-Hoon Kim.


Cite this article

Portela, J., García Villalba, L.J., Silva Trujillo, A.G. et al. Disclosing user relationships in email networks. J Supercomput 72, 3787–3800 (2016). https://doi.org/10.1007/s11227-015-1524-7



  • Anonymity
  • Mixes
  • Network communications
  • Statistical disclosure attack