Automatic Categorization of Email into Folders by Ant Colony Decision Tree and Social Networks

  • Urszula Boryczka
  • Barbara Probierz
  • Jan Kozak
Conference paper
Part of the Smart Innovation, Systems and Technologies book series (SIST, volume 57)


This paper presents a new approach to the automatic categorization of email messages into mailbox folders. The aim is to create an algorithm that improves the classification of emails into folders by using solutions applied in the Ant Colony Decision Tree (ACDT) algorithm. The algorithm also incorporates elements of Social Network Analysis (SNA). The proposed algorithm was tested on the publicly available Enron E-mail data set, and all experiments were conducted on uncleaned data. For comparison, additional tests were carried out using selected, generally available classifiers. The results confirm that the proposed approach improves the accuracy with which new emails are assigned to particular folders based on an analysis of previous correspondence, even when uncleaned data sets are used.


Ant colony optimization · Social network analysis · Enron E-mail



Copyright information

© Springer International Publishing Switzerland 2016

Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 2.5 International License, which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

Authors and Affiliations

  1. Institute of Computer Science, University of Silesia, Sosnowiec, Poland
