Abstract
Email overload is a recent problem: people find it increasingly difficult to process the large number of emails they receive daily. The problem is growing more serious and already affects the normal use of email as a knowledge management tool. Categorizing emails into meaningful groups is recognized to greatly reduce the cognitive load of processing them, and is therefore an effective way to manage email overload. However, most current approaches still require significant human input when categorizing emails. In this paper we develop an automatic email clustering system, underpinned by a new nonparametric text clustering algorithm. The system does not require any predefined input parameters and automatically generates meaningful email clusters. Experiments show that our new algorithm outperforms existing text clustering algorithms, both in computational time and in clustering quality as measured by several metrics.
Copyright information
© 2007 IFIP International Federation for Information Processing
About this paper
Cite this paper
Xiang, Y., Zhou, W., Chen, J. (2007). Managing Email Overload with an Automatic Nonparametric Clustering Approach. In: Li, K., Jesshope, C., Jin, H., Gaudiot, JL. (eds) Network and Parallel Computing. NPC 2007. Lecture Notes in Computer Science, vol 4672. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74784-0_9
DOI: https://doi.org/10.1007/978-3-540-74784-0_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74783-3
Online ISBN: 978-3-540-74784-0
eBook Packages: Computer Science (R0)