Clustering E-Mails for the Swedish Social Insurance Agency – What Part of the E-Mail Thread Gives the Best Quality?

Dalianis, Hercules; Rosell, Magnus; Sneiders, Eriks

doi:10.1007/978-3-642-14770-8_14

Hercules Dalianis, Magnus Rosell & Eriks Sneiders

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6233)

Included in the following conference series:

International Conference on Natural Language Processing

Abstract

We need to analyse a large number of e-mails sent by citizens to the customer service department of a governmental organisation in Sweden. To carry out this analysis, we clustered a large number of these e-mails with the aim of supporting automatic e-mail answering. One issue that arose was whether the clustering should use the whole e-mail, including the thread, or just the original query. This paper describes that investigation. Our results show that the query and the answer are the parts that should be used; the whole e-mail thread is not necessarily needed. The original question alone contains more useful information than the answer alone, although combining the two is better still. Using the full e-mail thread does not degrade the result.

Author information

Authors and Affiliations

Department of Computer and Systems Science (DSV), Stockholm University, Forum 100, 164 40 Kista, Sweden
Hercules Dalianis, Magnus Rosell & Eriks Sneiders
KTH CSC, 100 44, Stockholm, Sweden
Magnus Rosell

Editor information

Editors and Affiliations

School of Computer Science, Reykjavik University, Kringlan 1, 103, Reykjavik, Iceland
Hrafn Loftsson
Department of Icelandic, University of Iceland, Árnagardur v/Sudurgötu, 101, Reykjavik, Iceland
Eiríkur Rögnvaldsson
Arni Magnusson Institute for Icelandic Studies, Neshagi 16, 101, Reykjavik, Iceland
Sigrún Helgadóttir

About this paper

Cite this paper

Dalianis, H., Rosell, M., Sneiders, E. (2010). Clustering E-Mails for the Swedish Social Insurance Agency – What Part of the E-Mail Thread Gives the Best Quality? In: Loftsson, H., Rögnvaldsson, E., Helgadóttir, S. (eds) Advances in Natural Language Processing. NLP 2010. Lecture Notes in Computer Science, vol 6233. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14770-8_14

DOI: https://doi.org/10.1007/978-3-642-14770-8_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14769-2
Online ISBN: 978-3-642-14770-8
eBook Packages: Computer Science, Computer Science (R0)
