Email Classification with Temporal Features

Kiritchenko, Svetlana; Matwin, Stan; Abu-Hakima, Suhayya

doi:10.1007/978-3-540-39985-8_61

Part of the book series: Advances in Soft Computing (AINSC, volume 25)

Abstract

We propose a novel solution to the email classification problem: integrating temporal information with traditional content-based classification approaches. We discover temporal relations in an email sequence in the form of temporal sequential patterns and embed the discovered information into content-based learning methods. The resulting heterogeneous classification system performs well, reducing the classification error by up to 22%.
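
The general idea of combining temporal patterns with content features can be illustrated with a small sketch. This is not the authors' algorithm, which the abstract does not describe in detail; it is a simplified stand-in that assumes patterns are ordered pairs of message categories occurring within a fixed time window. The names `Email`, `mine_pair_patterns`, `featurize`, `window`, and `min_support` are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from typing import NamedTuple


class Email(NamedTuple):
    timestamp: float  # arrival time in seconds (assumed unit)
    category: str     # class label, known for past messages
    text: str         # message body


def mine_pair_patterns(emails, window, min_support):
    """Return ordered category pairs (a, b) such that a message of
    category b arrived within `window` seconds after a message of
    category a at least `min_support` times."""
    emails = sorted(emails, key=lambda e: e.timestamp)
    counts = Counter()
    for i, later in enumerate(emails):
        seen = set()
        for earlier in emails[:i]:
            if 0 < later.timestamp - earlier.timestamp <= window:
                seen.add((earlier.category, later.category))
        counts.update(seen)  # each pair counted at most once per message
    return {pair for pair, n in counts.items() if n >= min_support}


def featurize(history, new_time, new_text, patterns, window):
    """Bag-of-words features for the new message plus one binary
    temporal feature per mined pattern, set to 1 when the pattern's
    first category was seen in the preceding window."""
    features = {f"word={w}": 1 for w in new_text.lower().split()}
    recent = {e.category for e in history
              if 0 < new_time - e.timestamp <= window}
    for first, second in sorted(patterns):
        features[f"after:{first}->{second}"] = int(first in recent)
    return features
```

The resulting feature dictionary could be passed to any content-based learner; the temporal features simply sit alongside the word features, which is one plausible way to "embed" temporal information into such a method.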

Author information

Authors and Affiliations

School of Information Technology and Engineering, University of Ottawa, Ottawa, ON, K1N 6N5, Canada
Svetlana Kiritchenko & Stan Matwin
AmikaNow! Corporation, 700 March Road, Suite 203, Kanata, K2K 2V9, Ontario, Canada
Suhayya Abu-Hakima

Editor information

Editors and Affiliations

Institute of Computer Sciences, Polish Academy of Sciences, ul. Ordona 21, 01-237, Warszawa, Poland
Mieczysław A. Kłopotek (Conference Co-Chair), Sławomir T. Wierzchoń (Conference Co-Chair) & Krzysztof Trojanowski (Organizing Committee Chair)

About this paper

Cite this paper

Kiritchenko, S., Matwin, S., Abu-Hakima, S. (2004). Email Classification with Temporal Features. In: Kłopotek, M.A., Wierzchoń, S.T., Trojanowski, K. (eds) Intelligent Information Processing and Web Mining. Advances in Soft Computing, vol 25. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39985-8_61

DOI: https://doi.org/10.1007/978-3-540-39985-8_61
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-21331-4
Online ISBN: 978-3-540-39985-8
eBook Packages: Springer Book Archive
