Email Classification with Temporal Features

  • Conference paper
Intelligent Information Processing and Web Mining

Part of the book series: Advances in Soft Computing (AINSC, volume 25)

Abstract

We propose a novel solution to the email classification problem: integrating temporal information with traditional content-based classification approaches. We discover temporal relations in an email sequence in the form of temporal sequential patterns and embed the discovered information into content-based learning methods. The new heterogeneous classification system performs well, reducing the classification error by up to 22%.
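
The abstract does not say how the temporal patterns are mined or how they are embedded into the content-based learner. As a rough illustration of the general idea only, the sketch below joins bag-of-words content features with simple temporal indicators. It is not the authors' method: the fixed look-back window stands in for mined sequential patterns, and every name in it (`temporal_features`, `combined_features`, the `recent:` prefix, the categories and timestamps) is a hypothetical chosen for this example.

```python
from collections import Counter


def bag_of_words(text):
    """Content features: counts of lowercase whitespace-separated tokens."""
    return Counter(text.lower().split())


def temporal_features(history, now, window, categories):
    """Hypothetical temporal features (an assumption, not from the paper).

    For each category, emit 1 if a message of that category arrived within
    `window` seconds before `now`, else 0. `history` is a list of
    (timestamp, category) pairs for earlier messages.
    """
    recent = {cat for ts, cat in history if 0 < now - ts <= window}
    return {f"recent:{cat}": int(cat in recent) for cat in categories}


def combined_features(text, history, now, window, categories):
    """Merge content and temporal features into one feature dictionary."""
    feats = dict(bag_of_words(text))
    feats.update(temporal_features(history, now, window, categories))
    return feats


# Illustrative data: three earlier messages and one new message at t=1000.
history = [(100, "work"), (160, "newsletter"), (900, "work")]
feats = combined_features(
    "Meeting moved to Friday",
    history,
    now=1000,
    window=300,
    categories=["work", "newsletter"],
)
```

A dictionary like `feats` could then be passed to any content-based learner that accepts sparse feature vectors. In this sketch the temporal indicators are ordinary extra features, so the learner itself needs no changes.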




Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kiritchenko, S., Matwin, S., Abu-Hakima, S. (2004). Email Classification with Temporal Features. In: Kłopotek, M.A., Wierzchoń, S.T., Trojanowski, K. (eds) Intelligent Information Processing and Web Mining. Advances in Soft Computing, vol 25. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39985-8_61

  • DOI: https://doi.org/10.1007/978-3-540-39985-8_61

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-21331-4

  • Online ISBN: 978-3-540-39985-8

  • eBook Packages: Springer Book Archive
