A Novel Framework for Spam Hunting by Tracking Concept Drift

  • Conference paper
Second International Conference on Computer Networks and Communication Technologies (ICCNCT 2019)

Part of the book series: Lecture Notes on Data Engineering and Communications Technologies (LNDECT, volume 44)

Abstract

In the mid-seventies, a new method of exchanging messages between electronic devices emerged and helped draw the global community into the new world of computer networks now called the internet. Users recognized the potential of this method, now known as email, and began using it as a means of communication and marketing. Its effectiveness, however, was diminished by the widespread proliferation of spam. Researchers have proposed many techniques and tools to fight spam, but the dynamic nature of spam makes these tools ineffective over time and creates the need for a filter that remains successful at identifying spam as it evolves. Spam filtering is therefore a particularly demanding machine learning task, because both the data distribution and the concept being learned change over time. This paper explores this phenomenon, called concept drift, as seen in email datasets and proposes a new framework for identifying strategies for developing spam detection systems.

Author information

Corresponding author

Correspondence to V. Bindu.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Bindu, V., Thomas, C. (2020). A Novel Framework for Spam Hunting by Tracking Concept Drift. In: Smys, S., Senjyu, T., Lafata, P. (eds) Second International Conference on Computer Networks and Communication Technologies. ICCNCT 2019. Lecture Notes on Data Engineering and Communications Technologies, vol 44. Springer, Cham. https://doi.org/10.1007/978-3-030-37051-0_103

  • DOI: https://doi.org/10.1007/978-3-030-37051-0_103

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-37050-3

  • Online ISBN: 978-3-030-37051-0

  • eBook Packages: Engineering, Engineering (R0)
