Hierarchical Training of Multiple SVMs for Personalized Web Filtering

Erdmann, Maike; Nguyen, Duc Dung; Takeyoshi, Tomoya; Hattori, Gen; Matsumoto, Kazunori; Ono, Chihiro

doi:10.1007/978-3-642-32695-0_5


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7458)

Included in the following conference series:

Pacific Rim International Conference on Artificial Intelligence


Abstract

The abundance of information published on the Internet makes filtering hazardous Web pages a difficult yet important task. Supervised learning methods such as Support Vector Machines (SVMs) can be used to identify hazardous Web content. However, scalability is a major challenge, especially when multiple classifiers have to be trained because different policies exist on what kind of information counts as hazardous. We therefore propose a transfer learning approach called Hierarchical Training for Multiple SVMs (HTMSVM). HTMSVM identifies the data that similar training sets have in common and trains on these common data sets first in order to obtain initial solutions. These initial solutions then reduce the time needed to train on the individual training sets without affecting classification accuracy. In an experiment in which we trained five Web content filters on training sets consisting of 80% common and 20% inconsistently labeled examples, HTMSVM needed only 26% to 41% of the training time of LibSVM to predict hazardous Web pages, while achieving the same classification accuracy (more than 91%).
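
The idea described in the abstract (train once on the examples that all filters label identically, then reuse that solution as the starting point for each individual filter) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a linear SVM without a bias term, trained by dual coordinate descent, and it assumes that all filters share the same examples and differ only in some labels. The function names `train_linear_svm` and `hierarchical_train`, the toy data, and all parameter values are hypothetical.

```python
import random

def train_linear_svm(X, y, C=1.0, alpha0=None, tol=1e-3, max_epochs=1000, seed=0):
    """Dual coordinate descent for a linear hinge-loss SVM without bias.

    alpha0 (assumption of this sketch) maps example index -> initial dual
    value, which is how a warm start is passed in.
    Returns (w, alpha, number_of_epochs_run).
    """
    n, d = len(X), len(X[0])
    alpha = [0.0] * n
    for i, a in (alpha0 or {}).items():
        alpha[i] = min(max(a, 0.0), C)          # keep the start point feasible
    # Rebuild the primal weights that correspond to the initial duals.
    w = [sum(alpha[i] * y[i] * X[i][j] for i in range(n)) for j in range(d)]
    sq_norm = [sum(v * v for v in x) for x in X]
    order = list(range(n))
    rng = random.Random(seed)
    epoch = 0
    for epoch in range(1, max_epochs + 1):
        rng.shuffle(order)
        worst = 0.0                              # largest optimality violation
        for i in order:
            if sq_norm[i] == 0.0:
                continue
            g = y[i] * sum(wj * xj for wj, xj in zip(w, X[i])) - 1.0
            if alpha[i] <= 0.0:
                pg = min(g, 0.0)                 # projected gradient at lower bound
            elif alpha[i] >= C:
                pg = max(g, 0.0)                 # projected gradient at upper bound
            else:
                pg = g
            worst = max(worst, abs(pg))
            if pg != 0.0:
                new = min(max(alpha[i] - g / sq_norm[i], 0.0), C)
                step = (new - alpha[i]) * y[i]
                w = [wj + step * xj for wj, xj in zip(w, X[i])]
                alpha[i] = new
        if worst < tol:
            break
    return w, alpha, epoch

def hierarchical_train(X, labelings, C=1.0):
    """Train one SVM per labeling, warm-started from the commonly labeled data."""
    n = len(X)
    # Common data: examples on which every filter's labeling agrees.
    common = [i for i in range(n) if len({lab[i] for lab in labelings}) == 1]
    Xc = [X[i] for i in common]
    yc = [labelings[0][i] for i in common]
    _, alpha_c, _ = train_linear_svm(Xc, yc, C)
    init = dict(zip(common, alpha_c))            # initial solution for every filter
    models = [train_linear_svm(X, lab, C, alpha0=init) for lab in labelings]
    return common, models

# Toy data: 8 examples labeled identically by all filters, 2 labeled inconsistently.
X = [[2, 2], [3, 1], [1, 3], [2, 3], [-2, -2], [-3, -1], [-1, -3], [-2, -3],
     [0.5, -0.5], [-0.5, 0.5]]
shared = [1, 1, 1, 1, -1, -1, -1, -1]
labelings = [shared + [1, -1], shared + [-1, 1], shared + [1, 1]]
common, models = hierarchical_train(X, labelings)
```

Because the warm start only changes where the optimizer begins, each filter still solves its own full problem to the same tolerance, which matches the abstract's claim that the initial solutions save training time without affecting accuracy. Comparing the epoch counts returned with and without `alpha0` is one way to observe the effect on a given data set.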



Author information

Authors and Affiliations

KDDI R&D Laboratories, Saitama, Japan
Maike Erdmann, Tomoya Takeyoshi, Gen Hattori, Kazunori Matsumoto & Chihiro Ono
Vietnam Academy of Science and Technology, Hanoi, Vietnam
Duc Dung Nguyen


Editor information

Editors and Affiliations

Faculty of Environment, Society and Design, Department of Applied Computing, Lincoln University, P.O. Box 84, 7647, Christchurch, New Zealand
Patricia Anthony
School of Information Science and Technology, University of Tokyo, 7-3-1, Hongo, 113-8656, Bunkyo-ku, Tokyo, Japan
Mitsuru Ishizuka
MIMOS Berhad, Knowledge Technology, Technology Park Malaysia, 57000, Kuala Lumpur, Malaysia
Dickson Lukose


About this paper

Cite this paper

Erdmann, M., Nguyen, D.D., Takeyoshi, T., Hattori, G., Matsumoto, K., Ono, C. (2012). Hierarchical Training of Multiple SVMs for Personalized Web Filtering. In: Anthony, P., Ishizuka, M., Lukose, D. (eds) PRICAI 2012: Trends in Artificial Intelligence. PRICAI 2012. Lecture Notes in Computer Science, vol 7458. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32695-0_5


DOI: https://doi.org/10.1007/978-3-642-32695-0_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32694-3
Online ISBN: 978-3-642-32695-0
eBook Packages: Computer Science, Computer Science (R0)
