Adaptive Ensemble Classification in P2P Networks

Ang, Hock Hee; Gopalkrishnan, Vivekanand; Hoi, Steven C. H.; Ng, Wee Keong

doi:10.1007/978-3-642-12026-8_5

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5981)

Included in the following conference series:

International Conference on Database Systems for Advanced Applications

Abstract

Classification in P2P networks has become an important research problem in data mining due to the popularity of P2P computing environments. It remains an open and difficult problem because of challenges such as non-i.i.d. data distribution, skewed or disjoint class distributions, scalability, peer dynamism, and asynchronism. In this paper, we present a novel P2P Adaptive Classification Ensemble (PACE) framework for classification in P2P networks. Unlike regular ensemble classification approaches, the framework adapts to the test data distribution and dynamically adjusts the voting scheme, combining a subset of classifiers/peers chosen according to each test example. We implement PACE with a state-of-the-art linear SVM as the base classifier for scalable P2P classification. Extensive empirical studies show that PACE is both efficient and effective in improving classification performance over regular methods under various adverse conditions.
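
The idea of choosing voters per test example can be illustrated with a minimal sketch. The abstract does not specify how PACE selects or weights classifiers, so everything below is an assumption made for illustration: the `PeerModel` class, the use of a training-data centroid as each peer's summary, the Euclidean nearest-centroid selection, and the inverse-distance vote weights are all hypothetical and are not taken from the paper.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class PeerModel:
    """Hypothetical per-peer summary: a linear classifier plus the
    centroid of the data it was trained on (an assumption, not PACE)."""
    weights: Sequence[float]
    bias: float
    centroid: Sequence[float]

    def predict(self, x: Sequence[float]) -> int:
        # Linear decision rule: sign of w.x + b, mapped to {-1, +1}.
        score = sum(w * xi for w, xi in zip(self.weights, x)) + self.bias
        return 1 if score >= 0 else -1


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def adaptive_vote(peers: List[PeerModel], x: Sequence[float], k: int = 2) -> int:
    """Vote using only the k peers whose centroids are nearest to x,
    weighting each vote by inverse distance (illustrative choice)."""
    ranked = sorted(peers, key=lambda p: euclidean(p.centroid, x))
    total = 0.0
    for peer in ranked[:k]:
        weight = 1.0 / (1.0 + euclidean(peer.centroid, x))
        total += weight * peer.predict(x)
    return 1 if total >= 0 else -1


def majority_vote(peers: List[PeerModel], x: Sequence[float]) -> int:
    """Regular ensemble baseline: every peer votes with equal weight."""
    total = sum(peer.predict(x) for peer in peers)
    return 1 if total >= 0 else -1


if __name__ == "__main__":
    peers = [
        PeerModel([1.0, 0.0], 0.0, [0.0, 0.0]),     # trained near the query
        PeerModel([-1.0, 0.0], 0.0, [10.0, 10.0]),  # trained far away
        PeerModel([-1.0, 0.0], 0.0, [12.0, 12.0]),  # trained far away
    ]
    x = [1.0, 1.0]
    print("adaptive:", adaptive_vote(peers, x, k=1))
    print("majority:", majority_vote(peers, x))
```

In this toy setup the two distant peers outvote the nearby one under equal-weight voting, while the per-example selection lets the peer whose data resembles the query decide. It is meant only to convey the contrast the abstract draws with regular ensembles, not to reproduce the paper's method.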

Author information

Authors and Affiliations

Nanyang Technological University, Singapore
Hock Hee Ang, Vivekanand Gopalkrishnan, Steven C. H. Hoi & Wee Keong Ng

Editor information

Editors and Affiliations

Graduate School of Systems and Information Engineering, University of Tsukuba, Tennodai, Tsukuba, 305–8573, Ibaraki, Japan
Hiroyuki Kitagawa
Information Technology Center, Nagoya University, Furo-cho, Chikusa-ku, 464-8601, Nagoya, Japan
Yoshiharu Ishikawa
City University of Hong Kong, Department of Computer Science, 83 Tat Chee Avenue, Kowloon, Hong Kong, China
Qing Li
Department of Information Science, Ochanomizu University, 2-1-1, Otsuka, Bunkyo-ku, 112-8610, Tokyo, Japan
Chiemi Watanabe

About this paper

Cite this paper

Ang, H.H., Gopalkrishnan, V., Hoi, S.C.H., Ng, W.K. (2010). Adaptive Ensemble Classification in P2P Networks. In: Kitagawa, H., Ishikawa, Y., Li, Q., Watanabe, C. (eds) Database Systems for Advanced Applications. DASFAA 2010. Lecture Notes in Computer Science, vol 5981. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12026-8_5

DOI: https://doi.org/10.1007/978-3-642-12026-8_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12025-1
Online ISBN: 978-3-642-12026-8
eBook Packages: Computer Science, Computer Science (R0)
