P2P RVM for Distributed Classification

  • Muhammad Umer Khan (Email author)
  • Alexandros Nanopoulos
  • Lars Schmidt-Thieme
Part of the Studies in Classification, Data Analysis, and Knowledge Organization book series (STUDIES CLASS)

Abstract

In recent years there has been increasing interest in analytical methods that learn patterns from large-scale data distributed over Peer-to-Peer (P2P) networks and that support applications. Mining patterns in such a distributed and dynamic environment is a challenging task, because centralizing the data is not feasible. In this paper, we propose a distributed classification technique based on relevance vector machines (RVM) and local model exchange among neighboring peers in a P2P network. In such networks, an efficient distributed classification algorithm is evaluated on two criteria: the size of the resulting local models (communication efficiency) and their prediction accuracy. RVM uses dramatically fewer kernel functions than a state-of-the-art support vector machine (SVM), while demonstrating comparable generalization performance. This makes RVM a suitable choice for learning compact and accurate local models at each peer in a P2P network. Our model propagation approach exchanges the resulting models with peers in a local neighborhood to produce a more accurate network-wide global model, while keeping the communication cost low throughout the network. Through extensive experimental evaluations, we demonstrate that by using more relevant and compact models, our approach outperforms the baseline model propagation approaches in terms of accuracy and communication cost.
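The exchange of compact local models among neighboring peers can be illustrated with a minimal sketch. It is not the authors' algorithm: the class `SparseKernelModel` stands in for an already trained RVM (RVM training itself is omitted), and the names `Peer`, `propagate`, the RBF kernel, the score-averaging rule and the toy data are all assumptions made for illustration. Communication cost is approximated by the number of kernel vectors transmitted.

```python
import math
from dataclasses import dataclass, field


def rbf(x, v, gamma=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, v)))


@dataclass
class SparseKernelModel:
    """Hypothetical stand-in for a trained RVM: a few (vector, weight) pairs plus a bias."""
    vectors: list
    weights: list
    bias: float = 0.0

    def score(self, x):
        # Weighted sum of kernel evaluations against the retained vectors.
        return self.bias + sum(w * rbf(x, v) for v, w in zip(self.vectors, self.weights))

    def size(self):
        # Proxy for communication cost: number of kernel vectors to send.
        return len(self.vectors)


@dataclass
class Peer:
    name: str
    local: SparseKernelModel
    received: list = field(default_factory=list)

    def predict(self, x):
        # Assumed combination rule: average the scores of the local model
        # and every model received from neighbors, then take the sign.
        models = [self.local] + self.received
        avg = sum(m.score(x) for m in models) / len(models)
        return 1 if avg >= 0 else -1


def propagate(peers, neighbors):
    """One exchange round: every peer sends its local model to its neighbors.

    Returns the total number of kernel vectors transmitted.
    """
    cost = 0
    for sender, targets in neighbors.items():
        for target in targets:
            peers[target].received.append(peers[sender].local)
            cost += peers[sender].local.size()
    return cost


# Toy network of two peers, each holding a one-vector model (made-up values).
peers = {
    "a": Peer("a", SparseKernelModel([[0.0, 0.0]], [1.0], bias=-0.5)),
    "b": Peer("b", SparseKernelModel([[3.0, 3.0]], [-1.0], bias=0.5)),
}
neighbors = {"a": ["b"], "b": ["a"]}
cost = propagate(peers, neighbors)
```

In this sketch a sparser local model directly lowers `cost`, which is the intuition behind preferring RVM over SVM for the local learners; the actual propagation and combination rules used in the paper may differ.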

Notes

Acknowledgements

This work is funded by the Seventh Framework Program of the European Commission, through the project REDUCTION (No. 288254), www.reduction-project.eu.

Copyright information

© Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

  • Muhammad Umer Khan (1), Email author
  • Alexandros Nanopoulos (2)
  • Lars Schmidt-Thieme (1)
  1. Information Systems and Machine Learning Lab, University of Hildesheim, Hildesheim, Germany
  2. University of Eichstätt, Ingolstadt, Germany
