P2P RVM for Distributed Classification
Recent years have seen growing interest in analytical methods that learn patterns from large-scale data distributed over peer-to-peer (P2P) networks and support applications. Mining patterns in such a distributed and dynamic environment is challenging because centralizing the data is not feasible. In this paper, we propose a distributed classification technique based on relevance vector machines (RVM) and local model exchange among neighboring peers in a P2P network. In such networks, an efficient distributed classification algorithm is evaluated on two criteria: the size of the resulting local models (communication efficiency) and their prediction accuracy. RVM uses dramatically fewer kernel functions than a state-of-the-art support vector machine (SVM) while demonstrating comparable generalization performance. This makes RVM a suitable choice for learning compact and accurate local models at each peer in a P2P network. Our model propagation approach exchanges the resulting models with peers in a local neighborhood to produce a more accurate network-wide global model while keeping the communication cost low throughout the network. Through extensive experimental evaluations, we demonstrate that, by using more relevant and compact models, our approach outperforms the baseline model propagation approaches in terms of accuracy and communication cost.
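The idea of exchanging compact local models within a neighborhood can be illustrated with a small sketch. This is not the paper's algorithm: the `KernelModel` and `Peer` classes, the RBF kernel, the majority-vote combination, and the hand-set weights are all assumptions made for illustration, and no RVM training is performed. The sketch only shows why a model with fewer retained kernel vectors costs less to send to neighbors.

```python
import math
from dataclasses import dataclass, field


@dataclass
class KernelModel:
    """Hypothetical sparse kernel classifier: sign(sum_i w_i * k(x, v_i) + bias)."""
    vectors: list          # retained basis vectors (e.g. relevance vectors)
    weights: list          # one weight per retained vector
    bias: float = 0.0
    gamma: float = 1.0     # RBF kernel width (assumed kernel choice)

    def decision(self, x):
        total = self.bias
        for v, w in zip(self.vectors, self.weights):
            sq_dist = sum((a - b) ** 2 for a, b in zip(x, v))
            total += w * math.exp(-self.gamma * sq_dist)
        return total

    def size(self):
        # Communication cost is approximated by the number of vectors sent.
        return len(self.vectors)


@dataclass
class Peer:
    """Hypothetical peer that stores its own model and models from neighbors."""
    name: str
    local_model: KernelModel
    neighbors: list = field(default_factory=list)
    received: dict = field(default_factory=dict)

    def propagate(self):
        """Send the local model to each direct neighbor; return vectors sent."""
        sent = 0
        for neighbor in self.neighbors:
            neighbor.received[self.name] = self.local_model
            sent += self.local_model.size()
        return sent

    def predict(self, x):
        """Majority vote over the local and received models (an assumption)."""
        models = [self.local_model, *self.received.values()]
        votes = sum(1 if m.decision(x) >= 0 else -1 for m in models)
        return 1 if votes >= 0 else -1


# Three peers in a line topology: a - b - c. Weights are hand-set, not learned.
a = Peer("a", KernelModel([[0.0, 0.0], [2.0, 2.0]], [-1.0, 1.0]))
b = Peer("b", KernelModel([[2.0, 2.0]], [1.0], bias=-0.5))
c = Peer("c", KernelModel([[0.0, 0.0]], [-1.0], bias=0.5))
a.neighbors = [b]
b.neighbors = [a, c]
c.neighbors = [b]

# One round of exchange; the total cost is the number of vectors transmitted.
total_cost = sum(peer.propagate() for peer in (a, b, c))
```

In this toy setup the total cost depends only on how many vectors each local model retains and how many neighbors each peer has, which is the sense in which sparser local models reduce communication.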
This work is funded by the Seventh Framework Program of the European Commission through the project REDUCTION (No. 288254). www.reduction-project.eu.