Abstract
Top-k queries are attractive to users of P2P systems with very large numbers of peers, but they are difficult to support efficiently. In this paper, we propose a fully distributed algorithm for executing Top-k queries in the context of the APPA (Atlas Peer-to-Peer Architecture) data management system. APPA has a network-independent architecture that can be implemented over various P2P networks. Our algorithm requires no global information, does not depend on the existence of particular peers, and has a low bandwidth cost. We validated our algorithm through an implementation on a 64-node cluster and through simulation using the BRITE topology generator and SimJava. Our performance evaluation shows that the algorithm scales up logarithmically and, by exploiting P2P parallelism, substantially improves Top-k query response time in comparison with baseline algorithms.
Work partially funded by ARA Massive Data of the French ministry of research.
Copyright information
© 2007 Springer Berlin Heidelberg
About this paper
Cite this paper
Akbarinia, R., Martins, V., Pacitti, E., Valduriez, P. (2007). Top-k Query Processing in the APPA P2P System. In: Daydé, M., Palma, J.M.L.M., Coutinho, Á.L.G.A., Pacitti, E., Lopes, J.C. (eds) High Performance Computing for Computational Science - VECPAR 2006. VECPAR 2006. Lecture Notes in Computer Science, vol 4395. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71351-7_13
DOI: https://doi.org/10.1007/978-3-540-71351-7_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-71350-0
Online ISBN: 978-3-540-71351-7
eBook Packages: Computer Science (R0)