Abstract
Peer-to-peer (P2P) databases are becoming prevalent on the Internet for the distribution and sharing of documents, applications, and other digital media. Answering large-scale ad hoc analysis queries, such as aggregation queries, on these databases poses unique challenges. Exact solutions can be time-consuming and difficult to implement, given the distributed and dynamic nature of P2P databases. In this paper, we present novel sampling-based techniques for approximate answering of ad hoc aggregation queries in such databases. Computing a high-quality random sample of the database efficiently in a P2P environment is complicated by several factors: the data is distributed across many peers; within each peer, the data is often highly correlated; and even collecting a random sample of the peers is difficult to accomplish. To counter these problems, we have developed an adaptive two-phase sampling approach based on random walks on the P2P graph, as well as block-level sampling techniques. We present extensive experimental evaluations to demonstrate the feasibility of our proposed solution.
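The general idea of combining a random walk over peers with sampling of each visited peer's local data can be illustrated with a minimal sketch. This is not the authors' algorithm: the function names, the `jump` parameter, the adjacency-list `graph`, and the per-peer `data` lists are all hypothetical, and the adaptive phase that decides how many peers to sample is omitted. The sketch assumes an undirected, connected overlay in which a simple random walk visits a peer with probability proportional to its degree, and it assumes the total degree of the graph is known so that each peer's contribution can be reweighted.

```python
import random


def random_walk_peers(graph, start, num_samples, jump, rng):
    """Walk the overlay at random and keep the peer reached after every `jump` hops."""
    current = start
    selected = []
    while len(selected) < num_samples:
        for _ in range(jump):
            # Move to a uniformly chosen neighbour of the current peer.
            current = rng.choice(graph[current])
        selected.append(current)
    return selected


def estimate_sum(graph, data, start, num_samples, jump, tuples_per_peer, rng):
    """Estimate the global SUM of all values held by all peers.

    Assumption: the total degree (twice the number of edges) is known, so the
    stationary visit probability of a peer is degree / total_degree.
    """
    total_degree = sum(len(neighbours) for neighbours in graph.values())
    per_visit_estimates = []
    for peer in random_walk_peers(graph, start, num_samples, jump, rng):
        local = data[peer]
        if len(local) > tuples_per_peer:
            # Sample only part of the local data and scale the partial sum up.
            subset = rng.sample(local, tuples_per_peer)
            local_sum = sum(subset) * len(local) / len(subset)
        else:
            local_sum = sum(local)
        # Reweight by the inverse visit probability to offset the bias
        # of the walk towards high-degree peers.
        visit_prob = len(graph[peer]) / total_degree
        per_visit_estimates.append(local_sum / visit_prob)
    return sum(per_visit_estimates) / len(per_visit_estimates)
```

A larger `jump` reduces the correlation between consecutively selected peers at the cost of more hops. In a real deployment the total degree would itself have to be estimated, and the walk would need enough hops before the first selection to approach its stationary distribution.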
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Saravanan, R., Vivekananth, P. (2012). Range of Query Processing in Peer to Peer Networks. In: Zhang, T. (eds) Instrumentation, Measurement, Circuits and Systems. Advances in Intelligent and Soft Computing, vol 127. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-27334-6_1
DOI: https://doi.org/10.1007/978-3-642-27334-6_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-27333-9
Online ISBN: 978-3-642-27334-6
eBook Packages: Engineering, Engineering (R0)