Abstract
In search engines, popular news events cause huge spikes in queries related to those events. In this work, we consider the stability issues caused by these spikes in a peer-to-peer web search engine formed using voluntary resources shared by peers. The requirement of providing top-ranking results from a dynamic index distinguishes web search from other classes of peer-to-peer search, such as object lookup and file search. As a result, the traditional load-reduction methods based on caching and replication, proposed for peer-to-peer object lookup and file search, are insufficient for containing interest spikes in peer-to-peer web search. We propose transient use of a public cloud to maintain the stability of peer-to-peer web search engines during interest spikes. To the best of our knowledge, this is the first work proposing transient use of a public cloud to handle spikes in peer-to-peer search. In the proposed architecture, CAPS, the responsibility for handling spiking queries is dynamically offloaded to public clouds during the spike period. The peer bandwidth used to transfer the relevant index from peers to the cloud is decided by considering both the impact on other applications running on the peer and the requirements of the search application. We show that transient use of a public cloud can be performed without major adverse impact on desirable properties of peer-to-peer search such as privacy and decentralized control. Experimental evaluation under realistic settings shows that cloud assistance can be used effectively to handle spikes.
Notes
YaCy peer-to-peer web search engine, http://yacy.net/ Accessed: June 2014; Faroo peer-to-peer web search engine, http://www.faroo.com/ Accessed: June 2014
http://www.faroo.com/hp/p2p/p2p.html Accessed: June 2014
Google status message on query spike, http://twitter.com/#!/google/status/65502190315376640 Accessed: June 2014
http://yacy.net/en/Technology.html Accessed: June 2014
http://www.faroo.com/hp/p2p/faq.html#difference Accessed: June 2014
https://blog.twitter.com/2013/new-tweets-per-second-record-and-how Accessed: June 2014
https://blog.twitter.com/2013/new-tweets-per-second-record-and-how Accessed: June 2014
https://blog.twitter.com/2011/twitter-search-now-3x-faster Accessed: June 2014
Google enterprise blog: http://googleenterprise.blogspot.in/2012/07/introducing-google-cloud-platform.html Accessed: June 2014
Amazon S3 pricing http://aws.amazon.com/s3/pricing/ Accessed: June 2014
Youtube official explanation, https://support.google.com/youtube/answer/91449?hl=en-GB Accessed: June 2014
https://support.skype.com/en/faq/FA1417/how-much-bandwidth-does-skype-need Accessed: June 2014
For this, the maximum bandwidth parameter (introduced as b_h in Eq. 3) could be set to (total available bandwidth) - (minimum or fixed bandwidth requirements of other applications).
“Tor bridges in the Amazon cloud”: https://cloud.torproject.org/ Accessed: June 2014
http://www.faroo.com/hp/p2p/faq.html#difference Accessed: June 2014
References
FCC sixth broadband deployment report (2014). http://hraunfoss.fcc.gov/edocs_public/attachmatch/FCC-10-129A1.pdf
Google search appliance protocol reference (2014). http://www.google.com/support/enterprise/static/gsa/docs/admin/70/gsa_doc_set/xml_reference/request_format.html#1076700
Chawathe Y, Ratnasamy S, Breslau L, Lanham N, Shenker S (2003) Making gnutella-like p2p systems scalable. In: Proceedings of the 2003 conference on applications, technologies, architectures, and protocols for computer communications, SIGCOMM ’03, pp 407–418. ACM, New York. doi:10.1145/863955.864000
Chen H, Jin H, Luo X, Liu Y, Gu T, Chen K, Ni L (2012) Bloomcast: Efficient and effective full-text retrieval in unstructured p2p networks. IEEE Trans Parallel Distrib Syst 23(2):232–241. doi:10.1109/TPDS.2011.168
Chen H, Yan J, Jin H, Liu Y, Ni L (2010) Tss: Efficient term set search in large peer-to-peer textual collections. IEEE Trans Comput 59(7):969–980. doi:10.1109/TC.2010.81
Chierichetti F, Kumar R, Raghavan P (2009) Compressed web indexes. In: Proceedings of the 18th international conference on world wide web, WWW ’09, pp 451–460. ACM, New York. doi:10.1145/1526709.1526770
Cole JA, Stewart J (2001) Single Variable Calculus: Concepts and Contexts. Brooks/Cole Publishing Company
Dharanipragada J, Haridas H (2012) Stabilizing peer-to-peer systems using public cloud: A case study of peer-to-peer search. In: 11th international symposium on parallel and distributed computing (ISPDC). IEEE, pp 135–142. http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6341504
Gwo-Hshiung T, Tzeng GH, Huang JJ (2011) Multiple attribute decision making: Methods and applications. CRC Press
Hsiao HC, Su HW (2012) On optimizing overlay topologies for search in unstructured peer-to-peer networks. IEEE Trans Parallel Distrib Syst 23(5):924–935. doi:10.1109/TPDS.2011.241
Huang C, Li J, Ross KW (2007) Peer-Assisted VoD: making internet video distribution cheap. In: IPTPS
Li J, Loo B, Hellerstein J, Kaashoek M, Karger D, Morris R (2003) On the feasibility of peer-to-peer web indexing and search. In: Kaashoek M, Stoica I (eds) Peer-to-Peer Systems II, Lecture Notes in Computer Science, vol 2735. Springer Berlin Heidelberg, pp 207–215
Lopes N, Baquero C (2007) Taming hot-spots in dht inverted indexes. In: ACM SIGIR workshop on large-scale distributed systems for information retrieval. ACM
Mager T, Biersack E, Michiardi P (2012) A measurement study of the wuala on-line storage service. In: IEEE 12th international conference on peer-to-peer computing (P2P). IEEE, pp 237–248
Montresor A, Abeni L (2011) Cloudy weather for p2p, with a chance of gossip. In: P2P
Raiciu C, Huici F, Handley M, Rosenblum DS (2009) Roar: increasing the flexibility and performance of distributed search. In: Proceedings of the ACM SIGCOMM 2009 conference on data communication, SIGCOMM ’09, pp 291–302. ACM, New York. doi:10.1145/1592568.1592603
Ramasubramanian V, Sirer EG (2004) Beehive: O(1) lookup performance for power-law query distributions in peer-to-peer overlays. In: NSDI, vol 4
Reynolds P, Vahdat A (2003) Efficient peer-to-peer keyword searching. In: Proceedings of the ACM/IFIP/USENIX 2003 international conference on middleware, Middleware ’03. Springer-Verlag New York, Inc, New York, pp 21–40. http://dl.acm.org/citation.cfm?id=1515915.1515918
Ripeanu M (2001) Peer-to-peer architecture case study: Gnutella network. In: Proceedings first international conference on peer-to-peer computing, 2001, pp 99–100
Risson J, Moors T (2006) Survey of research towards robust peer-to-peer networks: search methods. Comput Netw 50:3485–3521. doi:10.1016/j.comnet.2006.02.001
Skobeltsyn G, Luu T, Zarko IP, Rajman M, Aberer K (2007) Web text retrieval with a p2p query-driven index. In: SIGIR
Skobeltsyn G, Luu T, Zarko IP, Rajman M, Aberer K (2007) Web text retrieval with a p2p query-driven index. In: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, SIGIR ’07, pp 679–686. ACM, New York. doi:10.1145/1277741.1277857
Stading T, Maniatis P, Baker M (2002) Peer-to-peer caching schemes to address flash crowds. In: Druschel P, Kaashoek F, Rowstron A (eds) Peer-to-Peer Systems, Lecture Notes in Computer Science, vol 2429. Springer Berlin Heidelberg, pp 203–213
Stoica I, Morris R, Liben-Nowell D, Karger DR, Kaashoek MF, Dabek F, Balakrishnan H (2003) Chord: a scalable peer-to-peer lookup protocol for internet applications. IEEE/ACM Trans Networking 11(1):17–32
Tigelaar AS, Hiemstra D (2011) Query load balancing by caching search results in peer-to-peer information retrieval networks. In: DIR
Tigelaar AS, Hiemstra D, Trieschnigg D (2012) Peer-to-peer information retrieval: an overview. ACM Trans Inf Syst 30(2):9:1–9:34. doi:10.1145/2180868.2180871
Toka L, Dell’Amico M, Michiardi P (2010) Online data backup: a peer-assisted approach. In: P2P. doi:10.1109/P2P.2010.5570003
Xiong L, Agichtein E (2007) Towards privacy-preserving query log publishing. In: Query log analysis: social and technological challenges workshop in WWW
Zhang J, Suel T (2005) Efficient query evaluation on large textual collections in a peer-to-peer environment. In: Fifth IEEE international conference on peer-to-peer computing, P2P 2005. IEEE, pp 225–233
Acknowledgments
The authors would like to thank the Department of Science and Technology (DST), Government of India, for supporting the work. We also thank Sriram Kailsam and Balaji S J for discussions on the solution.
Additional information
An earlier version of this work was presented at a conference [8]. Extensions over the earlier version include an enhanced algorithm for deadline-based switching during spikes and its evaluation, a study of the implications of alternate design choices and solution approaches, and a study of privacy implications.
Cite this article
Haridas, H., Dharanipragada, J. CAPS: A cloud-assisted approach to handle spikes in peer-to-peer web search. Peer-to-Peer Netw. Appl. 9, 193–208 (2016). https://doi.org/10.1007/s12083-014-0322-y
DOI: https://doi.org/10.1007/s12083-014-0322-y