CAPS: A cloud-assisted approach to handle spikes in peer-to-peer web search

Peer-to-Peer Networking and Applications

Abstract

In search engines, popular news events cause huge spikes in queries related to those events. In this work, we consider the stability issues that these spikes cause in a peer-to-peer web search engine built from voluntary resources shared by peers. The requirement of providing top-ranking results from a dynamic index distinguishes web search from other classes of peer-to-peer search, such as object lookup and file search. As a result, the traditional load-reduction methods based on caching and replication, proposed for peer-to-peer object lookup and file search, are insufficient for containing interest spikes in peer-to-peer web search. We propose the transient use of a public cloud to maintain the stability of peer-to-peer web search engines during interest spikes. To the best of our knowledge, this is the first work to propose the transient use of a public cloud to handle spikes in peer-to-peer search. In the proposed architecture, CAPS, the responsibility for handling spiking queries is dynamically offloaded to public clouds for the duration of the spike. The peer bandwidth used to transfer the relevant index from peers to the cloud is chosen by considering both the impact on other applications running on the peer and the requirements of the search application. We show that the transient use of a public cloud is possible without major adverse impact on desirable properties of peer-to-peer search, such as privacy and decentralized control. Experimental evaluation under realistic settings shows that cloud assistance can be used effectively to handle spikes.
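The idea of offloading only spiking queries, and only while the spike lasts, can be illustrated with a minimal sketch. This is not the CAPS algorithm itself, which the abstract does not specify. The class name `SpikeRouter`, the sliding-window rate test, and the parameters `window_s` and `spike_threshold` are all assumptions introduced here for illustration.

```python
from collections import deque


class SpikeRouter:
    """Hypothetical router: sends a term's queries to the cloud while its rate spikes.

    Assumption (not from the paper): a term is "spiking" when more than
    `spike_threshold` queries for it arrived within the last `window_s` seconds.
    """

    def __init__(self, window_s=60.0, spike_threshold=100):
        self.window_s = window_s
        self.spike_threshold = spike_threshold
        self.arrivals = {}  # term -> deque of recent arrival timestamps

    def route(self, term, now):
        """Record one query for `term` at time `now`; return 'cloud' or 'peers'."""
        recent = self.arrivals.setdefault(term, deque())
        recent.append(now)
        # Forget arrivals that have fallen out of the sliding window.
        while recent and now - recent[0] > self.window_s:
            recent.popleft()
        # Offload only while the windowed count exceeds the threshold, so
        # handling returns to the peers once the spike subsides.
        return "cloud" if len(recent) > self.spike_threshold else "peers"
```

Because the decision is recomputed on every query, cloud use in this sketch is transient by construction: once the windowed count falls back below the threshold, the term is served by the peers again.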

Notes

  1. YaCy peer-to-peer web search engine, http://yacy.net/ Accessed: June 2014; Faroo peer-to-peer web search engine, http://www.faroo.com/ Accessed: June 2014

  2. http://www.faroo.com/hp/p2p/p2p.html Accessed: June 2014

  3. Google status message on query spike, http://twitter.com/#!/google/status/65502190315376640 Accessed: June 2014

  4. http://blogs.bing.com/search/2013/01/17/bing-social-updates-arrive-today-for-every-search-there-is-someone-who-can-help/ Accessed: June 2014

  5. http://yacy.net/en/Technology.html Accessed: June 2014

  6. http://www.faroo.com/hp/p2p/faq.html#difference Accessed: June 2014

  7. https://blog.twitter.com/2013/new-tweets-per-second-record-and-how Accessed: June 2014

  8. https://blog.twitter.com/2013/new-tweets-per-second-record-and-how Accessed: June 2014

  9. https://blog.twitter.com/2011/twitter-search-now-3x-faster Accessed: June 2014

  10. Google enterprise blog: http://googleenterprise.blogspot.in/2012/07/introducing-google-cloud-platform.html Accessed: June 2014

  11. Amazon S3 pricing http://aws.amazon.com/s3/pricing/ Accessed: June 2014

  12. http://www.tekgazet.com/google-search-only-1000-maximum-search-results/net/1278.html Accessed: June 2014

  13. Youtube official explanation, https://support.google.com/youtube/answer/91449?hl=en-GB Accessed: June 2014

  14. https://support.skype.com/en/faq/FA1417/how-much-bandwidth-does-skype-need Accessed: June 2014

  15. For this, the maximum bandwidth parameter (introduced as b_h in Eq. 3) could be set to (total available bandwidth) - (minimum/fixed bandwidth requirements of other applications).

  16. http://news.cnet.com/AOL-apologizes-for-release-of-user-search-data/2100-1030_3-6102793.html Accessed: June 2014

  17. “Tor bridges in the Amazon cloud”: https://cloud.torproject.org/ Accessed: June 2014

  18. http://www.faroo.com/hp/p2p/faq.html#difference Accessed: June 2014
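The bandwidth cap described in note 15 can be written out as a short sketch. The function name `max_transfer_bandwidth`, its parameter names, and the use of kbit/s as the unit are assumptions for illustration. Clamping the result at zero is also an assumption; the note does not say what happens when other applications already need more than the link provides.

```python
def max_transfer_bandwidth(total_kbps, other_app_min_kbps):
    """Upper bound on the bandwidth a peer gives to index transfer (note 15).

    total_kbps:          total available bandwidth of the peer
    other_app_min_kbps:  minimum/fixed requirements of the other applications
    """
    reserved = sum(other_app_min_kbps)
    # Assumption: never return a negative cap if the other applications
    # already require more than the total available bandwidth.
    return max(0.0, total_kbps - reserved)
```

For example, a peer with 1000 kbit/s available and two other applications needing at least 100 and 300 kbit/s would cap index transfer at 600 kbit/s.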

References

  1. FCC sixth broadband deployment report (2014). http://hraunfoss.fcc.gov/edocs_public/attachmatch/FCC-10-129A1.pdf

  2. Google search appliance protocol reference (2014). http://www.google.com/support/enterprise/static/gsa/docs/admin/70/gsa_doc_set/xml_reference/request_format.html#1076700

  3. Chawathe Y, Ratnasamy S, Breslau L, Lanham N, Shenker S (2003) Making gnutella-like p2p systems scalable. In: Proceedings of the 2003 conference on applications, technologies, architectures, and protocols for computer communications, SIGCOMM ’03, pp 407–418. ACM, New York. doi:10.1145/863955.864000

  4. Chen H, Jin H, Luo X, Liu Y, Gu T, Chen K, Ni L (2012) Bloomcast: Efficient and effective full-text retrieval in unstructured p2p networks. IEEE Trans Parallel Distrib Syst 23(2):232–241. doi:10.1109/TPDS.2011.168

  5. Chen H, Yan J, Jin H, Liu Y, Ni L (2010) Tss: Efficient term set search in large peer-to-peer textual collections. IEEE Trans Comput 59(7):969–980. doi:10.1109/TC.2010.81

  6. Chierichetti F, Kumar R, Raghavan P (2009) Compressed web indexes. In: Proceedings of the 18th international conference on world wide web, WWW ’09, pp 451–460. ACM, New York. doi:10.1145/1526709.1526770

  7. Cole JA, Stewart J (2001) Single Variable Calculus: Concepts and Contexts. Brooks/Cole Publishing Company

  8. Dharanipragada J, Haridas H (2012) Stabilizing peer-to-peer systems using public cloud: A case study of peer-to-peer search. In: 11th international symposium on parallel and distributed computing (ISPDC). IEEE, pp 135–142. http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6341504

  9. Gwo-Hshiung T, Tzeng GH, Huang JJ (2011) Multiple attribute decision making: Methods and applications. CRC Press

  10. Hsiao HC, Su HW (2012) On optimizing overlay topologies for search in unstructured peer-to-peer networks. IEEE Trans Parallel Distrib Syst 23(5):924–935. doi:10.1109/TPDS.2011.241

  11. Huang C, Li J, Ross KW (2007) Peer-Assisted VoD: making internet video distribution cheap. In: IPTPS

  12. Li J, Loo B, Hellerstein J, Kaashoek M, Karger D, Morris R (2003) On the feasibility of peer-to-peer web indexing and search. In: Kaashoek M, Stoica I (eds) Peer-to-Peer Systems II, Lecture Notes in Computer Science, vol 2735. Springer Berlin Heidelberg, pp 207–215

  13. Lopes N, Baquero C (2007) Taming hot-spots in dht inverted indexes. In: ACM SIGIR Workshop on large-scale distributed systems for information retrieval. ACM

  14. Mager T, Biersack E, Michiardi P (2012) A measurement study of the wuala on-line storage service. In: IEEE 12th international conference on peer-to-peer computing (P2P). IEEE, pp 237–248

  15. Montresor A, Abeni L (2011) Cloudy weather for p2p, with a chance of gossip. In: P2P

  16. Raiciu C, Huici F, Handley M, Rosenblum DS (2009) Roar: increasing the flexibility and performance of distributed search. In: Proceedings of the ACM SIGCOMM 2009 conference on data communication, SIGCOMM ’09, pp 291–302. ACM, New York. doi:10.1145/1592568.1592603

  17. Ramasubramanian V, Sirer EG (2004) Beehive: O(1) lookup performance for power-law query distributions in peer-to-peer overlays. In: NSDI, vol 4

  18. Reynolds P, Vahdat A (2003) Efficient peer-to-peer keyword searching. In: Proceedings of the ACM/IFIP/USENIX 2003 international conference on middleware, Middleware ’03. Springer-Verlag New York, Inc, New York, pp 21–40. http://dl.acm.org/citation.cfm?id=1515915.1515918

  19. Ripeanu M (2001) Peer-to-peer architecture case study: Gnutella network. In: Proceedings first international conference on peer-to-peer computing, pp 99–100

  20. Risson J, Moors T (2006) Survey of research towards robust peer-to-peer networks: search methods. Comput Netw 50:3485–3521. doi:10.1016/j.comnet.2006.02.001

  21. Skobeltsyn G, Luu T, Zarko IP, Rajman M, Aberer K (2007) Web text retrieval with a p2p query-driven index. In: SIGIR

  22. Skobeltsyn G, Luu T, Zarko IP, Rajman M, Aberer K (2007) Web text retrieval with a p2p query-driven index. In: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, SIGIR ’07, pp 679–686. ACM, New York. doi:10.1145/1277741.1277857

  23. Stading T, Maniatis P, Baker M (2002) Peer-to-peer caching schemes to address flash crowds. In: Druschel P, Kaashoek F, Rowstron A (eds) Peer-to-Peer Systems, Lecture Notes in Computer Science, vol 2429. Springer Berlin Heidelberg, pp 203–213

  24. Stoica I, Morris R, Liben-Nowell D, Karger DR, Kaashoek MF, Dabek F, Balakrishnan H (2003) Chord: a scalable peer-to-peer lookup protocol for internet applications. IEEE/ACM Trans Networking 11(1):17–32

  25. Tigelaar AS, Hiemstra D (2011) Query load balancing by caching search results in peer-to-peer information retrieval networks. In: DIR

  26. Tigelaar AS, Hiemstra D, Trieschnigg D (2012) Peer-to-peer information retrieval: an overview. ACM Trans Inf Syst 30(2):9:1–9:34. doi:10.1145/2180868.2180871

  27. Toka L, Dell’Amico M, Michiardi P (2010) Online data backup: a peer-assisted approach. In: P2P. doi:10.1109/P2P.2010.5570003

  28. Xiong L, Agichtein E (2007) Towards privacy-preserving query log publishing. In: Query log analysis: social and technological challenges workshop in WWW

  29. Zhang J, Suel T (2005) Efficient query evaluation on large textual collections in a peer-to-peer environment. In: Fifth IEEE international conference on peer-to-peer computing, P2P 2005. IEEE, pp 225–233

Acknowledgments

The authors would like to thank the Department of Science and Technology (DST), Government of India, for supporting this work. We also thank Sriram Kailsam and Balaji S J for discussions on the solution.

Author information

Corresponding author

Correspondence to Harisankar Haridas.

Additional information

An earlier version of this work was presented at a conference [8]. Extensions over the earlier version include an enhanced algorithm for deadline-based switching during spikes and its evaluation, a study of the implications of alternative design choices and solution approaches, and a study of privacy implications.

About this article

Cite this article

Haridas, H., Dharanipragada, J. CAPS: A cloud-assisted approach to handle spikes in peer-to-peer web search. Peer-to-Peer Netw. Appl. 9, 193–208 (2016). https://doi.org/10.1007/s12083-014-0322-y

  • DOI: https://doi.org/10.1007/s12083-014-0322-y
