Metric-Based Similarity Search in Unstructured Peer-to-Peer Systems

  • Akrivi Vlachou
  • Christos Doulkeridis
  • Yannis Kotidis
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7100)

Abstract

Peer-to-peer systems constitute a promising solution for deploying novel applications, such as distributed image retrieval. Efficient search over widely distributed multimedia content requires techniques for distributed retrieval based on generic metric distance functions. In this paper, we propose a framework for distributed metric-based similarity search, where each participating peer stores its own data autonomously. To establish a scalable and efficient search mechanism, we adopt a super-peer architecture, in which super-peers are responsible for query routing. We propose the construction of metric routing indices suitable for distributed similarity search in metric spaces. Furthermore, we present a query routing algorithm that exploits pruning techniques to selectively direct queries to super-peers and peers with relevant data. We study the performance of the proposed framework using both synthetic and real data, and demonstrate its scalability over a wide range of experimental setups.
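
The abstract does not spell out how the routing indices or the pruning rule are defined. As a rough illustration only, the sketch below assumes that each routing entry summarizes the data reachable through a peer or super-peer as a ball (a center object plus a covering radius), and that a range query is forwarded only to entries whose ball can intersect the query ball under the triangle inequality. The names RoutingEntry, may_contain_results and route_range_query are hypothetical and are not taken from the paper.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Sequence

Point = Sequence[float]
Metric = Callable[[Point, Point], float]


def euclidean(a: Point, b: Point) -> float:
    """Example metric; any function obeying the triangle inequality works."""
    return math.dist(a, b)


@dataclass
class RoutingEntry:
    """Hypothetical routing-index entry (an assumption, not the paper's
    definition): a ball that covers all data reachable via `target`."""
    target: str      # identifier of a peer or neighboring super-peer
    center: Point    # representative object of the summarized data
    radius: float    # covering radius around `center`


def may_contain_results(entry: RoutingEntry, query: Point,
                        query_radius: float, dist: Metric) -> bool:
    # Triangle inequality: every object o covered by the entry satisfies
    # d(query, o) >= d(query, center) - radius. If that lower bound already
    # exceeds query_radius, the entry cannot hold a result and is pruned.
    return dist(query, entry.center) <= entry.radius + query_radius


def route_range_query(index: List[RoutingEntry], query: Point,
                      query_radius: float,
                      dist: Metric = euclidean) -> List[str]:
    """Return the targets a range query should be forwarded to."""
    return [e.target for e in index
            if may_contain_results(e, query, query_radius, dist)]


# Illustrative index held by one super-peer (made-up values).
index = [
    RoutingEntry("peer_a", (0.0, 0.0), 1.0),
    RoutingEntry("peer_b", (10.0, 0.0), 2.0),
    RoutingEntry("super_peer_2", (4.0, 3.0), 1.5),
]
targets = route_range_query(index, (1.0, 0.0), 3.0)
```

Under these assumptions the test is conservative: it may forward a query to a target that turns out to hold no results, but it never discards a target that could hold one, which is the property a pruning-based routing scheme needs.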

Keywords

Query Processing, Similarity Search, Range Query, Distributed Hash Table, Query Object
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Akrivi Vlachou (1)
  • Christos Doulkeridis (1)
  • Yannis Kotidis (2)
  1. Dept. of Computer and Information Science, NTNU, Trondheim, Norway
  2. Dept. of Informatics, AUEB, Athens, Greece
