A DHT-Based System for the Management of Loosely Structured, Multidimensional Data

  • Athanasia Asiki
  • Dimitrios Tsoumakos
  • Nectarios Koziris
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7600)

Abstract

In this paper we present LinkedPeers, a DHT-based system designed for the efficient distribution and processing of multidimensional, loosely structured data over a Peer-to-Peer overlay. Each dimension is further annotated using concept hierarchies. The system design aims to incorporate two important features: large-scale support for partially structured data and high-performance, distributed query processing, including queries with multiple aggregates. To resolve such queries efficiently, LinkedPeers utilizes a conceptual chain of DHT rings that stores data in a hierarchy-preserving manner. Moreover, adaptive mechanisms detect dynamic changes in the query workload and adjust the granularity of the indexing on a per-node basis. Possible future queries are also pre-computed during the resolution of an incoming query. Extensive experiments show that our system is efficient, achieving over 85% precision in answering queries while minimizing communication cost and adapting its indexing to the incoming queries.
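The abstract does not spell out how data is placed on a ring, so the following is only a minimal sketch of one way hierarchy-preserving placement on a single DHT ring could look, not the LinkedPeers implementation. It assumes a Chord-style successor lookup over SHA-1 identifiers and hashes each record on its value at one chosen hierarchy level, so that records sharing that value land on the same node. The names `ring_id`, `Ring`, `place`, the node names, and the country/city hierarchy are all hypothetical.

```python
import hashlib
from bisect import bisect_left


def ring_id(key: str, bits: int = 32) -> int:
    """Map a string to an identifier on a ring of size 2**bits (assumed SHA-1)."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << bits)


class Ring:
    """A toy ring: each key is owned by the first node clockwise from its id."""

    def __init__(self, node_names):
        self.nodes = sorted((ring_id(name), name) for name in node_names)
        self.ids = [node_id for node_id, _ in self.nodes]

    def successor(self, key: str) -> str:
        # First node whose id is >= the key id, wrapping around to the start.
        idx = bisect_left(self.ids, ring_id(key)) % len(self.nodes)
        return self.nodes[idx][1]


def place(ring: Ring, record: dict, hierarchy: list, level: int) -> str:
    """Pick the node for a record by hashing its value at one hierarchy level."""
    return ring.successor(record[hierarchy[level]])


# Hypothetical dimension with a two-level concept hierarchy.
hierarchy = ["country", "city"]
ring = Ring(["node-a", "node-b", "node-c", "node-d"])

athens = {"country": "Greece", "city": "Athens"}
patras = {"country": "Greece", "city": "Patras"}

# Indexing at the "country" level keeps both records on the same node.
node_athens = place(ring, athens, hierarchy, level=0)
node_patras = place(ring, patras, hierarchy, level=0)
```

Under these assumptions, choosing a coarser or finer `level` changes how records are grouped on nodes, which is the kind of indexing granularity the abstract says is adjusted per node.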

Keywords

Resource Description Framework, Primary Ring, Distributed Hash Table, Primary Dimension, Query Response Time
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Athanasia Asiki (1)
  • Dimitrios Tsoumakos (1)
  • Nectarios Koziris (1)
  1. School of Electrical and Computer Engineering, National Technical University of Athens, Greece
