Retrieving Arbitrary XML Fragments from Structured Peer-to-Peer Networks

  • Toshiyuki Amagasa
  • Chunhui Wu
  • Hiroyuki Kitagawa
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4505)


This paper proposes a novel method for storing and retrieving XML data in structured P2P networks. Peer-to-Peer (P2P) computing, a new paradigm in distributed computing, is attracting much attention. In particular, Distributed Hash Tables (DHTs) offer effective and powerful search through peer collaboration. However, retrieving arbitrary XML fragments using DHTs is challenging because of the impedance mismatch between hashing, the core mechanism of DHTs, and the complex structure of XML data and the XML query model. To address this problem, we propose a novel scheme for managing XML data in DHTs. Specifically, we (virtually) deploy two kinds of DHTs, called Contents-DHT (C-DHT) and Structure-DHT (S-DHT), to map textual information and structural information, respectively. Queries against stored XML data are processed using the two DHTs concurrently, which makes it possible to retrieve arbitrary XML fragments. The effectiveness of the proposed scheme is demonstrated through a series of experiments.
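
The division of labour between the two DHTs can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify key formats, node labelling, or the query algorithm, so everything below is an assumption made for illustration. In particular, `ToyDHT` is a hypothetical in-memory stand-in for a real overlay, the S-DHT is assumed to be keyed by root-to-node path expressions, the C-DHT by the text of leaf nodes, and nodes are assumed to be identified by Dewey-style labels.

```python
import hashlib
import xml.etree.ElementTree as ET
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


class ToyDHT:
    """Hypothetical in-memory stand-in for a DHT: keys are hashed onto peers."""

    def __init__(self, num_peers=4):
        # Each "peer" is just a local dict from key to a set of node ids.
        self.peers = [defaultdict(set) for _ in range(num_peers)]

    def _peer_for(self, key):
        # Hash the key and pick the responsible peer (assumed placement rule).
        digest = hashlib.sha1(key.encode("utf-8")).digest()
        return self.peers[int.from_bytes(digest[:4], "big") % len(self.peers)]

    def put(self, key, value):
        self._peer_for(key)[key].add(value)

    def get(self, key):
        return set(self._peer_for(key).get(key, set()))


def publish(xml_text, doc_id, s_dht, c_dht):
    """Index each element's path in s_dht and each leaf's text in c_dht."""
    root = ET.fromstring(xml_text)

    def walk(elem, parent_path, label):
        path = f"{parent_path}/{elem.tag}"
        node_id = (doc_id, label)          # assumed Dewey-style node label
        s_dht.put(path, node_id)           # structural information
        text = (elem.text or "").strip()
        if len(elem) == 0 and text:        # leaf element with text
            c_dht.put(text, node_id)       # textual information
        for i, child in enumerate(elem, start=1):
            walk(child, path, f"{label}.{i}")

    walk(root, "", "1")


def query(path, text, s_dht, c_dht):
    """Look up both DHTs concurrently and intersect the candidate nodes."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        by_structure = pool.submit(s_dht.get, path)
        by_content = pool.submit(c_dht.get, text)
        return by_structure.result() & by_content.result()


s_dht, c_dht = ToyDHT(), ToyDHT()
publish(
    "<lib><book><title>XML</title></book><book><title>P2P</title></book></lib>",
    "d1", s_dht, c_dht,
)
print(query("/lib/book/title", "P2P", s_dht, c_dht))  # {('d1', '1.2.1')}
```

Hashing alone supports only exact-match lookups, which is why this sketch handles a single path plus one text value. Ancestor and descendant steps, wildcards, or partial text matches would need additional machinery that the abstract does not describe.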


Leaf Node, Query Processing, Resource Description Framework, Distributed Hash Table, Path Expression
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer Berlin Heidelberg 2007

Authors and Affiliations

  • Toshiyuki Amagasa (1, 2)
  • Chunhui Wu (1)
  • Hiroyuki Kitagawa (1, 2)
  1. Graduate School of Systems and Information Engineering, Department of Computer Science
  2. Center for Computational Sciences, University of Tsukuba, 1–1–1 Tennodai, Tsukuba 305–8573, Japan
