Managing Linguistic Data Summaries in Advanced P2P Applications

Handbook of Peer-to-Peer Networking

Abstract

As the volume of stored data grows, data localization techniques alone are no longer sufficient in P2P systems. A practical alternative is to rely on compact database summaries rather than raw database records, which are costly to access in large P2P systems. In this chapter, we describe a solution for managing linguistic data summaries in advanced P2P applications that deal with semantically rich data. The summaries produced are synthetic, multidimensional views over relational tables. The novelty of the proposal lies in exploiting summaries in two ways in distributed P2P systems. First, as semantic indexes, they support locating relevant nodes based on descriptions of their data. Second, because they are intelligible, the summaries can be queried directly and can thus answer a query approximately without exploring the original data. The proposed solution first defines a summary model for hierarchical P2P systems. It then presents algorithms for summary creation and maintenance. Finally, a query processing mechanism that relies on summary querying is proposed to demonstrate the benefits of summary exploitation.
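
The two uses of a summary described above, as a semantic index and as a source of approximate answers, can be illustrated with a small sketch. This is not the chapter's algorithm: the attribute ("age"), the linguistic labels, the trapezoid boundaries, and all function names are hypothetical choices made only for illustration. Each peer rewrites its raw values into weighted linguistic labels, and a query is then routed and approximately answered from those summaries alone.

```python
from collections import Counter

# Hypothetical linguistic partition of an "age" attribute (an assumption,
# not taken from the chapter). Each label is a trapezoid (a, b, c, d):
# membership is 1 on [b, c] and falls linearly to 0 at a and at d.
AGE_LABELS = {
    "young": (0, 0, 25, 35),
    "adult": (25, 35, 55, 65),
    "old": (55, 65, 120, 120),
}


def membership(x, trapezoid):
    """Degree in [0, 1] to which value x matches a trapezoidal label."""
    a, b, c, d = trapezoid
    if b <= x <= c:
        return 1.0
    if a < x < b:
        return (x - a) / (b - a)
    if c < x < d:
        return (d - x) / (d - c)
    return 0.0


def summarize(ages):
    """Build a peer's summary: label -> accumulated membership weight."""
    summary = Counter()
    for age in ages:
        for label, trapezoid in AGE_LABELS.items():
            weight = membership(age, trapezoid)
            if weight > 0:
                summary[label] += weight
    return dict(summary)


def relevant_peers(peer_summaries, label):
    """Summary as semantic index: peers whose data matches the label."""
    return sorted(p for p, s in peer_summaries.items() if s.get(label, 0) > 0)


def approximate_count(peer_summaries, label):
    """Summary as approximate answer: weighted count, no raw data access."""
    return sum(s.get(label, 0.0) for s in peer_summaries.values())


if __name__ == "__main__":
    peers = {
        "p1": summarize([20, 30]),
        "p2": summarize([70, 80]),
        "p3": summarize([30, 60]),
    }
    print(relevant_peers(peers, "young"))
    print(approximate_count(peers, "young"))
```

In this sketch a value such as 30 belongs partly to "young" and partly to "adult", which is why the counts are weights rather than integers. A query for "young" is forwarded only to the peers returned by `relevant_peers`, or answered approximately from the summaries without contacting any peer.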

Author information

Correspondence to Rabab Hayek.

Copyright information

© 2010 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Hayek, R., Raschia, G., Valduriez, P., Mouaddib, N. (2010). Managing Linguistic Data Summaries in Advanced P2P Applications. In: Shen, X., Yu, H., Buford, J., Akon, M. (eds) Handbook of Peer-to-Peer Networking. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-09751-0_20

  • DOI: https://doi.org/10.1007/978-0-387-09751-0_20

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-0-387-09750-3

  • Online ISBN: 978-0-387-09751-0

  • eBook Packages: Computer Science (R0)
