Abstract
As the volume of stored data grows, data localization techniques alone are no longer sufficient in P2P systems. A practical alternative is to rely on compact database summaries rather than raw database records, which are costly to access in large P2P systems. In this chapter, we describe a solution for managing linguistic data summaries in advanced P2P applications that deal with semantically rich data. The summaries are synthetic, multidimensional views over relational tables. The novelty of the proposal lies in exploiting summaries in two ways in distributed P2P systems. First, as semantic indexes, they support locating relevant nodes based on descriptions of their data. Second, because they are intelligible, the summaries can be queried directly and can thus answer a query approximately without exploring the original data. The proposed solution first defines a summary model for hierarchical P2P systems. Second, it presents algorithms for summary creation and maintenance. Finally, a query processing mechanism that relies on summary querying is proposed to demonstrate the benefits of summary exploitation.
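The two uses of a summary described in the abstract can be illustrated with a small sketch. It is not the chapter's summary model: crisp value ranges stand in for the linguistic mapping, and the vocabulary `AGE_LABELS`, the peer names, and all function names are hypothetical, chosen only for illustration.

```python
from collections import Counter

# Hypothetical linguistic vocabulary: each label covers a half-open range [low, high).
# Crisp ranges are a simplification made for this sketch.
AGE_LABELS = {"young": (0, 30), "adult": (30, 60), "old": (60, 200)}


def label_of(value, vocabulary):
    """Return the linguistic label whose range contains value."""
    for label, (low, high) in vocabulary.items():
        if low <= value < high:
            return label
    raise ValueError(f"no label covers {value}")


def summarize(records, vocabulary, attribute):
    """Summarize a table as the number of records per linguistic label."""
    return Counter(label_of(r[attribute], vocabulary) for r in records)


def relevant_peers(summaries, label):
    """Use summaries as a semantic index: peers with at least one matching record."""
    return sorted(peer for peer, s in summaries.items() if s[label] > 0)


def approximate_count(summaries, label):
    """Answer a count query from the summaries alone, without reading raw records."""
    return sum(s[label] for s in summaries.values())


# Hypothetical raw tables held by three peers.
peers = {
    "peer_a": [{"age": 25}, {"age": 41}],
    "peer_b": [{"age": 67}, {"age": 72}],
    "peer_c": [{"age": 19}, {"age": 64}],
}
summaries = {p: summarize(rows, AGE_LABELS, "age") for p, rows in peers.items()}

print(relevant_peers(summaries, "old"))
print(approximate_count(summaries, "old"))
```

Here `relevant_peers` plays the role of the semantic index (which nodes are worth contacting), while `approximate_count` answers a query at the granularity of the labels without touching the original records.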
Copyright information
© 2010 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Hayek, R., Raschia, G., Valduriez, P., Mouaddib, N. (2010). Managing Linguistic Data Summaries in Advanced P2P Applications. In: Shen, X., Yu, H., Buford, J., Akon, M. (eds) Handbook of Peer-to-Peer Networking. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-09751-0_20
DOI: https://doi.org/10.1007/978-0-387-09751-0_20
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-09750-3
Online ISBN: 978-0-387-09751-0
eBook Packages: Computer Science, Computer Science (R0)