Abstract
Combining the strengths of classic information retrieval with the distributed approach of P2P networks can avoid the weaknesses of both: organising document collections around the special communities they are relevant to allows both high coverage and quick access. We present a theoretical framework in which the semantic structure between words can be deduced from a document collection. This structural knowledge can then be used to connect document collections to communities based on their content.
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bordag, S., Heyer, G., Quasthoff, U. (2003). Small Worlds of Concepts and Other Principles of Semantic Search. In: Böhme, T., Heyer, G., Unger, H. (eds) Innovative Internet Community Systems. IICS 2003. Lecture Notes in Computer Science, vol 2877. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39884-4_2
DOI: https://doi.org/10.1007/978-3-540-39884-4_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20436-7
Online ISBN: 978-3-540-39884-4
eBook Packages: Springer Book Archive