Content Aggregation on Knowledge Bases Using Graph Clustering

  • Christoph Schmitz
  • Andreas Hotho
  • Robert Jäschke
  • Gerd Stumme
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4011)


Recently, research projects such as PADLR and SWAP have developed tools like Edutella or Bibster, which are targeted at establishing peer-to-peer knowledge management (P2PKM) systems. In such a system, it is necessary to obtain brief semantic descriptions of peers, so that routing algorithms or matchmaking processes can make decisions about which communities peers should belong to, or to which peers a given query should be forwarded.

This paper provides a graph clustering technique on knowledge bases for that purpose. Using this clustering, we can show that our strategy requires up to 58% fewer queries than the baselines to yield full recall in a bibliographic P2PKM scenario.


Keywords (machine-generated, not supplied by the authors): Knowledge Base, Cluster Strategy, Semantic Distance, Graph Cluster, Text Summarization



Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Christoph Schmitz (1)
  • Andreas Hotho (1)
  • Robert Jäschke (1)
  • Gerd Stumme (1)
  1. Knowledge and Data Engineering Group, Universität Kassel
