Relating RSS News/Items

  • Fekade Getahun
  • Joe Tekli
  • Richard Chbeir
  • Marco Viviani
  • Kokou Yetongnon
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5648)


Merging related RSS news items (coming from one or several sources) is beneficial for end-users with different backgrounds (journalists, economists, etc.), particularly those accessing similar information. In this paper, we provide a practical approach to both measure the relatedness of RSS elements and identify the relationships between them. Our approach is based on the concepts of semantic neighborhood and the vector space model, and considers both the content and the structure of RSS news items.
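As a rough illustration of how a vector space model and a semantic neighborhood can combine into a relatedness score, the following minimal sketch compares the textual content of two RSS items. It is not the method of the paper: the `NEIGHBORHOOD` table, the `weight` value, the item layout (a dict with `title` and `description`), and all function names are assumptions made for this example, and the structural side of the approach is not covered.

```python
import math
import re
from collections import Counter

# Hypothetical toy "semantic neighborhood": each term maps to related terms.
# A real system would derive this from a knowledge base such as WordNet.
NEIGHBORHOOD = {
    "car": {"automobile", "vehicle"},
    "automobile": {"car", "vehicle"},
}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def item_vector(item):
    """Term-frequency vector over an item's title and description."""
    return Counter(tokenize(item["title"] + " " + item["description"]))


def expand(vec, weight=0.5):
    """Add each term's neighbors at a reduced weight (illustrative value)."""
    out = Counter(vec)
    for term, freq in vec.items():
        for neighbor in NEIGHBORHOOD.get(term, ()):
            out[neighbor] += weight * freq
    return out


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def relatedness(item1, item2):
    """Cosine similarity of the neighborhood-expanded term vectors."""
    return cosine(expand(item_vector(item1)), expand(item_vector(item2)))


item_a = {"title": "Car sales rise", "description": "Dealers report growth"}
item_b = {"title": "Automobile sales rise", "description": "Dealers report growth"}
print(relatedness(item_a, item_b))
```

In this sketch, expanding each vector with neighboring terms lets items that use different but related words ("car" versus "automobile") score higher than they would under plain term overlap.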


RSS, Relatedness, Similarity, Relationships, Neighbourhood



Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Fekade Getahun (1)
  • Joe Tekli (1)
  • Richard Chbeir (1)
  • Marco Viviani (1)
  • Kokou Yetongnon (1)
  1. Laboratoire Electronique, Informatique et Image (LE2I) – UMR-CNRS, Université de Bourgogne – Sciences et Techniques Mirande, Dijon Cedex, France
