Meaningful Change Detection on the Web

  • S. Flesca
  • F. Furfaro
  • E. Masciari
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2113)


In this paper we present a new technique for detecting changes on the Web. We propose a new method for measuring the similarity of two documents that can be used efficiently to discover changes in selected portions of the original document. The proposed technique has been implemented in the CDWeb system, which provides a change monitoring service on the Web. CDWeb differs from previously proposed systems in that it allows the detection of changes in portions of documents and of specific changes expressed by means of complex conditions; for example, users might want to know whether the value of a given stock has increased by more than 10%. Several tests on stock exchange and auction web pages showed the effectiveness of the proposed approach.
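The kind of condition mentioned above (a stock value increasing by more than 10%) can be illustrated with a minimal sketch. This is not the CDWeb implementation, which the abstract does not describe in detail; the names `Snapshot`, `relative_change`, and `increased_by_more_than` are hypothetical, and the sketch assumes the numeric value has already been extracted from the monitored portion of the page.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """A value extracted from a monitored portion of a page (hypothetical type)."""
    label: str
    value: float


def relative_change(old: float, new: float) -> float:
    """Return the fractional change from old to new (0.10 means +10%)."""
    if old == 0:
        raise ValueError("relative change is undefined for an old value of 0")
    return (new - old) / old


def increased_by_more_than(old: Snapshot, new: Snapshot, threshold: float) -> bool:
    """True when the value rose by strictly more than `threshold` (a fraction)."""
    return relative_change(old.value, new.value) > threshold


# Illustrative values only: two snapshots of the same monitored quote.
before = Snapshot("example-stock", 40.0)
after = Snapshot("example-stock", 45.0)
print(increased_by_more_than(before, after, 0.10))
```

Expressing the condition as a predicate over two snapshots keeps the threshold check separate from whatever mechanism locates and compares the monitored portion of the document.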


Target Zone, Document Tree, Edit Mapping, Cisco System, Unordered Tree
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2001

Authors and Affiliations

  • S. Flesca (2)
  • F. Furfaro (2)
  • E. Masciari (1, 2)
  1. ISI-CNR, Rende, Italy
  2. DEIS, Univ. della Calabria, Rende, Italy
