Discovering semantic proximity for web pages

  • Matthew Merzbacher
Communications 3A Learning and Knowledge Discovery
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1609)


Dynamic Nearness is a data mining algorithm that detects semantic relationships between objects in a database, based on access patterns. This approach can be applied to web pages to allow automatic dynamic reconfiguration of a web site. Worst-case storage requirements for the algorithm are quadratic in the number of web pages, but practical reductions, such as ignoring the few long transactions that provide little information, reduce storage requirements to linear. Thus, Dynamic Nearness scales to large systems. The methodology is validated via experiments run on a moderately sized existing web site.
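
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of deriving page proximity from access patterns. It assumes that a transaction is a list of page URLs visited together and that proximity is a plain co-access count. The names `co_access_counts`, `nearest`, and the `max_len` cutoff are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def co_access_counts(transactions, max_len=10):
    """Count how often each unordered pair of pages shares a transaction.

    Assumption: transactions with more than max_len distinct pages are
    skipped, as a stand-in for "ignoring a few long transactions".
    """
    counts = Counter()
    for pages in transactions:
        distinct = sorted(set(pages))
        if len(distinct) > max_len:
            continue  # a long transaction would add many weak pairs
        for a, b in combinations(distinct, 2):
            counts[(a, b)] += 1
    return counts


def nearest(counts, page, k=3):
    """Return up to k pages most often co-accessed with the given page."""
    scores = Counter()
    for (a, b), n in counts.items():
        if a == page:
            scores[b] += n
        elif b == page:
            scores[a] += n
    return [p for p, _ in scores.most_common(k)]


if __name__ == "__main__":
    # Hypothetical access log: three short transactions and one long one.
    log = [
        ["/home", "/courses", "/faculty"],
        ["/courses", "/faculty"],
        ["/home", "/courses"],
        [f"/p{i}" for i in range(20)],
    ]
    pair_counts = co_access_counts(log, max_len=10)
    print(nearest(pair_counts, "/faculty", k=1))
```

In this sketch every pair of pages could in principle receive a counter, which is where the quadratic worst case comes from. Capping the transaction length bounds the number of pairs any single transaction can add. Whether that matches the reduction used in the paper is an assumption.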


learning and knowledge discovery; intelligent information systems; semantic distance metrics; data mining; world wide web





Copyright information

© Springer-Verlag Berlin Heidelberg 1999

Authors and Affiliations

  • Matthew Merzbacher, Mills College, Oakland, USA
