Link Proximity Analysis - Clustering Websites by Examining Link Proximity

  • Conference paper
Research and Advanced Technology for Digital Libraries (ECDL 2010)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6273)

Abstract

This research-in-progress paper presents a new approach called Link Proximity Analysis (LPA) for identifying related web pages based on link analysis. In contrast to current techniques, which ignore intra-page link analysis, the approach put forth here examines the positions of links relative to each other within websites. It builds on the observation that, on nearly every web page, the proximity of links to each other correlates clearly with the subject-relatedness of the linked websites. By statistically analyzing this relationship and measuring the number of sentences, paragraphs, etc. between two links, related websites can be identified automatically, as a first study has shown.
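
The idea of scoring link pairs by how close they appear on a page can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class and function names, the naive punctuation-based sentence splitting, and the weights (1.0 for links in the same sentence, 0.5 for the same paragraph, 0.25 for anywhere else on the page) are all assumptions chosen for illustration, and the sample page is made up.

```python
from html.parser import HTMLParser
from itertools import combinations


class LinkPositionParser(HTMLParser):
    """Record, for every <a href>, the paragraph and sentence it occurs in."""

    def __init__(self):
        super().__init__()
        self.paragraph = 0  # index of the current <p> block
        self.sentence = 0   # running sentence counter across the page
        self.links = []     # (href, paragraph index, sentence index)

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.paragraph += 1
            self.sentence += 1  # a new paragraph always starts a new sentence
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append((href, self.paragraph, self.sentence))

    def handle_data(self, data):
        # Naive sentence splitting (assumption): every '.', '!' or '?' ends a sentence.
        self.sentence += sum(data.count(ch) for ch in ".!?")


def proximity_weight(a, b):
    """Hypothetical weights: the closer two links are, the higher the score."""
    _, par_a, sen_a = a
    _, par_b, sen_b = b
    if sen_a == sen_b:
        return 1.0   # same sentence
    if par_a == par_b:
        return 0.5   # same paragraph
    return 0.25      # same page only


def link_proximity(html):
    """Return {(url_1, url_2): weight} for every pair of distinct link targets."""
    parser = LinkPositionParser()
    parser.feed(html)
    scores = {}
    for a, b in combinations(parser.links, 2):
        if a[0] == b[0]:
            continue  # skip pairs that point to the same target
        pair = tuple(sorted((a[0], b[0])))
        # Keep the strongest (closest) occurrence of each pair.
        scores[pair] = max(scores.get(pair, 0.0), proximity_weight(a, b))
    return scores


# Made-up sample page used only to exercise the sketch.
SAMPLE = (
    '<p>Compare <a href="http://a.example">A</a> with '
    '<a href="http://b.example">B</a>. '
    'See also <a href="http://c.example">C</a>.</p>\n'
    '<p>Unrelated: <a href="http://d.example">D</a>.</p>'
)

for pair, weight in sorted(link_proximity(SAMPLE).items()):
    print(pair, weight)
```

Aggregating such pair scores over many pages would give a similarity matrix that a clustering step could consume; how the paper itself weights or aggregates proximity is not stated in the abstract.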




© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Gipp, B., Taylor, A., Beel, J. (2010). Link Proximity Analysis - Clustering Websites by Examining Link Proximity. In: Lalmas, M., Jose, J., Rauber, A., Sebastiani, F., Frommholz, I. (eds) Research and Advanced Technology for Digital Libraries. ECDL 2010. Lecture Notes in Computer Science, vol 6273. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15464-5_54

  • DOI: https://doi.org/10.1007/978-3-642-15464-5_54

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-15463-8

  • Online ISBN: 978-3-642-15464-5

  • eBook Packages: Computer Science, Computer Science (R0)
