Abstract
This research-in-progress paper presents a new approach called Link Proximity Analysis (LPA) for identifying related web pages based on link analysis. In contrast to current techniques, which ignore intra-page link analysis, the approach put forth here examines the positioning of links relative to each other within websites. It exploits the observation that, on nearly every web page, the proximity of links to each other correlates clearly with the subject-relatedness of the linked websites. By statistically analyzing this relationship and measuring the number of sentences, paragraphs, etc. between two links, related websites can be identified automatically, as a first study has shown.
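The idea of scoring link pairs by their textual distance can be illustrated with a minimal Python sketch. It is not the authors' implementation: the weight values in `WEIGHTS`, the names `LinkPositionParser` and `link_proximity`, the use of `<p>` tags as paragraph boundaries, and the punctuation-based sentence splitting are all assumptions made for illustration, since the abstract does not specify them.

```python
from html.parser import HTMLParser
from itertools import combinations
import re

# Hypothetical weights: the abstract gives no values, these are placeholders.
WEIGHTS = {"sentence": 1.0, "paragraph": 0.5, "page": 0.25}


class LinkPositionParser(HTMLParser):
    """Record (href, paragraph index, sentence index) for every <a> tag."""

    def __init__(self):
        super().__init__()
        self.paragraph = 0
        self.sentence = 0
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            # A new paragraph also starts a new sentence.
            self.paragraph += 1
            self.sentence += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append((href, self.paragraph, self.sentence))

    def handle_data(self, data):
        # Crude sentence boundary: ., ! or ? followed by whitespace or end of chunk.
        self.sentence += len(re.findall(r"[.!?](?:\s|$)", data))


def link_proximity(html):
    """Return {(url_a, url_b): score}, where closer link pairs score higher."""
    parser = LinkPositionParser()
    parser.feed(html)
    scores = {}
    for (u, pu, su), (v, pv, sv) in combinations(parser.links, 2):
        if u == v:
            continue  # a page linking twice to the same target says nothing here
        if su == sv:
            level = "sentence"
        elif pu == pv:
            level = "paragraph"
        else:
            level = "page"
        pair = tuple(sorted((u, v)))
        scores[pair] = scores.get(pair, 0.0) + WEIGHTS[level]
    return scores


example_html = (
    "<p>See <a href='a.com'>A</a> and <a href='b.com'>B</a>. "
    "Also <a href='c.com'>C</a>.</p>"
    "<p>Elsewhere: <a href='d.com'>D</a>.</p>"
)
example_scores = link_proximity(example_html)
```

In this sketch, links sharing a sentence receive the highest weight, links sharing only a paragraph a lower one, and links that merely share the page the lowest. Summing such scores over many pages would give a per-pair relatedness value that a clustering step could consume; how the paper aggregates or normalizes them is not stated in the abstract.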
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gipp, B., Taylor, A., Beel, J. (2010). Link Proximity Analysis - Clustering Websites by Examining Link Proximity. In: Lalmas, M., Jose, J., Rauber, A., Sebastiani, F., Frommholz, I. (eds) Research and Advanced Technology for Digital Libraries. ECDL 2010. Lecture Notes in Computer Science, vol 6273. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15464-5_54
DOI: https://doi.org/10.1007/978-3-642-15464-5_54
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15463-8
Online ISBN: 978-3-642-15464-5
eBook Packages: Computer Science (R0)