A Rule-Based Approach for Extraction of Link-Context from Anchor-Text Structure

  • Suresh Kumar
  • Naresh Kumar
  • Manjeet Singh
  • Asok De
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 182)

Abstract

Researchers have widely explored the use of link-context to determine the theme of a target web-page. Link-context has been applied in areas such as search engines, focused crawlers, and automatic classification. Extracting precise link-context is therefore an important factor in retrieving more relevant information from a web-page. In this paper, we propose a rule-based approach for extracting the link-context from anchor-text (AT) structure using a bottom-up simple LR (SLR) parser. We consider only named entity (NE) anchor-text. To validate the proposed approach, we use a sample of 4 ATs. The results show that the proposed LCEA extracted the actual link-context of every AT considered (100%).

Keywords

Ontology, Augmented Context-Embedded grammar, SLR parser, Indexing, Focused-Crawling, Semantic-Web, NLP, Bare-Concept

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Suresh Kumar (1)
  • Naresh Kumar (2)
  • Manjeet Singh (3)
  • Asok De (1)

  1. Institute of Advanced Communication Technologies & Research, Delhi, India
  2. AIIT, Amity University, Noida, India
  3. YMCA University of Science & Technology, Faridabad, India
