A Technique for Information Retrieval from Microformatted Websites
In this work, we introduce a new method for information extraction from the semantic web. The fundamental idea is to model the semantic information contained in the microformats of a set of web pages using a data structure called a semantic network. We then introduce a novel technique for information extraction from semantic networks. In particular, the technique allows us to extract a portion (a slice) of the semantic network with respect to some criterion of interest. The resulting slice represents the relevant information retrieved from the semantic network and thus from the semantic web. Our approach can be used to design novel tools for information retrieval and presentation, and for filtering information that is distributed across the semantic web.
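To make the idea of a slice concrete, the following is a minimal sketch, not the algorithm of the paper, whose details are not given in this abstract. It assumes that the semantic network is a directed graph with labeled edges and that the slice for a criterion node consists of every edge reachable from that node by following edges forward. The triples, the node names, and the function `slice_network` are hypothetical and chosen only for illustration.

```python
from collections import deque

# Hypothetical semantic network as (source, relation, target) triples.
# The data is invented for illustration and loosely mimics hCard properties.
EDGES = [
    ("page1", "contains", "hcard:Alice"),
    ("hcard:Alice", "org", "Acme"),
    ("hcard:Alice", "url", "http://example.org/alice"),
    ("page2", "contains", "hcard:Bob"),
    ("hcard:Bob", "org", "Acme"),
    ("Acme", "locality", "Valencia"),
]


def slice_network(edges, criterion):
    """Return the edges reachable from `criterion` by following edges forward.

    Assumption: forward reachability stands in for the slicing criterion;
    the paper may define the slice differently.
    """
    # Build an adjacency list: node -> list of (relation, target).
    adjacency = {}
    for source, relation, target in edges:
        adjacency.setdefault(source, []).append((relation, target))

    visited = {criterion}
    queue = deque([criterion])
    result = []
    # Breadth-first traversal collecting every edge that leaves a visited node.
    while queue:
        node = queue.popleft()
        for relation, target in adjacency.get(node, []):
            result.append((node, relation, target))
            if target not in visited:
                visited.add(target)
                queue.append(target)
    return result


if __name__ == "__main__":
    for edge in slice_network(EDGES, "hcard:Alice"):
        print(edge)
```

Under these assumptions, slicing with respect to `hcard:Alice` keeps her organization, her URL, and the locality of the organization, and discards everything that concerns only `hcard:Bob` or the containing pages.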