Abstract
This article introduces an Ontology-based Web information extraction system, together with background on Web information extraction and Ontology. It describes the modules of the system: Web page preprocessing, DOM tree formation, information domain positioning, lexical analysis, Ontology construction, Ontology analysis, keyword management, rule generation, information extraction, and information storage. It then describes the experimental results and, finally, the development trends and challenges of Ontology-based Web information extraction.
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Mo, Q., Chen, Yh. (2012). Ontology-Based Web Information Extraction. In: Zhao, M., Sha, J. (eds) Communications and Information Processing. Communications in Computer and Information Science, vol 288. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31965-5_14
DOI: https://doi.org/10.1007/978-3-642-31965-5_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-31964-8
Online ISBN: 978-3-642-31965-5
eBook Packages: Computer Science, Computer Science (R0)