Coping with Web Knowledge
The web is arguably the largest existing information repository. Extracting information from it has attracted the interest of many researchers, who have developed intelligent algorithms (wrappers) that extract structured syntactic information automatically.
In this article, we formalise a new solution for extracting knowledge from today’s non-semantic web. It is novel in that it associates semantics with the extracted information, which improves agent interoperability; furthermore, it delegates the knowledge extraction procedure to specialist agents, which eases software development and promotes software reuse and maintainability.
Keywords: knowledge extraction, wrappers, web agents, ontologies