Coping with Web Knowledge

  • J. L. Arjona
  • R. Corchuelo
  • J. Peña
  • D. Ruiz
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2663)


The web seems to be the biggest existing information repository. The extraction of information from this repository has attracted the interest of many researchers, who have developed intelligent algorithms (wrappers) able to extract structured syntactic information automatically.

In this article, we formalise a new solution for extracting knowledge from today’s non-semantic web. It is novel in that it associates semantics with the extracted information, which improves agent interoperability; furthermore, it delegates the knowledge extraction procedure to specialist agents, easing software development and promoting software reuse and maintainability.


knowledge extraction, wrappers, web agents and ontologies





Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • J. L. Arjona
    • 1
  • R. Corchuelo
    • 1
  • J. Peña
    • 1
  • D. Ruiz
    • 1
  1. The Distributed Group, Sevilla, Spain
