Towards Semantic Annotation Supported by Dependency Linguistics and ILP

  • Jan Dědek
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6497)


In this paper we present a method for semantic annotation of texts that is based on deep linguistic analysis (DLA) and Inductive Logic Programming (ILP). The combination of DLA and ILP has the following benefits: manual selection of learning features is not needed; the learning procedure has the full linguistic information available at its disposal and is capable of selecting the relevant parts itself; and the learned extraction rules can be easily visualized, understood and adapted by a human. The main contributions of the paper are a description, an implementation and an initial evaluation of the method.


Semantic Annotation, Dependency Linguistics, Inductive Logic Programming, Information Extraction, Machine Learning



Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Jan Dědek, Department of Software Engineering, Charles University, Prague, Czech Republic
