
Preliminary Study on Automatic Induction of Rules for Recognition of Semantic Relations between Proper Names in Polish Texts

  • Michał Marcińczuk
  • Marcin Ptak
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7499)

Abstract

In this paper we present preliminary work on the automatic construction of rules for recognizing semantic relations between pairs of proper names in Polish texts. Our goal was to check the feasibility of automatic rule construction using an existing inductive logic programming (ILP) system as an alternative or supplement to manual rule creation. We present a set of first-order logic predicates used to represent the semantic relation recognition task. The background knowledge encodes morphological, orthographic, and named entity-based features. We applied ILP to the proposed representation to generate rules for relation extraction, using an existing ILP system called Aleph [1]. The performance of the automatically generated rules was compared with that of a set of hand-crafted rules developed on the basis of the training set for 8 categories of relations (affiliation, alias, creator, composition, location, nationality, neighbourhood, origin). Finally, we propose several ways to improve the preliminary results in future work.
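The abstract does not list the actual predicates or rules, so the following is only a hypothetical Python sketch of the general idea: tokens carry morphological, orthographic, and named-entity features, a rule is a conjunction of conditions over a pair of proper names, and the rule is scored against gold annotations. The `Token` fields, the `rule_location` rule, the example sentence, and the category names `person` and `city` are all assumptions made for illustration; they are not taken from the paper or from Aleph.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Token:
    orth: str           # surface form as written in the text
    base: str           # lemma (morphological feature)
    pos: str            # part-of-speech tag (morphological feature)
    ne: Optional[str]   # named-entity category, None for ordinary words


def starts_upper(tok):
    """Orthographic feature: the token begins with an upper-case letter."""
    return tok.orth[:1].isupper()


def rule_location(tokens, i, j):
    """Hypothetical rule, roughly:
    location(A, B) :- ne(A, person), ne(B, city), precedes(A, B),
                      starts_upper(B), previous token of B is the preposition 'w'.
    """
    a, b = tokens[i], tokens[j]
    return (
        i < j
        and a.ne == "person"
        and b.ne == "city"
        and starts_upper(b)
        and tokens[j - 1].base == "w"
        and tokens[j - 1].pos == "prep"
    )


def apply_rule(rule, tokens):
    """Return all ordered pairs of proper-name positions accepted by the rule."""
    names = [k for k, t in enumerate(tokens) if t.ne is not None]
    return {(i, j) for i in names for j in names if i != j and rule(tokens, i, j)}


def precision_recall(predicted, gold):
    """Score predicted pairs against gold pairs; empty sets score 0.0."""
    tp = len(predicted & gold)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    return precision, recall


# Invented example sentence: "Kowalski mieszka w Warszawie" (Kowalski lives in Warsaw).
sentence = [
    Token("Kowalski", "Kowalski", "subst", "person"),
    Token("mieszka", "mieszkać", "fin", None),
    Token("w", "w", "prep", None),
    Token("Warszawie", "Warszawa", "subst", "city"),
]
gold = {(0, 3)}  # invented gold annotation: location(Kowalski, Warszawie)
predicted = apply_rule(rule_location, sentence)
print(predicted, precision_recall(predicted, gold))
```

In an ILP setting the conjunction inside `rule_location` would be induced from positive and negative examples rather than written by hand; the scoring step is the same for induced and hand-crafted rules, which is what makes the comparison described in the abstract possible.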

Keywords

Semantic Relations, Named Entities, Proper Names, Rule Induction, ILP, Polish


References

  1. Srinivasan, A.: The Aleph Manual (2006), http://www.cs.ox.ac.uk/activities/machlearn/Aleph/aleph.html
  2. Linguistic Data Consortium (LDC): ACE (Automatic Content Extraction) English Annotation Guidelines for Relations (2008)
  3. Pyysalo, S., Ohta, T., Tsujii, J.: Overview of the Entity Relations (REL) supporting task of BioNLP Shared Task 2011. In: Proceedings of BioNLP Shared Task 2011 Workshop, June 24, pp. 83–88. Association for Computational Linguistics, Portland (2011)
  4. Marciniak, M., Mykowiecka, A.: Automatic processing of diabetic patients’ hospital documentation. In: Annual Meeting of the ACL (2007)
  5. Patwardhan, S., Riloff, E.: Learning Domain-Specific Information Extraction Patterns from the Web. In: ACL 2006 Workshop on Information Extraction Beyond the Document (2006)
  6. Califf, M.E.: Relational learning techniques for natural language information extraction. PhD thesis, The University of Texas at Austin (1998)
  7. Freitag, D.: Machine learning for information extraction in informal domains. PhD thesis, Carnegie Mellon University (1998)
  8. Wróblewska, A., Woliński, M.: Preliminary Experiments in Polish Dependency Parsing. In: Bouvry, P., Kłopotek, M.A., Leprévost, F., Marciniak, M., Mykowiecka, A., Rybiński, H. (eds.) SIIS 2011. LNCS, vol. 7053, pp. 279–292. Springer, Heidelberg (2012)
  9. Marcińczuk, M., Janicki, M.: Optimizing CRF-Based Model for Proper Name Recognition in Polish Texts. In: Gelbukh, A. (ed.) CICLing 2012, Part I. LNCS, vol. 7181, pp. 258–269. Springer, Heidelberg (2012)
  10. Broda, B., Marcińczuk, M., Maziarz, M., Radziszewski, A., Wardyński, A.: KPWr: Towards a Free Corpus of Polish. In: Proceedings of the 8th ELRA Conference on Language Resources and Evaluation LREC 2012, Istanbul, Turkey (2012)
  11. Marcińczuk, M., Stanek, M., Piasecki, M., Musiał, A.: Rich Set of Features for Proper Name Recognition in Polish Texts. In: Bouvry, P., Kłopotek, M.A., Leprévost, F., Marciniak, M., Mykowiecka, A., Rybiński, H. (eds.) SIIS 2011. LNCS, vol. 7053, pp. 332–344. Springer, Heidelberg (2012)
  12. Quinlan, J.R., Cameron-Jones, R.M.: FOIL: A Midterm Report. In: Brazdil, P.B. (ed.) ECML 1993. LNCS, vol. 667, pp. 3–20. Springer, Heidelberg (1993)
  13. Muggleton, S., Feng, C.: Efficient induction in logic programs. In: Muggleton, S. (ed.) Inductive Logic Programming, pp. 281–298. Academic Press (1992)
  14. Muggleton, S.: Inverse Entailment and Progol. New Generation Computing Journal 13, 245–286 (1995), http://www.doc.ic.ac.uk/~shm/progol.html

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Michał Marcińczuk (1)
  • Marcin Ptak (1)
  1. Wrocław University of Technology, Wrocław, Poland
