Statistical Recognition of References in Czech Court Decisions

  • Vincent Kríž
  • Barbora Hladká
  • Jan Dědek
  • Martin Nečaský
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8856)

Abstract

We address the detection and classification of references in Czech court decisions, focusing mainly on references to other court decisions and to acts. We are also interested in detecting the institutions that issued the documents under consideration. We treat these references as entities in a Named Entity Recognition task. We approach the task with machine learning methods, namely HMM and the Perceptron algorithm, and we report an F-measure of over 90% averaged over all entities. The results significantly outperform previously published systems.

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Vincent Kríž (1)
  • Barbora Hladká (1)
  • Jan Dědek (2)
  • Martin Nečaský (2)

  1. Institute of Formal and Applied Linguistics, Charles University in Prague, Praha 1, Czech Republic
  2. Department of Software Engineering, Faculty of Mathematics and Physics, Charles University in Prague, Praha 1, Czech Republic
