Automatic Identification of Legal Terms in Czech Law Texts

  • Chapter
Semantic Processing of Legal Texts

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6036)

Abstract

Law texts, including the constitution, acts, public notices, and court judgements, form a huge database of texts. As with many texts from narrow domains, the sublanguage used is partially restricted and differs from general Czech. As a starting collection of data, we chose the legal database Lexis, which contains approximately 50,000 Czech law documents. We concentrate mostly on noun groups, which are the main candidates for legal terms. We were able to recognize 3992 distinct noun groups of this kind in the selected text samples. The paper also presents the results of morphological analysis, lemmatization, tagging, disambiguation, and basic syntactic analysis of Czech law texts, as these tasks are crucial for any more sophisticated natural language processing. Verbs in legal texts have also been explored in a preliminary way. In this respect, we explore how linguistic analysis can help identify the semantic nature of legal terms.

© 2010 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Pala, K., Rychlý, P., Šmerk, P. (2010). Automatic Identification of Legal Terms in Czech Law Texts. In: Francesconi, E., Montemagni, S., Peters, W., Tiscornia, D. (eds) Semantic Processing of Legal Texts. Lecture Notes in Computer Science, vol 6036. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12837-0_5

  • DOI: https://doi.org/10.1007/978-3-642-12837-0_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-12836-3

  • Online ISBN: 978-3-642-12837-0

  • eBook Packages: Computer Science, Computer Science (R0)
