EuroVoc-Based Summarization of European Case Law

  • Florian Schmedding
  • Peter Klügl
  • David Baehrens
  • Christian Simon
  • Kai Simon
  • Katrin Tomanek
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10791)


This work reports on the ongoing development of a multilingual pipeline for summarizing European case law. We apply the TextRank algorithm to concepts of the EuroVoc thesaurus in order to extract summarizing keywords and sentences. In a first case study, we demonstrate the feasibility and usefulness of the presented approach for five different languages and 18 document sources.
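The idea of ranking thesaurus concepts rather than raw words can be illustrated with a minimal sketch. It is not the pipeline described in this paper: the toy lexicon, the concept labels (which are not real EuroVoc identifiers), the sentence-level co-occurrence graph, and the sentence scoring by summed concept scores are all assumptions made for illustration. Only the ranking step follows the standard unweighted TextRank update.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy lexicon: surface terms mapped to made-up concept labels.
# These labels are NOT real EuroVoc identifiers. Terms from several languages
# may map to one concept, so the ranking works on language-independent nodes.
LEXICON = {
    "court": "C_COURT",
    "tribunal": "C_COURT",
    "gericht": "C_COURT",
    "judgment": "C_JUDGMENT",
    "ruling": "C_JUDGMENT",
    "appeal": "C_APPEAL",
    "tax": "C_TAX",
}


def concepts_in(sentence):
    """Return the concept labels mentioned in a sentence, in token order."""
    tokens = (t.strip(".,;:()").lower() for t in sentence.split())
    return [LEXICON[t] for t in tokens if t in LEXICON]


def build_graph(sentences):
    """Link two distinct concepts whenever they co-occur in a sentence."""
    graph = defaultdict(set)
    for sentence in sentences:
        for a, b in combinations(sorted(set(concepts_in(sentence))), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def textrank(graph, damping=0.85, iterations=50):
    """Unweighted TextRank on an undirected graph, by fixed-point iteration."""
    scores = {node: 1.0 for node in graph}
    for _ in range(iterations):
        scores = {
            node: (1 - damping)
            + damping * sum(scores[n] / len(graph[n]) for n in graph[node])
            for node in graph
        }
    return scores


def summarize(sentences, top_k=1):
    """Return concepts ordered by score and the top_k highest-scoring sentences."""
    scores = textrank(build_graph(sentences))
    keywords = sorted(scores, key=scores.get, reverse=True)

    def sentence_score(sentence):
        # Assumption: a sentence is as important as the concepts it mentions.
        return sum(scores.get(c, 0.0) for c in set(concepts_in(sentence)))

    ranked = sorted(sentences, key=sentence_score, reverse=True)
    return keywords, ranked[:top_k]


SENTENCES = [
    "The court issued a judgment on the appeal.",
    "The tribunal ruled on tax.",
    "The weather was fine.",
]
keywords, summary = summarize(SENTENCES)
```

In this toy input the court concept is linked to every other concept, so it ranks first among the keywords, and the first sentence is selected because it mentions the most highly ranked concepts. A real system would replace the lexicon lookup with proper concept annotation against the thesaurus.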



Parts of this work have been supported by the European Commission under the 7th Framework Programme through the project EUCases (EUropean and National CASE Law and Legislation Linked in Open Data Stack; grant agreement no. 611760). We also gratefully acknowledge the effort of all legal experts who completed the questionnaires.



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Florian Schmedding (1), email author
  • Peter Klügl (1)
  • David Baehrens (1)
  • Christian Simon (2)
  • Kai Simon (3)
  • Katrin Tomanek (4)

  1. Averbis GmbH, Freiburg, Germany
  2. INTER CHALET Ferienhaus-Gesellschaft mbH, Freiburg, Germany
  3. European Patent Office, Rijswijk, The Netherlands
  4. VigLink Inc., San Francisco, USA
