Abstract
The number of available legal documents has grown enormously in recent years, and the digital processing of such materials creates a need for systems that support the automatic extraction of relevant information. This work presents a system for ontology-based information extraction from natural language texts that is able to identify a set of legal events. The system follows an innovative methodology that combines a domain ontology of legal events with a set of linguistic rules, integrated through an inference mechanism, resulting in a flexible and scalable approach. A case study using documents from the Superior Court in Brazil is reported, with satisfactory results in precision and recall.
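As a rough illustration of the idea summarized in the abstract, the sketch below pairs a toy ontology of event classes with trigger-based linguistic rules, infers the more general classes of each matched event, and scores the output with precision and recall. It is not the authors' implementation: the class names, the trigger lemmas, and the helper functions (`superclasses`, `extract`, `precision_recall`) are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical toy ontology: each event class maps to its parent class.
# These class names are illustrative assumptions, not taken from the paper.
ONTOLOGY = {
    "LegalEvent": None,
    "Judgment": "LegalEvent",
    "Conviction": "Judgment",
    "Acquittal": "Judgment",
}

# Hypothetical linguistic rules: a trigger verb lemma mapped to an event class.
RULES = {
    "convict": "Conviction",
    "acquit": "Acquittal",
}


@dataclass(frozen=True)
class Event:
    event_class: str
    trigger: str


def superclasses(cls):
    """Return cls followed by all of its ancestors, walking the parent links."""
    chain = []
    while cls is not None:
        chain.append(cls)
        cls = ONTOLOGY[cls]
    return chain


def extract(lemmas):
    """Apply the trigger rules to a list of lemmas and return the matched events."""
    return [Event(RULES[lemma], lemma) for lemma in lemmas if lemma in RULES]


def precision_recall(predicted, gold):
    """Compute precision and recall over sets of predicted and gold labels."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# A pre-lemmatized sentence (lemmatization itself is assumed to happen upstream).
lemmas = ["the", "court", "convict", "the", "defendant"]
events = extract(lemmas)

# Simple inference step: a Conviction is also a Judgment and a LegalEvent.
inferred = superclasses(events[0].event_class)

# Score one predicted label against a hypothetical gold annotation of two labels.
p, r = precision_recall({"Conviction"}, {"Conviction", "Acquittal"})
```

Keeping the rules and the ontology in separate structures reflects the flexibility the abstract mentions: under this assumed design, a new event type needs only a new ontology entry and a new rule, without changes to the extraction code.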
Cite this article
de Araujo, D.A., Rigo, S.J. & Barbosa, J.L.V. Ontology-based information extraction for juridical events with case studies in Brazilian legal realm. Artif Intell Law 25, 379–396 (2017). https://doi.org/10.1007/s10506-017-9203-z
DOI: https://doi.org/10.1007/s10506-017-9203-z