GaiusT 2.0: Evolution of a Framework for Annotating Legal Documents

Zeni, Nicola; Mich, Luisa; Mylopoulos, John

doi:10.1007/978-3-319-49157-8_4

Part of the book series: Communications in Computer and Information Science (CCIS, volume 672)

Included in the following conference series:

Research Conference on Metadata and Semantics Research

Abstract

Semantic annotation technologies support the extraction of legal concepts, for example rights and obligations, from legal documents. For software engineers, the final goal is to identify compliance requirements a software system has to fulfill in order to comply with a law or regulation. That implies analyzing and annotating legal documents in prescriptive natural language, still an open problem for research in the field. In this paper we describe GaiusT 2.0, a system for extracting requirements from legal documents. GaiusT 2.0 is the result of the evolution of GaiusT, and has been designed and implemented as a web-based system intended to semi-automate the extraction process. Results of the application of GaiusT 2.0 show that the new version improves performance of the extraction process and also makes the tool more usable.

Notes

1.
- Recall is a measure of how well the tool performs in finding relevant items: TP/(TP + FN);
- Precision is a measure of how well the tool performs in not returning irrelevant items: TP/(TP + FP);
- Fallout is a measure of how quickly precision drops as recall is increased: FP/(FP + TN);
- Accuracy is a measure of how well the tool identifies relevant items and rejects irrelevant ones: (TP + TN)/N;
- Error is a measure of how prone the tool is to accept irrelevant items and reject relevant ones: (FP + FN)/N;
- F-measure is the harmonic mean of recall and precision: 2 × Recall × Precision/(Recall + Precision),
where TP is the number of items correctly assigned to the category; FP is the number of items incorrectly assigned to the category; FN is the number of items incorrectly rejected from the category; TN is the number of items correctly rejected from the category; and N is the total number of items, N = TP + FP + FN + TN.
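
The six measures defined in this note can be computed directly from the four counts. The following is a minimal Python sketch, not code from GaiusT 2.0: the names `ConfusionCounts` and `evaluation_measures` are hypothetical, and returning 0.0 when a denominator is zero is an assumption made here for convenience, since the note does not say how that case is handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Hypothetical container for the four counts defined in the note."""
    tp: int  # items correctly assigned to the category
    fp: int  # items incorrectly assigned to the category
    fn: int  # items incorrectly rejected from the category
    tn: int  # items correctly rejected from the category

    @property
    def n(self) -> int:
        # Total number of items: N = TP + FP + FN + TN
        return self.tp + self.fp + self.fn + self.tn


def _ratio(numerator: float, denominator: float) -> float:
    # Assumption: an undefined ratio (zero denominator) is reported as 0.0.
    return numerator / denominator if denominator else 0.0


def evaluation_measures(c: ConfusionCounts) -> dict:
    """Return the six measures from the note as a dictionary."""
    recall = _ratio(c.tp, c.tp + c.fn)
    precision = _ratio(c.tp, c.tp + c.fp)
    return {
        "recall": recall,
        "precision": precision,
        "fallout": _ratio(c.fp, c.fp + c.tn),
        "accuracy": _ratio(c.tp + c.tn, c.n),
        "error": _ratio(c.fp + c.fn, c.n),
        "f_measure": _ratio(2 * recall * precision, recall + precision),
    }


if __name__ == "__main__":
    # Illustrative counts only; these are not results from the paper.
    counts = ConfusionCounts(tp=40, fp=10, fn=10, tn=40)
    for name, value in evaluation_measures(counts).items():
        print(f"{name}: {value:.3f}")
```

Since every item is either classified correctly or incorrectly, accuracy and error always sum to 1 whenever N is non-zero, which is a quick sanity check on any reported figures.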

Acknowledgements

Nicola Zeni’s work was supported in part by the ERC advanced grant 267856 ‘Lucretius: Foundations for Software Evolution’. Our thanks go to the experts and users who made themselves available to test GaiusT 2.0.

Author information

Authors and Affiliations

Department of Information Engineering and Computer Science, University of Trento, Trento, Italy
Nicola Zeni
Department of Industrial Engineering, University of Trento, Trento, Italy
Luisa Mich
School of Electrical Engineering and Computer Science, University of Ottawa, Ottawa, Canada
John Mylopoulos

Corresponding author

Correspondence to Luisa Mich.

Editor information

Editors and Affiliations

Alexander Technological Educational Institute of Thessaloniki, Thessaloniki, Greece
Emmanouel Garoufallou
Alexander Technological Educational Inst, Rome, Italy
Imma Subirats Coll
Sapienza University of Rome, Rome, Italy
Armando Stellato
Drexel University, Philadelphia, Pennsylvania, USA
Jane Greenberg

Cite this paper

Zeni, N., Mich, L., Mylopoulos, J. (2016). GaiusT 2.0: Evolution of a Framework for Annotating Legal Documents. In: Garoufallou, E., Subirats Coll, I., Stellato, A., Greenberg, J. (eds) Metadata and Semantics Research. MTSR 2016. Communications in Computer and Information Science, vol 672. Springer, Cham. https://doi.org/10.1007/978-3-319-49157-8_4

DOI: https://doi.org/10.1007/978-3-319-49157-8_4
Published: 04 November 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-49156-1
Online ISBN: 978-3-319-49157-8
eBook Packages: Computer Science, Computer Science (R0)
