Abstract
Semantic annotation technologies support the extraction of legal concepts, for example rights and obligations, from legal documents. For software engineers, the final goal is to identify the requirements a software system has to fulfill in order to comply with a law or regulation. This requires analyzing and annotating legal documents written in prescriptive natural language, which is still an open research problem in the field. In this paper we describe GaiusT 2.0, a system for extracting requirements from legal documents. GaiusT 2.0 is an evolution of GaiusT, designed and implemented as a web-based system intended to semi-automate the extraction process. Results of applying GaiusT 2.0 show that the new version improves the performance of the extraction process and also makes the tool more usable.
Notes
1. The measures are defined as follows:
- Recall is a measure of how well the tool performs in finding relevant items: TP/(TP + FN);
- Precision is a measure of how well the tool performs in not returning irrelevant items: TP/(TP + FP);
- Fallout is a measure of how quickly precision drops as recall is increased: FP/(FP + TN);
- Accuracy is a measure of how well the tool identifies relevant items and rejects irrelevant ones: (TP + TN)/N;
- Error is a measure of how prone the tool is to accepting irrelevant items and rejecting relevant ones: (FP + FN)/N;
- F-measure is the harmonic mean of recall and precision: 2 × Recall × Precision/(Recall + Precision);
where TP is the number of items correctly assigned to the category; FP is the number of items incorrectly assigned to the category; FN is the number of items incorrectly rejected from the category; TN is the number of items correctly rejected from the category; and N is the total number of items, N = TP + FP + FN + TN.
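The measures defined in this note can be computed directly from the four counts. The following is a minimal sketch, not part of GaiusT 2.0: the names `ConfusionCounts` and `evaluation_measures` are hypothetical, the counts in the usage lines are made up for illustration, and returning 0.0 when a denominator is zero is an assumption of this sketch rather than something the note specifies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for one annotation category (hypothetical helper type)."""
    tp: int  # items correctly assigned to the category
    fp: int  # items incorrectly assigned to the category
    fn: int  # items incorrectly rejected from the category
    tn: int  # items correctly rejected from the category

    @property
    def total(self) -> int:
        # N = TP + FP + FN + TN
        return self.tp + self.fp + self.fn + self.tn


def _ratio(numerator: float, denominator: float) -> float:
    # Assumption of this sketch: an undefined ratio is reported as 0.0.
    return numerator / denominator if denominator else 0.0


def evaluation_measures(c: ConfusionCounts) -> dict:
    """Compute the six measures exactly as defined in the note."""
    recall = _ratio(c.tp, c.tp + c.fn)
    precision = _ratio(c.tp, c.tp + c.fp)
    return {
        "recall": recall,
        "precision": precision,
        "fallout": _ratio(c.fp, c.fp + c.tn),
        "accuracy": _ratio(c.tp + c.tn, c.total),
        "error": _ratio(c.fp + c.fn, c.total),
        "f_measure": _ratio(2 * recall * precision, recall + precision),
    }


if __name__ == "__main__":
    # Illustrative counts only; these are not results from the paper.
    counts = ConfusionCounts(tp=40, fp=10, fn=20, tn=30)
    for name, value in evaluation_measures(counts).items():
        print(f"{name}: {value:.3f}")
```

Since N covers all four counts, accuracy and error always sum to 1 whenever N is non-zero, which is a quick sanity check on any reported figures.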
Acknowledgements
Nicola Zeni’s work was supported in part by the ERC advanced grant 267856 ‘Lucretius: Foundations for Software Evolution’. Our thanks go to the experts and users who made themselves available to test GaiusT 2.0.
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Zeni, N., Mich, L., Mylopoulos, J. (2016). GaiusT 2.0: Evolution of a Framework for Annotating Legal Documents. In: Garoufallou, E., Subirats Coll, I., Stellato, A., Greenberg, J. (eds) Metadata and Semantics Research. MTSR 2016. Communications in Computer and Information Science, vol 672. Springer, Cham. https://doi.org/10.1007/978-3-319-49157-8_4
DOI: https://doi.org/10.1007/978-3-319-49157-8_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-49156-1
Online ISBN: 978-3-319-49157-8
eBook Packages: Computer Science, Computer Science (R0)