GaiusT 2.0: Evolution of a Framework for Annotating Legal Documents

  • Conference paper
Metadata and Semantics Research (MTSR 2016)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 672)

Abstract

Semantic annotation technologies support the extraction of legal concepts, such as rights and obligations, from legal documents. For software engineers, the final goal is to identify the compliance requirements a software system must fulfill to comply with a law or regulation. This implies analyzing and annotating legal documents written in prescriptive natural language, which is still an open research problem. In this paper we describe GaiusT 2.0, a system for extracting requirements from legal documents. GaiusT 2.0 is an evolution of GaiusT, designed and implemented as a web-based system intended to semi-automate the extraction process. Results from applying GaiusT 2.0 show that the new version improves the performance of the extraction process and makes the tool more usable.

Notes

  1.
    • Recall is a measure of how well the tool performs in finding relevant items: TP/(TP + FN);

    • Precision is a measure of how well the tool performs in not returning irrelevant items: TP/(TP + FP);

    • Fallout is a measure of how quickly precision drops as recall is increased: FP/(FP + TN);

    • Accuracy is a measure of how well the tool identifies relevant items and rejects irrelevant ones: (TP + TN)/N;

    • Error is a measure of how prone the tool is to accept irrelevant items and reject relevant ones: (FP + FN)/N;

    • F-measure is the harmonic mean of recall and precision: 2 × Recall × Precision/(Recall + Precision),

    where TP is the number of items correctly assigned to the category; FP is the number of items incorrectly assigned to the category; FN is the number of items incorrectly rejected from the category; TN is the number of items correctly rejected from the category; and N is the total number of items, N = TP + FP + FN + TN.
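
The six measures defined in this note can be computed directly from the four counts. The following is a minimal Python sketch, not code from GaiusT 2.0: the names `Confusion` and `evaluation_measures` are hypothetical, the counts in the usage lines are invented for illustration and are not results from the paper, and returning 0.0 when a denominator is zero is an assumption made here to keep the example total.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Confusion:
    """Counts for one annotation category (hypothetical container)."""
    tp: int  # items correctly assigned to the category
    fp: int  # items incorrectly assigned to the category
    fn: int  # items incorrectly rejected from the category
    tn: int  # items correctly rejected from the category

    @property
    def n(self) -> int:
        # N = TP + FP + FN + TN
        return self.tp + self.fp + self.fn + self.tn


def _ratio(numerator: float, denominator: float) -> float:
    # Assumption: an undefined ratio (zero denominator) is reported as 0.0.
    return numerator / denominator if denominator else 0.0


def evaluation_measures(c: Confusion) -> Dict[str, float]:
    """Compute the measures exactly as defined in the note above."""
    recall = _ratio(c.tp, c.tp + c.fn)
    precision = _ratio(c.tp, c.tp + c.fp)
    return {
        "recall": recall,
        "precision": precision,
        "fallout": _ratio(c.fp, c.fp + c.tn),
        "accuracy": _ratio(c.tp + c.tn, c.n),
        "error": _ratio(c.fp + c.fn, c.n),
        "f_measure": _ratio(2 * recall * precision, recall + precision),
    }


# Illustrative counts only (not taken from the paper).
measures = evaluation_measures(Confusion(tp=40, fp=10, fn=20, tn=30))
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

Because accuracy and error share the denominator N and their numerators together cover all four counts, they always sum to 1 for a non-empty sample, which is a quick sanity check on any reported figures.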

Acknowledgements

Nicola Zeni’s work was supported in part by the ERC advanced grant 267856 ‘Lucretius: Foundations for Software Evolution’. Our thanks go to the experts and users who made themselves available to test GaiusT 2.0.

Author information

Corresponding author

Correspondence to Luisa Mich.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Zeni, N., Mich, L., Mylopoulos, J. (2016). GaiusT 2.0: Evolution of a Framework for Annotating Legal Documents. In: Garoufallou, E., Subirats Coll, I., Stellato, A., Greenberg, J. (eds) Metadata and Semantics Research. MTSR 2016. Communications in Computer and Information Science, vol 672. Springer, Cham. https://doi.org/10.1007/978-3-319-49157-8_4

  • DOI: https://doi.org/10.1007/978-3-319-49157-8_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-49156-1

  • Online ISBN: 978-3-319-49157-8

  • eBook Packages: Computer Science, Computer Science (R0)
