GaiusT 2.0: Evolution of a Framework for Annotating Legal Documents

  • Nicola Zeni
  • Luisa Mich
  • John Mylopoulos
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 672)


Semantic annotation technologies support the extraction of legal concepts, such as rights and obligations, from legal documents. For software engineers, the final goal is to identify the compliance requirements that a software system must fulfill to comply with a law or regulation. This requires analyzing and annotating legal documents written in prescriptive natural language, which remains an open research problem. In this paper we describe GaiusT 2.0, a system for extracting requirements from legal documents. GaiusT 2.0 is the result of the evolution of GaiusT, and has been designed and implemented as a web-based system intended to semi-automate the extraction process. Results of applying GaiusT 2.0 show that the new version improves the performance of the extraction process and also makes the tool more usable.


Semantic annotation, Legal documents, Legal requirements, Natural language processing, Linguistic resources, User needs



Nicola Zeni’s work was supported in part by the ERC advanced grant 267856 ‘Lucretius: Foundations for Software Evolution’. Our thanks go to the experts and users who made themselves available to test GaiusT 2.0.



Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  1. Department of Information Engineering and Computer Science, University of Trento, Trento, Italy
  2. Department of Industrial Engineering, University of Trento, Trento, Italy
  3. School of Electrical Engineering and Computer Science, University of Ottawa, Ottawa, Canada
