Artificial Intelligence and Law, Volume 27, Issue 2, pp 117–139

CLAUDETTE: an automated detector of potentially unfair clauses in online terms of service

  • Marco Lippi (corresponding author)
  • Przemysław Pałka
  • Giuseppe Contissa
  • Francesca Lagioia
  • Hans-Wolfgang Micklitz
  • Giovanni Sartor
  • Paolo Torroni


Terms of service of online platforms too often contain clauses that are potentially unfair to the consumer. We present an experimental study in which machine learning is employed to automatically detect such potentially unfair clauses. Results show that the proposed system could provide a valuable tool for lawyers and consumers alike.


Keywords: Machine learning; Terms of service; Potentially unfair clauses; Natural language processing



Funding was obtained from the European University Institute by author Hans-Wolfgang Micklitz (CLAUDETTE Project).



Copyright information

© Springer Nature B.V. 2019

Authors and Affiliations

  1. DISMI – University of Modena and Reggio Emilia, Reggio Emilia, Italy
  2. Yale Law School Center for Private Law, Information Society Project, New Haven, USA
  3. CIRSFID, University of Bologna, Bologna, Italy
  4. Law Department, European University Institute, San Domenico di Fiesole, Florence, Italy
  5. DISI – University of Bologna, Bologna, Italy
