Artificial Intelligence and Law, Volume 17, Issue 2, pp 125–165

Automatically classifying case texts and predicting outcomes

  • Kevin D. Ashley
  • Stefanie Brüninghaus


Work on a computer program called SMILE + IBP (SMart Index Learner Plus Issue-Based Prediction) bridges case-based reasoning and information extraction from texts. The program addresses a technologically challenging task that is also highly relevant from a legal viewpoint: extracting information from textual descriptions of the facts of decided cases and applying that information to predict the outcomes of new cases. The program attempts to automatically classify textual descriptions of the facts of legal problems in terms of Factors, a set of classification concepts that capture stereotypical fact patterns that affect the strength of a legal claim, here trade secret misappropriation. Using these classifications, the program can evaluate and explain predictions about a problem’s outcome given a database of previously classified cases. This paper provides an extended example illustrating both functions, prediction by IBP and text classification by SMILE, and reports empirical evaluations of each. While IBP’s results are quite strong and SMILE’s are much weaker, SMILE + IBP still has some success predicting and explaining the outcomes of case scenarios input as texts. To our knowledge, it marks the first time that a program can reason automatically about legal case texts.
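The two-stage flow described above (classify a fact description into Factors, then predict an outcome from previously classified cases) can be illustrated with a deliberately simplified sketch. It is not the SMILE or IBP algorithm: the Factor names, cue phrases, case base, and both functions are hypothetical stand-ins invented for illustration, with keyword matching in place of a learned text classifier and a majority vote over the most similar cases in place of issue-based prediction.

```python
from collections import Counter

# Hypothetical Factor names and cue phrases, invented for illustration only.
FACTOR_CUES = {
    "security_measures": ["locked", "nondisclosure agreement"],
    "disclosed_to_outsiders": ["disclosed to", "publicly available"],
    "reverse_engineerable": ["reverse engineer"],
}

# Hypothetical case base: (Factors present, outcome), where "P" means the
# plaintiff won and "D" means the defendant won.
CASE_BASE = [
    ({"security_measures"}, "P"),
    ({"security_measures", "reverse_engineerable"}, "P"),
    ({"disclosed_to_outsiders"}, "D"),
    ({"disclosed_to_outsiders", "reverse_engineerable"}, "D"),
]


def classify_text(text):
    """Stand-in for text classification: Factors whose cue phrases occur in the text."""
    lowered = text.lower()
    return {
        factor
        for factor, cues in FACTOR_CUES.items()
        if any(cue in lowered for cue in cues)
    }


def predict_outcome(factors, case_base):
    """Stand-in for prediction: majority outcome among the cases that share
    the most Factors with the problem. Returns None (abstains) when no case
    shares a Factor or when the vote is tied."""
    scored = [(len(factors & case_factors), outcome)
              for case_factors, outcome in case_base]
    best = max((score for score, _ in scored), default=0)
    if best == 0:
        return None
    votes = Counter(outcome for score, outcome in scored if score == best)
    ranked = votes.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


problem = ("The plaintiff kept the formula in a locked safe and required "
           "a nondisclosure agreement.")
factors = classify_text(problem)
prediction = predict_outcome(factors, CASE_BASE)
print(sorted(factors), prediction)
```

Keeping the stages separate mirrors the division of labor in the abstract: errors made when assigning Factors to a text carry over into the prediction step, which is consistent with the combined system performing below prediction on manually classified cases.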


  • Predicting case outcomes
  • Classifying case texts
  • Case-based reasoning



The research described here has been supported by Grant No. IDM-9987869 from the National Science Foundation.



Copyright information

© Springer Science+Business Media B.V. 2009

Authors and Affiliations

  1. Learning Research and Development Center, University of Pittsburgh, Pittsburgh, USA
  2. Graduate Program in Intelligent Systems, University of Pittsburgh, Pittsburgh, USA
