Artificial Intelligence and Law, Volume 14, Issue 4, pp 305–345

Extractive summarisation of legal texts

Article

Abstract

We describe research carried out as part of a text summarisation project for the legal domain, for which we use a new XML corpus of judgments of the UK House of Lords. These judgments represent a particularly important part of public discourse because of the role that precedents play in English law. We present experimental results using a range of features and machine learning techniques for two tasks: predicting the rhetorical status of sentences and selecting the most summary-worthy sentences from a document. Results for these components are encouraging: they achieve state-of-the-art accuracy using robust, automatically generated cue phrase information. Sample output from the system illustrates the potential of summarisation technology for legal information management systems and highlights the utility of our rhetorical annotation scheme as a model of legal discourse, one that provides a clear means of structuring summaries and tailoring them to different types of users.
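As a rough illustration of what selecting summary-worthy sentences with cue phrase information can look like, the following minimal sketch scores each sentence by the cue phrases it contains and keeps the top-scoring ones in document order. It is not the system described in the abstract: the cue phrases, their weights, the naive sentence splitter, and the function names are all assumptions made for illustration, and the hand-set weights stand in for what a trained classifier would learn.

```python
import re

# Hypothetical cue phrases and weights, chosen for illustration only.
# A trained model would learn such weights from annotated judgments.
CUE_WEIGHTS = {
    "i would dismiss": 3.0,
    "i would allow": 3.0,
    "the question is": 2.0,
    "for these reasons": 2.0,
}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def score(sentence):
    """Sum the weights of all cue phrases found in the sentence."""
    lowered = sentence.lower()
    return sum(w for cue, w in CUE_WEIGHTS.items() if cue in lowered)


def extract_summary(text, k=2):
    """Return the k highest-scoring sentences, restored to document order."""
    sentences = split_sentences(text)
    # Rank by descending score; ties are broken by position in the document.
    ranked = sorted(range(len(sentences)), key=lambda i: (-score(sentences[i]), i))
    chosen = sorted(ranked[:k])
    return [sentences[i] for i in chosen]


# Invented example text, not taken from any real judgment.
judgment = (
    "My Lords, this appeal concerns a contract of sale. "
    "The question is whether the seller was in breach. "
    "The facts are not in dispute. "
    "For these reasons I would dismiss the appeal."
)
summary = extract_summary(judgment, k=2)
print(summary)
```

Returning the selected sentences in their original order keeps the extract readable; a rhetorical-status label per sentence could then be used to group or filter them for different users.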

Keywords

automatic text summarisation, legal discourse, natural language processing, machine learning, XML, knowledge management

Copyright information

© Springer Science+Business Media B.V. 2007

Authors and Affiliations

  1. School of Informatics, University of Edinburgh, Edinburgh, UK
