Abstract meaning representation for legal documents: an empirical research on a human-annotated dataset

  • Original Research
Artificial Intelligence and Law

Abstract

Natural language processing techniques increasingly contribute to the analysis of legal documents, which supports the implementation of laws and rules on computers. Previous approaches to representing a legal sentence are often based on logical patterns that illustrate the relations between concepts in the sentence, where each concept often consists of multiple words. Such representations lack semantic information at the word level. In our work, we aim to tackle this shortcoming by representing legal texts in the form of abstract meaning representation (AMR), a graph-based semantic representation that has recently gained much popularity in the NLP community. We present our study of AMR parsing (producing AMR from natural language) and AMR-to-text generation (producing natural language from AMR) specifically for the legal domain. We also introduce JCivilCode, a human-annotated legal AMR dataset that was created and verified by a group of linguistic and legal experts. We conduct an empirical evaluation of various approaches to parsing and generating AMR on our own dataset and show the current challenges. Based on our observations, we propose a domain adaptation method applied in the training phase and decoding phase of a neural AMR-to-text generation model. Our method improves the quality of text generated from AMR graphs compared to the baseline model. (This work extends our two previous papers: “An Empirical Evaluation of AMR Parsing for Legal Documents”, published in the Twelfth International Workshop on Juris-informatics (JURISIN) 2018; and “Legal Text Generation from Abstract Meaning Representation”, published in the 32nd International Conference on Legal Knowledge and Information Systems (JURIX) 2019.)

Notes

  1. We keep the original trained models without retraining them on the new dataset LDC2017T10.

Acknowledgments

This work was supported by JST CREST Grant Number JPMJCR1513 Japan, JSPS Kakenhi Grant Number 20H04295, 20K2046, and 20K20625. The research was also supported in part by the Asian Office of Aerospace R&D (AOARD), Air Force Office of Scientific Research (Grant no. FA2386-19-1-4041).

Author information

Corresponding authors

Correspondence to Sinh Trong Vu or Minh Le Nguyen.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Vu, S.T., Le Nguyen, M. & Satoh, K. Abstract meaning representation for legal documents: an empirical research on a human-annotated dataset. Artif Intell Law 30, 221–243 (2022). https://doi.org/10.1007/s10506-021-09292-6

  • DOI: https://doi.org/10.1007/s10506-021-09292-6
