Recurrent neural network-based models for recognizing requisite and effectuation parts in legal texts

  • Truong-Son Nguyen
  • Le-Minh Nguyen
  • Satoshi Tojo
  • Ken Satoh
  • Akira Shimazu

Abstract

This paper proposes several recurrent neural network-based models for recognizing requisite and effectuation (RE) parts in legal texts. First, we propose a modification of the BiLSTM-CRF model that allows the use of external features to improve the performance of deep learning models when large annotated corpora are not available. However, this model can only recognize RE parts that do not overlap. Second, we propose two approaches for recognizing overlapping RE parts: a cascading approach, which uses a sequence of BiLSTM-CRF models, and a unified-model approach, with a multilayer BiLSTM-CRF model and a multilayer BiLSTM-MLP-CRF model. Experimental results on two Japanese legal RRE datasets demonstrate the advantages of our proposed models. On the Japanese National Pension Law dataset, our approaches obtained an \(F_{1}\) score of 93.27%, a significant improvement over previous approaches. On the Japanese Civil Code RRE dataset, which is written in English, our approaches produced an \(F_{1}\) score of 78.24% in recognizing RE parts, a significant improvement over strong baselines. In addition, using external features and in-domain pre-trained word embeddings also improved the performance of RRE systems.
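A minimal sketch of the core component shared by the proposed models: a BiLSTM-CRF tagger whose word embeddings are concatenated with external feature vectors (e.g. one-hot POS or chunk features) before the recurrent layer. This is an illustration, not the authors' implementation; it assumes PyTorch plus the third-party `pytorch-crf` package, and all names and dimensions are hypothetical.

```python
import torch
import torch.nn as nn
from torchcrf import CRF  # pip install pytorch-crf (an assumed dependency)

class BiLSTMCRFTagger(nn.Module):
    """BiLSTM-CRF with external features concatenated to word embeddings."""

    def __init__(self, vocab_size, embed_dim, feat_dim, hidden_dim, num_tags):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # External features enter as extra input dimensions per token.
        self.lstm = nn.LSTM(embed_dim + feat_dim, hidden_dim,
                            batch_first=True, bidirectional=True)
        self.emit = nn.Linear(2 * hidden_dim, num_tags)
        self.crf = CRF(num_tags, batch_first=True)

    def emissions(self, words, feats):
        # words: (batch, seq) token ids; feats: (batch, seq, feat_dim)
        x = torch.cat([self.embed(words), feats], dim=-1)
        h, _ = self.lstm(x)
        return self.emit(h)  # per-token tag scores fed to the CRF

    def loss(self, words, feats, tags, mask):
        # Negative log-likelihood of the gold tag sequence under the CRF.
        return -self.crf(self.emissions(words, feats), tags, mask=mask)

    def decode(self, words, feats, mask):
        # Viterbi decoding of the best tag sequence per sentence.
        return self.crf.decode(self.emissions(words, feats), mask=mask)
```

Under this reading, the cascading approach would chain such taggers, feeding each layer's predicted labels to the next as additional external features, while the unified multilayer variants would handle overlapping RE parts by attaching one output layer (CRF or MLP-CRF) per label layer.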

Keywords

Deep learning · Recurrent neural networks · Long short-term memory · Conditional random fields · Legal text analysis · Recognizing requisite and effectuation parts · Sequence labeling

Notes

Acknowledgements

This work was supported by JSPS KAKENHI Grant Number 15K16048, JSPS KAKENHI Grant Number JP15K12094, and JST CREST Grant Number JPMJCR1513, Japan.


Copyright information

© Springer Science+Business Media B.V., part of Springer Nature 2018

Authors and Affiliations

  • Truong-Son Nguyen (1, 2)
  • Le-Minh Nguyen (1)
  • Satoshi Tojo (1)
  • Ken Satoh (3)
  • Akira Shimazu (1)

  1. Japan Advanced Institute of Science and Technology, Ishikawa, Japan
  2. University of Science, VNU-HCM, Ho Chi Minh City, Vietnam
  3. National Institute of Informatics, Tokyo, Japan
