Multi-label charge predictions leveraging label co-occurrence in imbalanced data scenario

Dong, Hongsong; Yang, Fengbao; Wang, Xiaoxia

doi:10.1007/s00500-020-05029-w

Methodologies and Application
Published: 23 May 2020

Volume 24, pages 17821–17846, (2020)

Soft Computing

Abstract

Charge prediction aims to predict the charges associated with a case from its fact description, and it plays a significant role in legal aid systems. Automatically predicting charges in the multi-label classification paradigm, which matches real applications, is a fundamental and challenging task. Existing work focuses either on multiple charges in a balanced-data scenario or on few-shot charges with a single label. Moreover, previous models use a special initialization with label patterns to improve multi-label classification performance; this is only applicable when training data is scarce, which results in poor robustness. To this end, a multi-task framework that combines a convolutional neural network with bidirectional long short-term memory and leverages label co-occurrence, called CBLLC, is introduced to predict multiple charges together with article information in an imbalanced-data scenario. We develop a new learning mechanism that trains the framework on charge and article patterns when a large amount of training data is available, which increases its robustness. In CBLLC, data preprocessing helps the training generalize and reduces overfitting. Salient word annotation is introduced to deal with few-shot charges. Training on the processed data yields better classification results and improves the generality of the model. Experimental results on the Chinese AI and Law Challenge test set show that the proposed method outperforms state-of-the-art methods. In particular, it achieves a macro-F1 score of 92.9% for charges and 86.6% for articles when using the co-occurrence of charges and the patterns of articles.
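To make two terms from the abstract concrete, the following minimal sketch counts label co-occurrence over multi-label cases and computes a macro-averaged F1 score. It is an illustration only, not the CBLLC implementation: the function names, the toy charge labels, and the per-label F1 averaging shown here are assumptions, and the exact metric definition used in the paper is not stated in the abstract.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(label_sets):
    """Count how often each unordered pair of labels appears on the same case."""
    counts = Counter()
    for labels in label_sets:
        # Sort so that each pair is stored under one canonical key.
        for a, b in combinations(sorted(set(labels)), 2):
            counts[(a, b)] += 1
    return counts


def macro_f1(true_sets, pred_sets, labels):
    """Unweighted mean of per-label F1, so rare labels weigh as much as frequent ones."""
    scores = []
    for label in labels:
        pairs = list(zip(true_sets, pred_sets))
        tp = sum(1 for t, p in pairs if label in t and label in p)
        fp = sum(1 for t, p in pairs if label not in t and label in p)
        fn = sum(1 for t, p in pairs if label in t and label not in p)
        denom = 2 * tp + fp + fn
        # A label with no true and no predicted occurrences scores 0 here (an assumption).
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: each case carries a set of charges.
true_sets = [{"theft", "robbery"}, {"theft"}, {"fraud"}, {"theft", "robbery"}]
pred_sets = [{"theft", "robbery"}, {"theft"}, {"theft"}, {"theft"}]
labels = ["theft", "robbery", "fraud"]

pair_counts = cooccurrence_counts(true_sets)
score = macro_f1(true_sets, pred_sets, labels)
print(pair_counts)
print(round(score, 4))
```

In this toy data the rare label "fraud" is never predicted, so its F1 of zero pulls the macro average down sharply even though most cases are labeled correctly. That sensitivity is why macro-F1 is a common choice for imbalanced label distributions.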

Acknowledgements

This work is supported by the National Key R&D Program of China under Grant No. 2018YFC0830800. The authors would like to thank Dr. Xiaoyang Li from the School of Electronics and Information, Northwestern Polytechnical University, for his valuable comments on the article.

Author information

Authors and Affiliations

School of Information and Communication Engineering, North University of China, Taiyuan, China
Hongsong Dong, Fengbao Yang & Xiaoxia Wang
College of Information Science and Engineering, Shanxi Agricultural University, Taigu, China
Hongsong Dong

Corresponding author

Correspondence to Fengbao Yang.

Ethics declarations

Conflict of interest

All authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Communicated by V. Loia.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Dong, H., Yang, F. & Wang, X. Multi-label charge predictions leveraging label co-occurrence in imbalanced data scenario. Soft Comput 24, 17821–17846 (2020). https://doi.org/10.1007/s00500-020-05029-w

Issue Date: December 2020
DOI: https://doi.org/10.1007/s00500-020-05029-w
