Abstract
The proliferation of legal documents in various formats and their dispersion across multiple courts make it difficult for users to find documents that precisely match their information needs. Despite notable advances in legal information retrieval systems, research on legal recommender systems remains limited. One plausible reason for this scarcity is the absence of large, publicly accessible datasets or benchmarks. Although a few studies have emerged in this field, the literature still lacks a comprehensive analysis of the distinct attributes of legal data that influence the design of effective legal recommenders. This paper addresses this gap by first collecting a comprehensive session-based dataset from Jusbrasil, one of Brazil’s largest online legal platforms. We then analyze and discuss key characteristics of legal session-based recommendation data, including session duration, types of recommendable legal artifacts, coverage, and popularity. Finally, we introduce the first session-based recommendation benchmark tailored to the legal domain, shedding light on the performance and limitations of several well-known session-based recommendation approaches. These evaluations are based on real-world data from Jusbrasil.
Data availability
The data used in this work is available in the Zenodo repository (https://zenodo.org/record/8401278).
Code availability
Not applicable.
Notes
Available for download here: https://zenodo.org/record/8401278.
Available for download here: https://drive.google.com/drive/folders/1ritDnO_Zc6DFEU6UND9C8VCisT0ETVp5.
Acknowledgements
This research is partially supported by the Jusbrasil Postdoctoral Fellowship Program, the IPDEC Institute, the Brazilian funding agency FAPEAM-POSGRAD 2022, the Coordination for the Improvement of Higher Education Personnel-Brazil (CAPES) financial code 001, and an individual grant from CNPq to Altigran da Silva (307248/2019-4).
Author information
Contributions
All authors contributed to the study conception and design. Material preparation, data collection, experiments and analysis were performed by MAD. The manuscript was written by all authors. All authors read and approved the manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethics approval
Not applicable.
Consent to participate
Not applicable.
Consent for publication
Not applicable.
Appendix A: Results for logged and unlogged users in JusBrasilRec collection
In this appendix, we present the results for jusbrasilrec_logged_users (Tables 8, 10) and jusbrasilrec_unlogged_users (Tables 9, 11) considering top-10 recommendation lists for both the next item and rest of the session evaluation scenarios.
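The two evaluation scenarios named above can be illustrated with a minimal sketch. It assumes a session is split into a visible prefix and a hidden remainder, and it uses hit rate and precision at a cutoff of 10 purely as illustrative metrics; the function names, the sample document identifiers, and the choice of metrics are assumptions made for this example and are not taken from the tables in this appendix.

```python
from typing import Sequence


def hit_next_item(recs: Sequence[str], remainder: Sequence[str], k: int = 10) -> float:
    """Next-item scenario: 1.0 if the immediately following item is in the top-k list."""
    if not remainder:
        return 0.0
    return 1.0 if remainder[0] in recs[:k] else 0.0


def precision_rest_of_session(recs: Sequence[str], remainder: Sequence[str], k: int = 10) -> float:
    """Rest-of-session scenario: fraction of the k slots filled by any remaining item.

    Dividing by k (rather than by the list length) is an assumption of this sketch.
    """
    relevant = set(remainder)
    hits = sum(1 for item in recs[:k] if item in relevant)
    return hits / k


# Hypothetical session: the first two documents are visible, the rest are hidden.
session = ["d1", "d2", "d3", "d4"]
prefix, remainder = session[:2], session[2:]

# Hypothetical recommendation list produced from the visible prefix.
recs = ["d4", "d9", "d3"]

next_item_score = hit_next_item(recs, remainder)
rest_score = precision_rest_of_session(recs, remainder)
```

In the next-item scenario only the first hidden document counts as correct, whereas in the rest-of-session scenario every hidden document counts, so the same list can score differently under the two protocols.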
About this article
Cite this article
Domingues, M.A., de Moura, E.S., Marinho, L.B. et al. A large scale benchmark for session-based recommendations on the legal domain. Artif Intell Law (2023). https://doi.org/10.1007/s10506-023-09378-3
DOI: https://doi.org/10.1007/s10506-023-09378-3