Multi-dimension Diversification in Legal Information Retrieval

  • Marios Koniaris
  • Ioannis Anagnostopoulos
  • Yannis Vassiliou
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10041)


The number of freely available legal data sets is growing rapidly. Citizens can easily access a wealth of information about regulations, court orders, statutes, opinions and analytical documents. Such openness brings undeniable benefits in terms of transparency, participation and the availability of new services. However, legal information overload poses new challenges, especially in the field of Legal Information Retrieval. Search result diversification has gained attention as a way to increase user satisfaction in web search. We hypothesize that such a strategy will also benefit search on legal data sets. We address the diversification of results in legal search by introducing legal-domain-specific diversification criteria and adopting several state-of-the-art methods from the web search, network analysis and text summarization domains. We evaluate our diversification framework on a real data set from the Common Law domain that we subjectively annotated with relevance judgments for this purpose. Our findings reveal that web search diversification techniques outperform other approaches (e.g. summarization-based and graph-based methods) in the context of legal diversification, and that the diversity criteria we introduce yield distinctively diverse subsets of result documents, thus differentiating our proposal from traditional diversification techniques.
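To make the idea of search result diversification concrete, the following is a minimal sketch of Maximal Marginal Relevance (MMR), a well-known greedy re-ranking method from the web search literature. The abstract does not say which methods the framework uses, so choosing MMR here is an assumption made purely for illustration. The function name `mmr_rerank`, the Jaccard similarity over topic labels, and the toy relevance scores and topics are all hypothetical and are not taken from the paper's data set or criteria.

```python
from typing import Callable, List, Sequence, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity between two label sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mmr_rerank(
    relevance: Sequence[float],
    similarity: Callable[[int, int], float],
    k: int,
    lam: float = 0.5,
) -> List[int]:
    """Greedy MMR: pick up to k document indices, one at a time.

    Each step selects the candidate maximising
        lam * relevance - (1 - lam) * (max similarity to already selected).
    lam = 1.0 reduces to a plain relevance ranking.
    """
    candidates = set(range(len(relevance)))
    selected: List[int] = []

    def score(i: int) -> float:
        # Redundancy is the similarity to the closest already selected document.
        max_sim = max((similarity(i, j) for j in selected), default=0.0)
        return lam * relevance[i] - (1.0 - lam) * max_sim

    while candidates and len(selected) < k:
        # Iterating in sorted order makes ties resolve to the lowest index.
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Hypothetical toy data: four judgments with made-up topic labels and scores.
topics = [
    {"negligence", "damages"},
    {"negligence", "damages"},   # near-duplicate of document 0
    {"contract", "breach"},
    {"negligence", "duty"},
]
relevance = [0.9, 0.85, 0.6, 0.5]


def topic_similarity(i: int, j: int) -> float:
    return jaccard(topics[i], topics[j])


diversified = mmr_rerank(relevance, topic_similarity, k=3, lam=0.5)
relevance_only = mmr_rerank(relevance, topic_similarity, k=3, lam=1.0)
print(diversified, relevance_only)
```

With `lam=0.5` the near-duplicate document 1 is skipped in favour of documents covering other topics, whereas `lam=1.0` simply returns the three most relevant documents. In a legal setting the similarity function could, as an assumption, be swapped for a domain-specific criterion without changing the greedy loop.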


Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  • Marios Koniaris (1)
  • Ioannis Anagnostopoulos (2)
  • Yannis Vassiliou (1)
  1. KDBS Lab, School of ECE, National Technical University of Athens, Athens, Greece
  2. Department of Computer Science and Biomedical Informatics, University of Thessaly, Lamia, Greece
