Categorisation of Supreme Court Cases Using Multiple Horizontal Thesauri

  • Sameerchand Pudaruth
  • K. M. Sunjiv Soydaudah
  • Rajendra Parsad Gunputh
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 385)


Text classification is a branch of Artificial Intelligence which deals with the assignment of textual documents to a controlled set of classes. The aim of this paper is to assess the use of a controlled vocabulary in the categorisation of legal texts. Controlled vocabularies such as the Medical Subject Headings, Compendex and AGROVOC have proved to be very useful in the fields of biomedical research, engineering and agriculture, respectively. In this work, a number of lexicons are created for some pre-defined areas of law through an automated approach. The lexicons are then used to categorise cases from the Supreme Court into eight distinct areas of law. We then compared the performance of these lexicons with each other. We found that lexicons which contain a mixture of single words and short phrases perform slightly better than those consisting only of single words. Weights were also assigned to the terms, and this had a significant positive impact on the classification accuracy. The number of words in each thesaurus was kept constant. A hierarchical classification was also attempted, whereby cases were first classified as either civil or criminal. Civil cases were then further classified into company, labour, contract and land cases, while criminal cases were classified into drugs, homicide, road traffic offences and other criminal offences. Our best model achieves a global accuracy of 78.9%. Thus, we have demonstrated that it is possible to obtain good classification accuracy for legal cases through the use of automatically generated thesauri. The outcome of this research can become an integral part of the eJudiciary project that has already been initiated by the government. In line with the vision of the Judiciary, we are in the process of creating an intelligent legal information system which will benefit all legal actors and will have a definite positive impact on the legal landscape of the Republic of Mauritius.
Lawyers, attorneys and their assistants would spend less time on legal research and hence would have more time to prepare the arguments for their cases. We are optimistic that this will make the delivery of justice more effective and more efficient by reducing both the postponement of cases and the average disposal time of cases.
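
The lexicon-based categorisation summarised in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the terms, the weights, the scoring rule (weighted term counts) and the function names are all hypothetical, chosen only to show how weighted lexicons of single words and short phrases could drive both a flat eight-way classification and a two-stage civil/criminal hierarchy. The first stage here simply pools the lexicons of each branch, which is an assumption rather than something the abstract specifies.

```python
import re

# Hypothetical toy lexicons (term -> weight). The real thesauri were generated
# automatically from Supreme Court cases; these entries are illustrative only.
LEXICONS = {
    "civil": {
        "company": {"shareholder": 2.0, "winding up": 3.0, "director": 1.0},
        "labour": {"employee": 2.0, "unjustified dismissal": 3.0, "wages": 1.0},
        "contract": {"breach of contract": 3.0, "damages": 1.0, "agreement": 1.0},
        "land": {"boundary": 2.0, "right of way": 3.0, "lease": 1.0},
    },
    "criminal": {
        "drugs": {"heroin": 3.0, "drug dealing": 3.0, "cannabis": 2.0},
        "homicide": {"murder": 3.0, "manslaughter": 3.0, "deceased": 1.0},
        "road traffic": {"driving licence": 3.0, "speeding": 2.0, "vehicle": 1.0},
        "other criminal": {"larceny": 3.0, "assault": 2.0, "forgery": 2.0},
    },
}


def count_term(text, term):
    """Count whole-word occurrences of a single word or short phrase."""
    pattern = r"\b" + re.escape(term) + r"\b"
    return len(re.findall(pattern, text))


def score(text, lexicon):
    """Weighted sum of term occurrences (an assumed scoring rule)."""
    text = text.lower()
    return sum(weight * count_term(text, term) for term, weight in lexicon.items())


def classify_flat(text):
    """Pick the best-scoring area among all eight areas of law."""
    scores = {
        area: score(text, lexicon)
        for areas in LEXICONS.values()
        for area, lexicon in areas.items()
    }
    # Ties (including an all-zero score) resolve to the first area listed.
    return max(scores, key=scores.get)


def classify_hierarchical(text):
    """Stage 1: civil vs criminal. Stage 2: area of law within that branch."""
    branch_scores = {
        branch: sum(score(text, lexicon) for lexicon in areas.values())
        for branch, areas in LEXICONS.items()
    }
    branch = max(branch_scores, key=branch_scores.get)
    area_scores = {area: score(text, lex) for area, lex in LEXICONS[branch].items()}
    return branch, max(area_scores, key=area_scores.get)


def accuracy(predicted, true):
    """Global accuracy: fraction of cases assigned to the correct class."""
    return sum(p == t for p, t in zip(predicted, true)) / len(true)


case = "The accused was charged with murder but pleaded guilty to manslaughter."
print(classify_flat(case))
print(classify_hierarchical(case))
```

In this sketch a phrase such as "unjustified dismissal" is matched as a unit, which is one plausible reason a lexicon mixing single words and short phrases could discriminate better than single words alone.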


Supreme Court Cases Categorisation Horizontal Thesaurus 





Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Sameerchand Pudaruth (1)
  • K. M. Sunjiv Soydaudah (2)
  • Rajendra Parsad Gunputh (3)
  1. Department of Ocean Engineering & ICT, Faculty of Ocean Studies, University of Mauritius, Moka, Mauritius
  2. Electrical and Electronic Engineering Department, Faculty of Engineering, University of Mauritius, Moka, Mauritius
  3. Department of Law, Faculty of Law and Management, University of Mauritius, Moka, Mauritius
