Abstract
Text classification is a branch of Artificial Intelligence that deals with assigning textual documents to a controlled set of classes. The aim of this paper is to assess the use of a controlled vocabulary in the categorisation of legal texts. Controlled vocabularies such as the Medical Subject Headings, Compendex and AGROVOC have proved very useful in the fields of biomedical research, engineering and agriculture, respectively. In this work, a number of lexicons are created for some pre-defined areas of law through an automated approach. The lexicons are then used to categorise cases from the Supreme Court into eight distinct areas of law, and their performance is compared. We found that lexicons which contain a mixture of single words and short phrases perform slightly better than those consisting only of single words. Weights were also assigned to the terms, and this had a significant positive impact on the classification accuracy. The number of words in each thesaurus was kept constant. A hierarchical classification was also attempted, whereby cases were first classified as either civil or criminal. Civil cases were then further classified into company, labour, contract and land cases, while criminal cases were classified into drugs, homicide, road traffic offences and other criminal offences. Our best model achieves a global accuracy of 78.9%. Thus, we have demonstrated that it is possible to obtain good classification accuracy for legal cases through the use of automatically generated thesauri. The outcome of this research can become an integral part of the eJudiciary project that has already been initiated by the government. In line with the vision of the Judiciary, we are in the process of creating an intelligent legal information system which will benefit all legal actors and will have a definite positive impact on the legal landscape of the Republic of Mauritius.
Lawyers, attorneys and their assistants would spend less time on legal research and hence have more time to prepare the arguments for their cases. We are optimistic that this will make the delivery of justice more effective and more efficient by reducing both the postponement of cases and the average disposal time of cases.
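The approach summarised in the abstract (weighted lexicons of single words and short phrases, applied in two hierarchical stages) can be illustrated with a minimal sketch. This is not the authors' implementation: the lexicon entries, the weights, the function names `score` and `classify`, and the rule of picking the branch with the highest summed score are all assumptions made for illustration only.

```python
import re

# Hypothetical toy lexicons (term -> weight). The paper generates its lexicons
# automatically; every entry and weight below is invented for illustration.
LEXICONS = {
    "civil": {
        "company": {"shareholder": 2.0, "winding up": 3.0, "director": 1.5},
        "labour": {"employee": 2.0, "unfair dismissal": 3.0, "wages": 1.5},
        "contract": {"breach of contract": 3.0, "damages": 1.0},
        "land": {"lease": 2.0, "right of way": 3.0, "boundary": 1.5},
    },
    "criminal": {
        "drugs": {"heroin": 3.0, "drug dealing": 3.0, "possession": 1.0},
        "homicide": {"murder": 3.0, "manslaughter": 3.0, "victim": 1.0},
        "road traffic": {"dangerous driving": 3.0, "vehicle": 1.5},
        "other criminal": {"larceny": 3.0, "assault": 2.0, "accused": 0.5},
    },
}


def score(text, lexicon):
    """Sum of weight * occurrence count over all lexicon terms found in text."""
    # Normalise to lowercase words separated by single spaces so that
    # multi-word phrases can be matched across punctuation and line breaks.
    normalised = " ".join(re.findall(r"[a-z]+", text.lower()))
    total = 0.0
    for term, weight in lexicon.items():
        hits = re.findall(rf"\b{re.escape(term)}\b", normalised)
        total += weight * len(hits)
    return total


def classify(text):
    """Two-stage classification: civil vs criminal, then the area of law."""
    # Stage 1 (assumed rule): the branch whose lexicons score highest overall.
    branch_scores = {
        branch: sum(score(text, lex) for lex in areas.values())
        for branch, areas in LEXICONS.items()
    }
    branch = max(branch_scores, key=branch_scores.get)
    # Stage 2: the best-scoring area within the chosen branch.
    area_scores = {area: score(text, lex) for area, lex in LEXICONS[branch].items()}
    area = max(area_scores, key=area_scores.get)
    return branch, area


print(classify("The employee claimed unfair dismissal and unpaid wages."))
```

In this sketch, ties (including a text that matches no term at all) fall back to whichever class is listed first, so a real system would need an explicit rule for such cases.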
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Pudaruth, S., Soydaudah, K.M.S., Gunputh, R.P. (2016). Categorisation of Supreme Court Cases Using Multiple Horizontal Thesauri. In: Berretti, S., Thampi, S., Dasgupta, S. (eds) Intelligent Systems Technologies and Applications. Advances in Intelligent Systems and Computing, vol 385. Springer, Cham. https://doi.org/10.1007/978-3-319-23258-4_31
DOI: https://doi.org/10.1007/978-3-319-23258-4_31
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23257-7
Online ISBN: 978-3-319-23258-4
eBook Packages: Engineering, Engineering (R0)