Abstract
The Competition on Legal Information Extraction/Entailment (COLIEE) is built around a legal question answering task. One of its subtasks is an information retrieval task: finding the articles relevant to a given question. In this paper, we compare the characteristics of the test data provided in two different languages (English and Japanese) and analyze topic difficulty based on the submission data, using the retrieval results of Indri, a state-of-the-art information retrieval system. We also discuss issues relating to the design of new COLIEE information retrieval tasks.
Notes
- 1.
Each COLIEE competition uses one year of bar exam data for the IR task (e.g., H22 for COLIEE 2014) and another year for the entailment task (e.g., H23 (2011) for COLIEE 2014).
- 2.
Since participants did not have information about the relevant articles, they could submit multiple runs with different settings as candidates for evaluation.
- 3.
- 4.
Phrase-based queries can take into account the word ordering in the query.
- 5.
- 6.
Capital letters such as “A”, “B”, and “C” are used to anonymize the original names in Japanese judicial precedents and are also used in the exam.
- 7.
Some Japanese questions include a description explaining their question type, and this part is not translated in the English version. For example, a Japanese question has “ ” (There are five descriptions about the legal decision (a) to (o). Please select a combination of correct description(s) from 1 to 5 below.), but the English version has no corresponding description.
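As an illustration of note 4, the following minimal sketch builds two query strings in the Indri query language: a bag-of-words query with `#combine`, and a phrase-based query that wraps the same terms in the ordered-window operator `#1`, which requires the terms to appear adjacent and in the given order. The helper names and the example terms are hypothetical and chosen only for illustration; this is not necessarily how the queries in the paper were constructed.

```python
def bag_of_words_query(terms):
    """Build an Indri #combine query; term order does not affect matching."""
    return "#combine( " + " ".join(terms) + " )"


def phrase_query(terms, window=1):
    """Build an Indri ordered-window query.

    #N( t1 t2 ... ) matches only when the terms occur in this order,
    each within N positions of the previous one (N=1 is an exact phrase).
    """
    return "#combine( #" + str(window) + "( " + " ".join(terms) + " ) )"


# Hypothetical example terms (assumption, not taken from the test data).
terms = ["juridical", "act"]
print(bag_of_words_query(terms))
print(phrase_query(terms))
```

Reversing the term list yields a different phrase query, whereas the two bag-of-words queries contain the same set of terms, so their order does not change what is matched.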
Acknowledgment
We would like to thank the organizers of COLIEE for providing the submitted run data for this analysis, and all participants of COLIEE for providing valuable information. This work was partially supported by JSPS KAKENHI Grant Number 16H01756.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Yoshioka, M. (2018). Analysis of COLIEE Information Retrieval Task Data. In: Arai, S., Kojima, K., Mineshima, K., Bekki, D., Satoh, K., Ohta, Y. (eds) New Frontiers in Artificial Intelligence. JSAI-isAI 2017. Lecture Notes in Computer Science, vol 10838. Springer, Cham. https://doi.org/10.1007/978-3-319-93794-6_1
DOI: https://doi.org/10.1007/978-3-319-93794-6_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-93793-9
Online ISBN: 978-3-319-93794-6
eBook Packages: Computer Science, Computer Science (R0)