Analysis of COLIEE Information Retrieval Task Data

Yoshioka, Masaharu

doi:10.1007/978-3-319-93794-6_1

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10838)

Included in the following conference series:

JSAI International Symposium on Artificial Intelligence

Abstract

The Competition on Legal Information Extraction/Entailment (COLIEE) involves a legal question answering task. One of its subtasks is an information retrieval task for finding articles relevant to questions. In this paper, we compare the characteristics of the test data provided in two different languages (English and Japanese) and analyze topic difficulty based on the submission data, using the retrieval results of Indri, a state-of-the-art information retrieval system. We also discuss issues relating to the design of new COLIEE information retrieval tasks.

Notes

1. Each COLIEE competition uses one year of bar exam data for the IR task (e.g., H22 for COLIEE 2014) and another year for the entailment task (e.g., H23 (2011) for COLIEE 2014).
2. Because participants did not have information about the relevant articles, they could submit multiple runs with different settings as candidates for evaluation.
3. https://www.lemurproject.org/indri/.
4. Phrase-based queries can take into account the word ordering in the query.
5. http://www.lemurproject.org/stopwords/stoplist.dft.
6. Capital letters such as “A”, “B”, and “C” are used to anonymize the original names in Japanese judicial precedents and are also used in the exam.
7. Some Japanese questions include an explanation of their question type, and this part is not translated in the English version. For example, a Japanese question has “ ” (There are five descriptions about the legal decision (a) to (o). Please select a combination of correct description(s) from 1 to 5 below.), but the English version has no corresponding description.

Acknowledgment

We would like to thank the organizers of COLIEE for providing the submitted run data for this analysis, and all participants of COLIEE for providing valuable information. This work was partially supported by JSPS KAKENHI Grant Number 16H01756.

Author information

Authors and Affiliations

Graduate School of Information Science and Technology, Hokkaido University, N-14 W-9, Kita-ku, Sapporo, 060-0814, Japan
Masaharu Yoshioka

Corresponding author

Correspondence to Masaharu Yoshioka.

Editor information

Editors and Affiliations

Chiba University, Chiba, Japan
Sachiyo Arai
National Institute of Advanced Industrial Science and Technology, Ibaraki, Japan
Kazuhiro Kojima
Ochanomizu University, Tokyo, Japan
Koji Mineshima
Ochanomizu University, Tokyo, Japan
Daisuke Bekki
National Institute of Informatics, Tokyo, Japan
Ken Satoh
Fujitsu Laboratories Limited, Kanagawa, Japan
Yuiko Ohta

About this paper

Cite this paper

Yoshioka, M. (2018). Analysis of COLIEE Information Retrieval Task Data. In: Arai, S., Kojima, K., Mineshima, K., Bekki, D., Satoh, K., Ohta, Y. (eds) New Frontiers in Artificial Intelligence. JSAI-isAI 2017. Lecture Notes in Computer Science, vol 10838. Springer, Cham. https://doi.org/10.1007/978-3-319-93794-6_1

DOI: https://doi.org/10.1007/978-3-319-93794-6_1
Published: 30 June 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-93793-9
Online ISBN: 978-3-319-93794-6
eBook Packages: Computer Science, Computer Science (R0)
