Analysis of COLIEE Information Retrieval Task Data

  • Conference paper
New Frontiers in Artificial Intelligence (JSAI-isAI 2017)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10838)

Abstract

The Competition on Legal Information Extraction/Entailment (COLIEE) involves a legal question answering task. One of its subtasks is an information retrieval task for finding the articles relevant to each question. In this paper, we compare the characteristics of the test data provided in two different languages (English and Japanese) and analyze topic difficulty based on the submission data, using the retrieval results of Indri, a state-of-the-art information retrieval system. We also discuss issues relating to the design of new COLIEE information retrieval tasks.

Notes

  1. Each COLIEE competition uses one year of bar exam data for the IR task (e.g., H22 for COLIEE 2014) and another year of data for the entailment task (e.g., H23 (2011) for COLIEE 2014).

  2. Because participants did not have information about the relevant articles, they could submit multiple runs with different settings as candidates for evaluation.

  3. https://www.lemurproject.org/indri/.

  4. Phrase-based queries can take into account the word ordering in the query.

  5. http://www.lemurproject.org/stopwords/stoplist.dft.

  6. Capital letters such as “A”, “B”, and “C” are used to anonymize the original names in Japanese judicial precedents and are also used in the exam.

  7. Some of the Japanese questions include an explanation of their question type, and this part is not translated in the English version. For example, a Japanese question has “ ” (There are five descriptions about the legal decision (a) to (o). Please select a combination of correct description(s) from 1 to 5 below.), but the English version has no corresponding description.

Acknowledgment

We would like to thank the organizers of COLIEE for providing the submitted run data for this analysis, and all participants of COLIEE for providing valuable information. This work was partially supported by JSPS KAKENHI Grant Number 16H01756.

Author information

Corresponding author

Correspondence to Masaharu Yoshioka.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Yoshioka, M. (2018). Analysis of COLIEE Information Retrieval Task Data. In: Arai, S., Kojima, K., Mineshima, K., Bekki, D., Satoh, K., Ohta, Y. (eds) New Frontiers in Artificial Intelligence. JSAI-isAI 2017. Lecture Notes in Computer Science, vol 10838. Springer, Cham. https://doi.org/10.1007/978-3-319-93794-6_1

  • DOI: https://doi.org/10.1007/978-3-319-93794-6_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-93793-9

  • Online ISBN: 978-3-319-93794-6

  • eBook Packages: Computer Science, Computer Science (R0)
