Legal Question Answering Using Ranking SVM and Syntactic/Semantic Similarity

  • Mi-Young Kim
  • Ying Xu
  • Randy Goebel
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9067)


We describe a legal question answering system that combines legal information retrieval and textual entailment. We evaluated the system on the data from the first Competition on Legal Information Extraction/Entailment (COLIEE) 2014. The competition focuses on two aspects of legal information processing related to answering yes/no questions from Japanese legal bar exams, and its shared task consists of two phases: legal ad hoc information retrieval and textual entailment. The first phase requires identifying the Japanese civil law articles relevant to a legal bar exam query. For this task we implemented two unsupervised baseline models, tf-idf and Latent Dirichlet Allocation (LDA)-based Information Retrieval (IR), and one supervised model, Ranking SVM. The features of the Ranking SVM model are a set of words and the scores assigned to an article by the corresponding baseline models. The results show that the Ranking SVM model nearly doubles the Mean Average Precision compared with both baseline models. The second phase requires answering “Yes” or “No” to previously unseen queries by comparing the meanings of the queries with those of the relevant articles. The features used for phase two are syntactic/semantic similarities and the identification of negation/antonym relations. The results show that our method, which combines a rule-based model with an unsupervised model, outperforms the SVM-based supervised model.
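As a rough illustration of the first phase, the sketch below ranks tokenized articles against a query by tf-idf cosine similarity and scores the ranking with Mean Average Precision. It is a minimal sketch under stated assumptions, not the system evaluated in the paper: the function names, the smoothed idf formula, and the toy documents are all illustrative choices, and the LDA and Ranking SVM components are not shown.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return tf-idf vectors (dicts) for tokenized docs, plus the idf table."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # Assumed idf variant: log(N / df) + 1, so shared terms keep some weight.
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}
    vecs = [{t: c * idf[t] for t, c in Counter(doc).items()} for doc in docs]
    return vecs, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def rank(query, docs):
    """Return document indices ordered from most to least similar to query."""
    vecs, idf = tfidf_vectors(docs)
    q = {t: c * idf.get(t, 0.0) for t, c in Counter(query).items()}
    scores = [cosine(q, v) for v in vecs]
    return sorted(range(len(docs)), key=lambda i: -scores[i])


def average_precision(ranking, relevant):
    """Average of precision@k over the ranks k where a relevant doc appears."""
    hits, total = 0, 0.0
    for k, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / k
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(rankings, relevants):
    """Mean of average precision over all queries."""
    pairs = list(zip(rankings, relevants))
    return sum(average_precision(r, rel) for r, rel in pairs) / len(pairs)


# Toy data (hypothetical): three "articles" and one query.
articles = [
    ["contract", "sale", "price"],
    ["tort", "damage", "negligence"],
    ["contract", "agency", "principal"],
]
ranking = rank(["agency", "contract"], articles)
map_score = mean_average_precision([ranking], [{2}])
```

In a supervised setting such as Ranking SVM, scores like the cosine value computed here could serve as input features alongside word features, which is the role the abstract assigns to the baseline scores.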


Legal text mining; Question answering; Recognizing textual entailment; Information retrieval; Ranking SVM; Latent Dirichlet allocation (LDA)



This research was supported by the Alberta Innovates Centre for Machine Learning (AICML) and the iCORE division of Alberta Innovates Technology Futures.



Copyright information

© Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

  1. Department of Computing Science, University of Alberta, Edmonton, Canada
