Inducing Predictive Models for Decision Support in Administrative Adjudication
Administrative adjudications are the most common form of legal decisions in many countries, so improving the efficiency, accuracy, and consistency of administrative processes could significantly benefit agencies and citizens alike. We explore the hypothesis that predictive models induced from previous administrative decisions can improve subsequent decision-making processes. This paper describes three datasets for exploring this hypothesis: motion rulings; Board of Veterans Appeals (BVA) decisions; and World Intellectual Property Organization (WIPO) domain name dispute decisions. Three approaches to prediction in these domains were tested: maximum entropy over token n-grams; SVM over token n-grams; and a Hierarchical Attention Network (HAN) applied to the full text. Each approach was capable of predicting outcomes, with the simpler WIPO cases appearing to be much more predictable than BVA or motion-ruling cases. We explore several approaches to using predictive models to identify salient phrases in the predictive texts (i.e., the motion, or the contentions and factual background) and propose a design for incorporating this information into a decision-support tool.
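To make the first approach concrete, the following is a minimal sketch of a binary maximum entropy (logistic regression) classifier over token n-grams, with salient phrases ranked by their weighted contribution to the score. It is not the implementation used in the paper: the function names, the hyperparameters, and the four toy "motion" texts with their labels are all invented for illustration, and a real system would more likely rely on an established machine learning library.

```python
import math
from collections import Counter

def ngrams(text, n_max=2):
    """Count token n-grams (n = 1..n_max) in one document."""
    tokens = text.lower().split()
    feats = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            feats[" ".join(tokens[i:i + n])] += 1
    return feats

def score(weights, bias, feats):
    """Linear score of a feature bag; unseen n-grams contribute zero."""
    return bias + sum(weights[k] * v for k, v in feats.items())

def train_maxent(docs, labels, epochs=300, lr=0.2, l2=0.01):
    """Binary maxent trained by batch gradient descent (hypothetical settings)."""
    feats = [ngrams(d) for d in docs]
    weights, bias = Counter(), 0.0
    for _ in range(epochs):
        grad, grad_b = Counter(), 0.0
        for f, y in zip(feats, labels):
            p = 1.0 / (1.0 + math.exp(-score(weights, bias, f)))
            err = p - y  # gradient of the log loss w.r.t. the score
            grad_b += err
            for k, v in f.items():
                grad[k] += err * v
        for k, g in grad.items():
            weights[k] -= lr * (g / len(docs) + l2 * weights[k])
        bias -= lr * grad_b / len(docs)
    return weights, bias

def predict_proba(weights, bias, text):
    """Probability of the positive outcome (here: motion granted)."""
    return 1.0 / (1.0 + math.exp(-score(weights, bias, ngrams(text))))

def salient_phrases(weights, text, top=3):
    """Rank the n-grams of a text by absolute contribution to its score."""
    contrib = {k: weights[k] * v for k, v in ngrams(text).items()}
    return sorted(contrib, key=lambda k: abs(contrib[k]), reverse=True)[:top]

# Invented toy data: 1 = granted, 0 = denied.
docs = [
    "motion for extension of time unopposed",
    "unopposed motion to extend deadline",
    "motion to compel untimely and opposed",
    "opposed motion filed untimely",
]
labels = [1, 1, 0, 0]

weights, bias = train_maxent(docs, labels)
print(predict_proba(weights, bias, "unopposed motion for extension"))
print(salient_phrases(weights, "opposed motion filed untimely"))
```

Ranking n-grams by weight times count is only one simple way to surface salient phrases from a linear model; the paper explores several approaches, and attention weights from a HAN would require a different mechanism entirely.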
The MITRE Corporation is a not-for-profit company, chartered in the public interest, that operates multiple federally funded research and development centers. This document is approved for Public Release; Distribution Unlimited. Case Number 17-2336. © 2017 The MITRE Corporation. All rights reserved.