Inducing Predictive Models for Decision Support in Administrative Adjudication

  • L. Karl Branting
  • Alexander Yeh
  • Brandy Weiss
  • Elizabeth Merkhofer
  • Bradford Brown
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10791)

Abstract

Administrative adjudications are the most common form of legal decisions in many countries, so improving the efficiency, accuracy, and consistency of administrative processes could significantly benefit agencies and citizens alike. We explore the hypothesis that predictive models induced from previous administrative decisions can improve subsequent decision-making processes. This paper describes three datasets for exploring this hypothesis: motion rulings; Board of Veterans Appeals (BVA) decisions; and World Intellectual Property Organization (WIPO) domain name dispute decisions. Three approaches to prediction in these domains were tested: maximum entropy over token n-grams; SVM over token n-grams; and a Hierarchical Attention Network (HAN) applied to the full text. Each approach was capable of predicting outcomes, with the simpler WIPO cases appearing to be much more predictable than BVA or motion-ruling cases. We explore several approaches to using predictive models to identify salient phrases in the predictive texts (i.e., motion or contentions and factual background) and propose a design for incorporating this information into a decision-support tool.

Notes

Acknowledgments

The MITRE Corporation is a not-for-profit company, chartered in the public interest, that operates multiple federally funded research and development centers. This document is approved for Public Release; Distribution Unlimited. Case Number 17-2336. © 2017 The MITRE Corporation. All rights reserved.

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. The MITRE Corporation, McLean, USA