, Volume 18, Issue 4, pp 431-457

Automation of legal sensemaking in e-discovery

Rent the article at a discount

Rent now

* Final gross prices may vary according to local VAT.

Get Access

Abstract

Retrieval of relevant unstructured information from the ever-increasing textual communications of individuals and businesses has become a major barrier to effective litigation/defense, mergers/acquisitions, and regulatory compliance. Such e-discovery requires simultaneously high precision with high recall (high-P/R) and is therefore a prototype for many legal reasoning tasks. The requisite exhaustive information retrieval (IR) system must employ very different techniques than those applicable in the hyper-precise, consumer search task where insignificant recall is the accepted norm. We apply Russell, et al.’s cognitive task analysis of sensemaking by intelligence analysts to develop a semi-autonomous system that achieves high IR accuracy of F1 ≥ 0.8 compared to F1 < 0.4 typical of computer-assisted human-assessment (CAHA) or alternative approaches such as Roitblat, et al.’s. By understanding the ‘Learning Loop Complexes’ of lawyers engaged in successful small-scale document review, we have used socio-technical design principles to create roles, processes, and technologies for scalable human-assisted computer-assessment (HACA). Results from the NIST-TREC Legal Track’s interactive task from both 2008 and 2009 validate the efficacy of this sensemaking approach to the high-P/R IR task.