Artificial Intelligence and Law, Volume 18, Issue 4, pp 431–457

Automation of legal sensemaking in e-discovery

  • Christopher Hogan
  • Robert S. Bauer
  • Dan Brassil


Retrieval of relevant unstructured information from the ever-increasing textual communications of individuals and businesses has become a major barrier to effective litigation/defense, mergers/acquisitions, and regulatory compliance. Such e-discovery requires simultaneously high precision and high recall (high-P/R) and is therefore a prototype for many legal reasoning tasks. The requisite exhaustive information retrieval (IR) system must employ techniques very different from those applicable to the hyper-precise consumer search task, where insignificant recall is the accepted norm. We apply Russell et al.’s cognitive task analysis of sensemaking by intelligence analysts to develop a semi-autonomous system that achieves high IR accuracy of F1 ≥ 0.8, compared to the F1 < 0.4 typical of computer-assisted human-assessment (CAHA) or alternative approaches such as Roitblat et al.’s. By understanding the ‘Learning Loop Complexes’ of lawyers engaged in successful small-scale document review, we have used socio-technical design principles to create roles, processes, and technologies for scalable human-assisted computer-assessment (HACA). Results from the NIST-TREC Legal Track’s interactive task in both 2008 and 2009 validate the efficacy of this sensemaking approach to the high-P/R IR task.
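The F1 figures quoted in the abstract combine precision and recall as their harmonic mean, F1 = 2PR / (P + R). The following is a minimal sketch of that standard calculation, not the authors' evaluation code: the function name and the toy document sets are hypothetical and are not data from the paper. It illustrates why a very precise search with low recall still scores F1 < 0.4, while F1 ≥ 0.8 requires both measures to be high at once.

```python
def precision_recall_f1(retrieved, relevant):
    """Return (precision, recall, F1) for a set of retrieved document ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical toy collection: documents 1..10 are the relevant ones.
relevant = set(range(1, 11))

# A "consumer search" style result: every hit is relevant, but most are missed.
precise_only = {1, 2}

# An "exhaustive" style result: nearly all relevant documents, one false hit.
high_p_and_r = {1, 2, 3, 4, 5, 6, 7, 8, 9, 11}

for name, retrieved in [("precise_only", precise_only),
                        ("high_p_and_r", high_p_and_r)]:
    p, r, f1 = precision_recall_f1(retrieved, relevant)
    print(f"{name}: P={p:.2f} R={r:.2f} F1={f1:.2f}")
```

In this toy case the first result set has perfect precision but a recall of 0.2, so its F1 is about 0.33; the second has precision and recall of 0.9 each, giving an F1 of 0.9.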


Keywords: e-discovery, Sensemaking, Information retrieval


  1. Bauer RS, Jade T, Hedin B, Hogan C (2008) Automated legal sensemaking: the centrality of relevance and intentionality. In: Proceedings of the second international workshop on supporting search and sensemaking for electronically stored information in discovery proceedings (DESI II)
  2. Bauer RS, Brassil D, Hogan C, Taranto G, Brown JS (2009) Impedance matching of humans ⇔ machines in high-Q information retrieval systems. In: Proceedings of the 2009 IEEE international conference on systems, man, and cybernetics
  3. Belkin N (1980) Anomalous states of knowledge as a basis for information retrieval. Can J Inf Sci 5:133–143
  4. Blair DC, Maron ME (1985) An evaluation of retrieval effectiveness for a full-text document retrieval system. Commun ACM 28(3):289–299
  5. Card SK (2005) The science of analytical reasoning. In: Illuminating the path: the research and development agenda for visual analytics. National Visualization and Analytics Center, Richland, WA. Accessed 20 Dec 2009
  6. Cormack GV, Mojdeh M (2010) Machine learning for information retrieval: TREC 2009 web, relevance feedback and legal tracks. In: The eighteenth text retrieval conference (TREC 2009) proceedings
  7. Dervin B (1983) An overview of sense-making research: concepts methods and results. Presented at the International Communication Association annual meeting, Dallas
  8. Dervin B (1992) From the mind’s eye of the user: the sense-making qualitative-quantitative methodology. In: Glazier JD, Powell RR (eds) Qualitative research in information management. Libraries Unlimited CO, Englewood, pp 61–84
  9. EDRM (2010) Electronic discovery reference model. Accessed 4 Jan 2010
  10. Fein BE, Merrell BL, Nelson FE (2010) Backstop LLP and Cleary Gottlieb Steen and Hamilton LLP at TREC legal track 2009. In: The eighteenth text retrieval conference (TREC 2009) proceedings
  11. Hogan C, Brassil D, Rugani SM, Reinhart J, Gerber M, Jade T (2009) H5 at TREC 2008 legal interactive: user modeling, assessment & measurement. In: Proceedings of the seventeenth text retrieval conference proceedings (TREC 2008)
  12. Kershaw A (2005) Automated document review proves its reliability. Digit Discov Evid 5(11):10–12
  13. Klein G, Phillips JK, Rall EL, Peluso DA (2006a) A data-frame theory of sensemaking. In: Expertise out of context: proceedings of the sixth international conference on naturalistic decision making
  14. Klein G, Moon B, Hoffman RR (2006b) Making sense of sensemaking 2: a macrocognitive model. IEEE Intell Syst 21(5):88–92
  15. Koenemann J, Belkin NJ (1996) A case for interaction: a study of interactive information retrieval behavior and effectiveness. In: Proceedings of the human factors in computing systems conference (CHI’96). ACM Press, New York
  16. Kuropka D (2004) Modelle zur Repräsentation natürlichsprachlicher Dokumente. Ontologie-basiertes Information-Filtering und -Retrieval mit relationalen Datenbanken. Advances in information systems and management science, Bd. 10. Logos Verlag, Berlin
  17. Lewis DD (1998) Naive (Bayes) at forty: the independence assumption in information retrieval. In: Proceedings ECML, pp 4–15. Springer
  18. Linderman A (2005) Using sense-making methodology in legal and law enforcement investigations. Presented at a non-divisional workshop held at the meeting of the International Communication Association, New York City
  19. Marchionini G (2006) Toward human-computer information retrieval. In: June/July 2006 bulletin of the American society for information science
  20. Marcus S et al. (eds) (2004) Manual for complex litigation, fourth. Federal Judicial Center
  21. Oard DW, Hedin B, Tomlinson S, Baron JR (2009) Overview of the TREC 2008 legal track. In: Proceedings of the seventeenth text retrieval conference proceedings (TREC 2008)
  22. Rangan V, Jiang M (2010) Clearwell systems at TREC 2009 legal interactive. In: The eighteenth text retrieval conference (TREC 2009) proceedings
  23. Roitblat HL, Kershaw A, Oot P (2010) Document categorization in legal electronic discovery: computer classification vs manual review. J Am Soc Inf Sci Technol 61(1):1–11
  24. Rosenfeld L, Morville P (2002) Information architecture for the World Wide Web, 2nd edn. O’Reilly Media, Sebastopol
  25. Russell DM, Stefik MJ, Pirolli PL, Card SK (1993) The cost structure of sensemaking. In: Proceedings of the INTERACT ’93 and CHI ’93 conference on human factors in computing systems, pp 269–276
  26. Saracevic T, Spink A, Wu MW (2007) Users and intermediaries in information retrieval: What are they talking about? In: Proceedings of the sixth international conference on user modeling (UM97), pp 43–54
  27. Schaffer TL, Elkins JR (1987) Legal interviewing and counseling in a nutshell, 2nd edn. West Publishing, Rochester
  28. Sterenzy T (2010) EQUIVIO at TREC 2009 legal interactive. In: The eighteenth text retrieval conference (TREC 2009) proceedings
  29. Takayama L, Card SK (2008) Tracing the microstructure of sensemaking. In: Proceedings of the CHI 2008 workshop on sensemaking
  30. Thompson P, Turtle H, Yang B, Flood J (1995) TREC-3 Ad Hoc retrieval and routing experiments using the WIN System. In: Proceedings of the third text retrieval conference (TREC-3)
  31. Voorhees EM, Harman DK (2005) TREC: experiment and evaluation in information retrieval. The MIT Press, Cambridge, MA
  32. Wang J, Coles C, Elliot R, Adrianakou S (2010a) ZL technologies at TREC 2009 legal interactive: comparing exclusionary and investigative approaches for electronic discovery using the TREC Enron Corpus. In: The eighteenth text retrieval conference (TREC 2009) proceedings
  33. Wang J, Sun Y, Thompson P (2010b) TREC 2009 at the University of Buffalo: interactive legal e-discovery with Enron Emails. In: The eighteenth text retrieval conference (TREC 2009) proceedings
  34. Willging TE, Shapard J, Stienstra D, Miletich D (1997) Discovery and disclosure practice, problems, and proposals for change: a case-based national survey of counsel in closed federal civil cases. Reports on discovery for the advisory committee on civil rules of the judicial conference of the United States, Federal Judicial Center

Copyright information

© Springer Science+Business Media B.V. 2010

Authors and Affiliations

  • Christopher Hogan (1)
  • Robert S. Bauer (1)
  • Dan Brassil (1)

  1. H5, San Francisco, CA
