Personal and Ubiquitous Computing, Volume 12, Issue 3, pp 197–221

Design and evaluation of systems to support interaction capture and retrieval

  • Steve Whittaker
  • Simon Tucker
  • Kumutha Swampillai
  • Rachel Laban
Original Article


Although many recent systems have been built to support Information Capture and Retrieval (ICR), these have not generally been successful. This paper presents studies that evaluate two hypotheses for this failure: first, that systems fail to address user needs, and second, that they provide only rudimentary support for ICR. Having first presented a taxonomy of systems built to support ICR, we describe a study that attempts to identify user needs for ICR. On the basis of that study we carried out two user-oriented evaluations. The first was a task-based evaluation of a state-of-the-art ICR system; it found that the system failed to provide users with abstract ways to view meetings data and did not present the information categories that users considered important. The second introduced a new method for comparative evaluation of different techniques for accessing meetings data; it showed that simple interface techniques that extract key information from meetings were effective in allowing users to extract gist from meetings data. We conclude with a discussion of outstanding issues and future directions for ICR research.


Automatic Speech Recognition; Public Record; Temporal Compression; Meeting Minute; Meeting Participant



Copyright information

© Springer-Verlag London Limited 2007

Authors and Affiliations

  • Steve Whittaker (1)
  • Simon Tucker (1)
  • Kumutha Swampillai (1)
  • Rachel Laban (1)

  1. Department of Information Studies, University of Sheffield, Sheffield, UK
