Abstract
In this work we describe a large-scale extrinsic evaluation of automatic speech summarization technologies for meeting speech. The particular task is a decision audit, wherein a user must satisfy a complex information need, navigating several meetings in order to gain an understanding of how and why a given decision was made. We compare the usefulness of extractive and abstractive technologies in satisfying this information need, and assess the impact of automatic speech recognition (ASR) errors on user performance. We employ several evaluation methods for participant performance, including post-questionnaire data, human subjective and objective judgments, and an analysis of participant browsing behaviour.
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Murray, G., Kleinbauer, T., Poller, P., Renals, S., Kilgour, J., Becker, T. (2008). Extrinsic Summarization Evaluation: A Decision Audit Task. In: Popescu-Belis, A., Stiefelhagen, R. (eds) Machine Learning for Multimodal Interaction. MLMI 2008. Lecture Notes in Computer Science, vol 5237. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85853-9_32
DOI: https://doi.org/10.1007/978-3-540-85853-9_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-85852-2
Online ISBN: 978-3-540-85853-9
eBook Packages: Computer Science (R0)