Multi-Document Summarization Techniques for Generating Image Descriptions: A Comparative Analysis

Multi-source, Multilingual Information Extraction and Summarization

Abstract

This paper reports an initial study that assesses the viability of multi-document summarization techniques for automatic captioning of geo-referenced images. The automatic captioning procedure requires summarizing multiple Web documents that contain information related to an image's location. We use several state-of-the-art summarization systems to generate generic and query-based multi-document summaries and evaluate them against human-generated summaries using ROUGE metrics [24]. Results show that query-based summaries perform better than generic ones and are thus more appropriate for image captioning, that is, for generating short descriptions of the location or place captured in the image. For our future work in automatic image captioning, this result suggests that developing the query-based summarizer further and biasing it to account for user-specific requirements will prove worthwhile.
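To make the evaluation idea concrete, the following is a minimal sketch of n-gram recall in the spirit of ROUGE-N [24]. It is a simplification and not the official ROUGE package: it assumes whitespace tokenization and lowercasing, applies no stemming or stop-word removal, and the function names and example sentences are hypothetical illustrations rather than data from the study.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, references, n=1):
    """Simplified ROUGE-N recall (assumption: whitespace tokens, no stemming).

    Clipped n-gram matches are summed over all reference summaries and
    divided by the total number of n-grams in those references.
    """
    cand = ngrams(candidate.lower().split(), n)
    matched = total = 0
    for ref in references:
        ref_counts = ngrams(ref.lower().split(), n)
        total += sum(ref_counts.values())
        # Clip each match at the candidate's count for that n-gram.
        matched += sum(min(count, cand[gram]) for gram, count in ref_counts.items())
    return matched / total if total else 0.0


# Hypothetical example: one human-written reference and two system summaries.
references = ["the tower stands in the centre of paris"]
generic = "paris is a large city in france"
query_based = "the tower stands in paris"

print(rouge_n_recall(generic, references))      # unigram recall of the generic summary
print(rouge_n_recall(query_based, references))  # unigram recall of the query-based summary
```

In this toy case the summary that stays close to the queried place shares more unigrams with the reference and therefore scores higher, which is the kind of comparison the ROUGE-based evaluation makes at scale.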

Notes

  1. http://www-nlpir.nist.gov/projects/duc/index.html

  2. www.VirtualTourist.com

  3. http://search.yahoo.com/

  4. http://htmlparser.sourceforge.net/

  5. http://www.dcs.shef.ac.uk/~saggion/summa/default.htm

  6. http://gate.ac.uk

  7. http://www.summarization.com/mead/

  8. http://www.d.umn.edu/~tpederse/text-similarity.html

  9. http://aye.comp.nus.edu.sg/~qiu/NLPTools/JavaRAP.html

References

  1. Aker, A., Gaizauskas, R.: Summary generation for toponym-referenced images using object type language models. In: Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP), Borovets (2009)

  2. Aker, A., Gaizauskas, R.: Model summaries for location-related images. In: Proceedings of the 7th International Conference on Language Resources and Evaluation (LREC), Valletta (2010)

  3. Barnard, K., Duygulu, P., Forsyth, D., de Freitas, N., Blei, D., Jordan, M.: Matching words and pictures. J. Mach. Learn. Res. 3, 1107–1135 (2003)

  4. Bellare, K., Das Sarma, A., Loiwal, N., Mehta, V., Ramakrishnan, G., Bhattacharyya, P.: Generic text summarization using WordNet. In: Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC), Lisbon (2004)

  5. Carenini, G., Ng, R., Pauls, A.: Multi-document summarization of evaluative text. In: Proceedings of the 11th Conference of the European Chapter of the Association for Computational Linguistics (EACL), Trento (2006)

  6. Cesarano, C., Mazzeo, A., Picariello, A.: A system for summary-document similarity in notary domain. In: Proceedings of the International Workshop on Database and Expert Systems Applications, Regensburg (2007)

  7. Chowdary, C., Kumar, P.S.: Update summarizer using MMR approach. In: Proceedings of the Text Analysis Conference (TAC), Gaithersburg (2008)

  8. Deschacht, K., Moens, M.: Text analysis for automatic image annotation. In: Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics (ACL), Prague (2007)

  9. El-haj, M., Hammo, B.: Evaluation of query-based Arabic text summarization system. In: Proceedings of the International Conference on Natural Language Processing and Software Engineering, Beijing (2008)

  10. Erkan, G., Radev, D.: LexRank: graph-based lexical centrality as salience in text summarization. J. Artif. Intell. Res. 22, 457–479 (2004)

  11. Fan, J., Gao, Y., Luo, H., Keim, D., Li, Z.: A novel approach to enable semantic and visual image summarization for exploratory image search. In: Proceedings of the 1st ACM International Conference on Multimedia Information Retrieval (MIR), Vancouver (2008)

  12. Ferrández, O., Micol, D., Muñoz, R., Palomar, M.: A perspective-based approach for solving textual entailment recognition. In: Proceedings of the ACL-PASCAL Workshop on Textual Entailment and Paraphrasing, Prague (2007)

  13. Fiszman, M., Rindflesch, T., Kilicoglu, H.: Abstraction summarization for managing the biomedical research literature. In: Proceedings of the HLT-NAACL Workshop on Computational Lexical Semantics, Boston (2004)

  14. Givón, T.: Syntax: A Functional-Typological Introduction, vol. II. John Benjamins Publishing Company, Amsterdam/Philadelphia (1990)

  15. Glickman, O.: Applied textual entailment. Ph.D. thesis, Bar Ilan University (2006)

  16. Goldstein, J., Mittal, V., Carbonell, J., Kantrowitz, M.: Multi-document summarization by sentence extraction. In: Proceedings of the NAACL-ANLP Workshop on Automatic summarization, Seattle (2000)

  17. Gotti, F., Lapalme, G., Nerima, L., Wehrli, E.: GOFAISUM: a symbolic summarizer for DUC. In: Proceedings of the Document Understanding Conference (DUC), Rochester (2007)

  18. Gupta, S., Nenkova, A., Jurafsky, D.: Measuring importance and query relevance in topic-focused multi-document summarization. In: Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics (ACL), Prague. Demo and Poster Sessions (2007)

  19. He, L., Sanocki, E., Gupta, A., Grudin, J.: Auto-summarization of audio-video presentations. In: Proceedings of the Seventh ACM International Conference on Multimedia (MULTIMEDIA), Orlando (1999)

  20. Hsin-Hsi, C., Chuan-Jie, L.: A multilingual news summarizer. In: Proceedings of the 18th Conference on Computational Linguistics (COLING), Saarbrücken (2000)

  21. Jaoua, M., Ben Hamadou, A.: Automatic text summarization of scientific articles based on classification of extract’s population. In: Proceedings of the 4th International Conference on Computational Linguistics and Intelligent Text Processing (CICLing), Mexico City (2003)

  22. Kan, M.Y., McKeown, K., Klavans, J.: Domain-specific informative and indicative summarization for information retrieval. In: Proceedings of the Document Understanding Conference (DUC), New Orleans (2001)

  23. Lesk, M.: Automatic sense disambiguation using machine readable dictionaries: how to tell a pine cone from an ice cream cone. In: Proceedings of SIGDOC, Toronto (1986)

  24. Lin, C.Y.: ROUGE: a package for automatic evaluation of summaries. In: Proceedings of the Workshop on Text Summarization Branches Out (WAS), Barcelona (2004)

  25. Lloret, E., Ferrández, O., Muñoz, R., Palomar, M.: A text summarization approach under the influence of textual entailment. In: Proceedings of the 5th Natural Language Processing and Cognitive Science Workshop, Barcelona (2008)

  26. Lloret, E., Palomar, M.: A gradual combination of features for building automatic summarisation systems. In: Proceedings of the 12th International Conference on Text, Speech and Dialogue (TSD), Pilsen (2009)

  27. Luhn, H.P.: The automatic creation of literature abstracts. IBM J. Res. Dev. 2, 159–165 (1958)

  28. Mani, I.: Automatic Summarization. John Benjamins Publishing Company, Amsterdam/Philadelphia (2001)

  29. Marcu, D.: Discourse trees are good indicators of importance in text. In: Mani, I., Maybury, M.T. (eds.) Advances in Automatic Text Summarization. MIT, Cambridge, MA (1999)

  30. Mihalcea, R., Ceylan, H.: Explorations in automatic book summarization. In: Proceedings of the Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), Prague (2007)

  31. Mori, Y., Takahashi, H., Oka, R.: Automatic word assignment to images based on image division and vector quantization. In: Proceedings of RIAO 2000: Content-Based Multimedia Information Access, Paris (2000)

  32. Pan, J.Y., Yang, H.J., Duygulu, P., Faloutsos, C.: Automatic image captioning. In: Proceedings of the IEEE International Conference on Multimedia and Expo (ICME), Taipei (2004)

  33. Pedersen, T., Patwardhan, S., Michelizzi, J.: WordNet::Similarity – measuring the relatedness of concepts. In: Proceedings of the National Conference on Artificial Intelligence (AAAI), San Jose (2004)

  34. Plaza, L., Díaz, A., Gervás, P.: Concept-graph based biomedical automatic summarization using ontologies. In: Proceedings of the 3rd Textgraphs Workshop on Graph-based Algorithms for Natural Language Processing, Manchester, pp. 53–56 (2008)

  35. Plaza, L., Díaz, A., Gervás, P.: Automatic summarization of news using WordNet concept graphs. In: Proceedings of the Informatics IADIS International Conference, Algarve (2009)

  36. Plaza, L., Lloret, E., Aker, A.: Improving automatic image captioning using text summarization techniques. In: Proceedings of the 13th International Conference on Text, Speech and Dialogue (TSD), Brno (2010)

  37. Qiu, L., Kan, M.Y., Chua, T.S.: A public reference implementation of the RAP anaphora resolution algorithm. In: Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC), Lisbon (2004)

  38. Radev, D., Blair-Goldensohn, S., Zhang, Z.: Experiments in single and multidocument summarization using MEAD. In: Proceedings of the Document Understanding Conference (DUC), New Orleans (2001)

  39. Radev, D., Allison, T., Blair-Goldensohn, S., Blitzer, J., Çelebi, A., Dimitrov, S., Drabek, E., Hakim, A., Lam, W., Liu, D., Otterbacher, J., Qi, H., Saggion, H., Teufel, S., Topper, M., Winkel, A., Zhang, Z.: MEAD – a platform for multidocument multilingual text summarization. In: Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC), Lisbon (2004)

  40. Radev, D., Jing, H., Styś, M., Tam, D.: Centroid-based summarization of multiple documents. Inf. Process. Manage. 40(6), 919–938 (2004)

  41. Radev, D., Blitzer, J., Winkel, A., Allison, T., Topper, M.: MEAD documentation v3.10. Tech. rep., http://www.summarization.com/mead/ (2006). Accessed June 2011

  42. Saggion, H.: SUMMA: A robust and adaptable summarization tool. Rev. Trait. Automat. Lang. 49(2), 103–125 (2008)

  43. Saggion, H., Gaizauskas, R.: Multi-document summarization by cluster/profile relevance and redundancy removal. In: Proceedings of the Document Understanding Conference (DUC), Boston (2004)

  44. Saggion, H., Teufel, S., Radev, D., Lam, W.: Meta-evaluation of summaries in a cross-lingual environment using content-based metrics. In: Proceedings of the 19th International Conference on Computational Linguistics (COLING), Taipei (2002)

  45. Salton, G.: Automatic Text Processing. Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA (1988)

  46. Salton, G., Buckley, C.: Term-Weighting Approaches in Automatic Text Retrieval. Pergamon, Tarrytown, NY, USA (1988)

  47. Salton, G., Lesk, M.: Computer evaluation of indexing and text processing. ACM J. 15(1), 8–36 (1968)

  48. Schilder, F., Kondadadi, R.: FastSum: fast and accurate query-based multi-document summarization. In: Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics (ACL): Human Language Technologies. Short Papers, Columbus (2008)

  49. Sekine, S., Nobata, C.: A survey for multi-document summarization. In: Proceedings of the HLT-NAACL Workshop on Text Summarization, Edmonton (2003)

  50. Spärck Jones, K.: Automatic summarizing: factors and directions. In: Mani, I., Maybury, M.T. (eds.) Advances in Automatic Text Summarization, pp. 1–14. MIT, Cambridge, MA (1999)

  51. Steinberger, J., Poesio, M., Kabadjov, M., Ježek, K.: Two uses of anaphora resolution in summarization. Inf. Process. Manage. 43(6), 1663–1680 (2007)

  52. Sun, J.T., Shen, D., Zeng, H.J., Yang, Q., Lu, Y., Chen, Z.: Web-page summarization using clickthrough data. In: Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Salvador (2005)

  53. Svore, K., Vanderwende, L., Burges, C.: Enhancing single-document summarization by combining RankNet and third-party sources. In: Proceedings of the Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), Prague (2007)

  54. Titov, I., McDonald, R.: A joint model of text and aspect ratings for sentiment summarization. In: Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics (ACL): Human Language Technologies, Columbus (2008)

  55. Trappey, A., Trappey, C., Wu, C.Y.: Automatic patent document summarization for collaborative knowledge systems and services. J. Syst. Sci. Syst. Eng. 1, 71–94 (2009)

  56. Westerveld, T.: Image retrieval: content versus context. In: Proceedings of RIAO 2000: Content-Based Multimedia Information Access, Paris (2000)

  57. Yoo, I., Hu, X., Song, I.Y.: A coherent graph-based semantic clustering and summarization approach for biomedical literature and a new summarization evaluation method. BMC Bioinformatics 8(9) (2007)

  58. Zechner, K., Waibel, A.: DiaSumm: flexible summarization of spontaneous dialogues in unrestricted domains. In: Proceedings of the 18th Conference on Computational Linguistics (COLING), Saarbrücken (2000)

Acknowledgements

This work was supported by the EU-funded TRIPOD project (IST-FP6-045335); by the Spanish Government through the FPU program and the projects TIN2009-14659-C03-01, TSI 020312-2009-44, and TIN2009-13391-C04-01; by Conselleria d’Educació – Generalitat Valenciana (grant nos. PROMETEO/2009/119 and ACOMP/2010/286); and by the FPI program (BES-2007-16268) of the Spanish Ministry of Science and Innovation (project TEXT-MESS, TIN2006-15265-C06-01). We would like to thank Horacio Saggion for his support with SUMMA. We are also grateful to Emina Kurtic for comments on previous versions of this paper.

Author information

Corresponding author

Correspondence to Ahmet Aker.

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Aker, A., Plaza, L., Lloret, E., Gaizauskas, R. (2013). Multi-Document Summarization Techniques for Generating Image Descriptions: A Comparative Analysis. In: Poibeau, T., Saggion, H., Piskorski, J., Yangarber, R. (eds) Multi-source, Multilingual Information Extraction and Summarization. Theory and Applications of Natural Language Processing. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28569-1_14

  • DOI: https://doi.org/10.1007/978-3-642-28569-1_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-28568-4

  • Online ISBN: 978-3-642-28569-1

  • eBook Packages: Computer Science, Computer Science (R0)
