Cross-Lingual Speech-to-Text Summarization

  • Elvys Linhares Pontes
  • Carlos-Emiliano González-Gallardo
  • Juan-Manuel Torres-Moreno
  • Stéphane Huet
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 833)


Cross-Lingual Text Summarization generates a summary in a language different from the language of the source documents. We propose a French-to-English cross-lingual transcript summarization framework that automatically segments a French transcript and analyzes the information in the source and the target languages to estimate the saliency of sentences. Additionally, we use a multi-sentence compression method to simultaneously compress and improve the informativeness of sentences. Experimental results show that our framework outperformed extractive methods using automatic sentence segmentation, even with transcription errors.


Cross-Lingual Text Summarization; Multi-sentence compression; Automatic speech recognition



This work was supported by the European Project CHISTERA-AMIS ANR-15-CHR2-0001.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Elvys Linhares Pontes (1, 2), corresponding author
  • Carlos-Emiliano González-Gallardo (1, 2)
  • Juan-Manuel Torres-Moreno (1, 2)
  • Stéphane Huet (1)

  1. LIA, Université d’Avignon et des Pays de Vaucluse, Avignon, France
  2. École Polytechnique de Montréal, Montréal, Canada
