Cross-Lingual Speech-to-Text Summarization

  • Elvys Linhares Pontes
  • Carlos-Emiliano González-Gallardo
  • Juan-Manuel Torres-Moreno
  • Stéphane Huet
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 833)

Abstract

Cross-Lingual Text Summarization generates a summary in a language different from the language of the source documents. We propose a French-to-English cross-lingual transcript summarization framework that automatically segments a French transcript and analyzes the information in the source and the target languages to estimate the saliency of sentences. Additionally, we use a multi-sentence compression method to simultaneously compress and improve the informativeness of sentences. Experimental results show that our framework outperformed extractive methods using automatic sentence segmentation, even with transcription errors.

Keywords

Cross-Lingual Text Summarization, Multi-sentence compression, Automatic speech recognition

Notes

Acknowledgement

This work was funded by the European Project CHISTERA-AMIS ANR-15-CHR2-0001.

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Elvys Linhares Pontes (1, 2)
  • Carlos-Emiliano González-Gallardo (1, 2)
  • Juan-Manuel Torres-Moreno (1, 2)
  • Stéphane Huet (1)

  1. LIA, Université d’Avignon et des Pays de Vaucluse, Avignon, France
  2. École Polytechnique de Montréal, Montréal, Canada
