Cross-Lingual Korean Speech-to-Text Summarization

  • HyoJeon Yoon
  • Dinh Tuyen Hoang
  • Ngoc Thanh Nguyen
  • Dosam Hwang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11431)


Cross-lingual text summarization, which produces a summary in a language different from that of the source document, has remained a challenge in recent years. This paper describes a summarization system that automatically converts Korean speech into an English summary text. Recent studies have approached the task in two ways, depending on which of the two languages supplies the information used for analysis: early translation and late translation. Early translation first translates the original documents into the target language and then summarizes the translated texts, whereas late translation first summarizes the original documents and then translates the summary into the target language. In the proposed method, the Korean transcript is segmented and analyzed for sentence clustering. A word graph is then used to compress the clustered sentences and generate a single, concise, and informative compression. Experiments show that our method achieves better accuracy than the other methods compared.
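The word-graph compression step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a cluster of already-translated, whitespace-tokenized English sentences, merges words by surface form only, and weights each edge by the inverse of its adjacency count so that frequently shared transitions are cheap. The names `build_word_graph`, `compress`, and `min_words` are hypothetical and chosen for illustration; a real pipeline would need a Korean morphological analyzer and a more careful node-merging rule.

```python
import heapq
from collections import defaultdict

START, END = "<s>", "</s>"  # artificial boundary tokens (an assumption of this sketch)


def build_word_graph(sentences):
    """Merge sentences into a directed graph of adjacent words.

    Simplification: nodes are lowercased surface tokens, and the edge
    weight is 1 / (number of times the two words appear side by side).
    """
    counts = defaultdict(int)
    for sentence in sentences:
        tokens = [START] + sentence.lower().split() + [END]
        for left, right in zip(tokens, tokens[1:]):
            counts[(left, right)] += 1
    graph = defaultdict(dict)
    for (left, right), count in counts.items():
        graph[left][right] = 1.0 / count
    return graph


def compress(sentences, min_words=3):
    """Return the cheapest START-to-END path with at least min_words words."""
    graph = build_word_graph(sentences)
    heap = [(0.0, [START])]  # entries are (accumulated cost, path so far)
    while heap:
        cost, path = heapq.heappop(heap)
        node = path[-1]
        if node == END:
            words = path[1:-1]
            if len(words) >= min_words:
                return " ".join(words)
            continue  # too short: keep searching for the next-cheapest path
        for nxt, weight in graph[node].items():
            if nxt not in path:  # never revisit a word, which rules out cycles
                heapq.heappush(heap, (cost + weight, path + [nxt]))
    return ""  # no acceptable path (for example, empty input)


# Hypothetical cluster of similar sentences, as if produced by sentence clustering.
cluster = [
    "the summit opened in seoul on monday",
    "the summit opened in seoul on monday morning",
    "officials said the summit opened in seoul",
]
print(compress(cluster))  # prints "the summit opened in seoul"
```

The path search enumerates simple paths in order of cost, which is fine for small clusters but grows quickly on large graphs; a k-shortest-paths routine with a re-ranking step would be the usual replacement.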


Cross-lingual, Text summarization, Speech to text



This research was funded by the Basic Science Research Program through the National Research Foundation (NRF) of Korea, funded by the Ministry of Science, ICT, and Future Planning (2017R1A2B4009410).



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • HyoJeon Yoon (1)
  • Dinh Tuyen Hoang (1)
  • Ngoc Thanh Nguyen (2)
  • Dosam Hwang (1, corresponding author)
  1. Department of Computer Engineering, Yeungnam University, Gyeongsan, South Korea
  2. Faculty of Computer Science and Management, Wroclaw University of Science and Technology, Wrocław, Poland
