
Multilingualization of Speech Processing


Part of the book series: SpringerBriefs in Computer Science (BRIEFSCOMPUTER)

Abstract

Speech-to-speech translation is a technology that connects people who speak different languages, and its multilingualization dramatically expands the circle of people connected. “Population” in Table 1.1a shows the potential number of people who can join that circle once the corresponding language benefits from the technology. However, the same table also shows that the languages of the world are incredibly diverse, so multilingualization is not an easy task. Nevertheless, methods of processing speech sounds have been devised and developed uniformly, regardless of language differences. What made this possible is the wide commonality across languages, which stems from the nature of language itself: it is a spontaneous tool created for the single purpose of mutual communication between humans, who basically share the same biological hardware. This chapter describes the multilingualization of automatic speech recognition (ASR) and text-to-speech synthesis (TTS), the two speech-related components among the three that constitute speech-to-speech translation technology.

S. Harada and T. Kitade belonged to NICT at the time of writing.


Notes

  1.

    Kanji is assumed to be the only logographic system that holds the status of a standard grapheme in modern languages. It has been widely used in East Asia: in addition to modern Chinese and Japanese, it was formerly the standard grapheme in Korean (where it is still used occasionally) and in Vietnamese.

  2.

    Phonograms fall into three distinct major groups, all of which are said to have direct or indirect roots in Mesopotamia. First, alphabetical systems, which spread westward, developed mainly in Europe, and then reached the rest of the world, assign separate characters (not diacritic marks) to both vowels and consonants; Cyrillic and Latin scripts are examples. Next, the Brahmic family, which first prevailed in India and then spread to other parts of South Asia and Southeast Asia, basically has only consonant characters, with vowels and tones (if any) designated by diacritic marks; it includes the Khmer, Thai and Myanmar scripts. The last group, which remained in the Middle Eastern region, is represented by the Arabic script; it is basically composed of only consonant characters, and the designation of vowels is optional.

  3.

    The term “phone” is occasionally used instead of “phoneme” when a variant of a linguistic phoneme, e.g., an allophone, is implied.

  4.

    http://www.speech.sri.com/projects/srilm/
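The three phonogram groups described in note 2 can be loosely illustrated with Unicode general categories. The following is a minimal sketch, not the authors' classification: it assumes that Unicode's "letter" (L*) versus "mark" (M*) categories approximate the distinction between separate vowel characters and vowel diacritics. The helper name `classify` and the sample syllables are chosen here for illustration only.

```python
import unicodedata

def classify(text):
    """Return (code point, Unicode general category) for each character."""
    return [(f"U+{ord(ch):04X}", unicodedata.category(ch)) for ch in text]

# Hypothetical sample syllables, one per group named in note 2.
samples = {
    "Cyrillic (alphabet)": "\u043A\u0438",  # consonant letter + vowel letter
    "Thai (Brahmic)": "\u0E01\u0E34",       # consonant letter + vowel sign
    "Arabic": "\u0643\u064E",               # consonant letter + optional vowel mark
}

for name, syllable in samples.items():
    print(name, classify(syllable))
```

In this sketch the Cyrillic vowel is reported as a letter, while the Thai and Arabic vowels are reported as nonspacing marks attached to the consonant. The Arabic mark can be dropped from running text, which matches the note's remark that vowel designation is optional there.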

Reference

  1. Handbook of the International Phonetic Association: A Guide to the Use of the International Phonetic Alphabet. Cambridge University Press (1999)



Corresponding author

Correspondence to Hiroaki Kato.


Copyright information

© 2020 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this chapter


Cite this chapter

Kato, H., Harada, S., Kitade, T., Shiga, Y. (2020). Multilingualization of Speech Processing. In: Kidawara, Y., Sumita, E., Kawai, H. (eds) Speech-to-Speech Translation. SpringerBriefs in Computer Science. Springer, Singapore. https://doi.org/10.1007/978-981-15-0595-9_1


  • DOI: https://doi.org/10.1007/978-981-15-0595-9_1


  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-15-0594-2

  • Online ISBN: 978-981-15-0595-9

  • eBook Packages: Computer Science, Computer Science (R0)
