Circuits, Systems, and Signal Processing, Volume 38, Issue 12, pp 5717–5733

Transform-Domain Speech Bandwidth Extension

  • Prasad Nizampatnam
  • G. R. L. V. N. S. Raju

Abstract

The limited speech bandwidth used in narrowband telephone systems degrades both the quality and the intelligibility of speech. This paper proposes a new transform-domain speech bandwidth extension method. The method uses a data hiding technique based on the discrete wavelet transform (DWT) and the fast Fourier transform (FFT) to provide a better-quality wideband speech signal. The code-excited linear prediction (CELP) parameters of the missing speech band are hidden within the narrowband speech signal, and the hidden information is recovered at the receiving end to produce a wideband speech signal. Theoretical and simulation analyses show that the proposed method is robust to quantization and channel noise. The results indicate that the proposed method reconstructs wideband speech with better quality than traditional methods.
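As an illustration only, the general idea of hiding side information in a wavelet-plus-Fourier domain can be sketched as follows. This is not the authors' algorithm: the one-level Haar wavelet, the choice of FFT bins, the direct replacement of magnitudes, and the names `haar_dwt`, `haar_idwt`, `embed`, and `extract` are all assumptions made for this sketch. The payload stands in for a few non-negative parameter values and is written into the FFT magnitudes of the detail coefficients while the original phases are kept.

```python
import numpy as np

def haar_dwt(x):
    """One-level Haar DWT (assumed wavelet): returns (approximation, detail)."""
    x = np.asarray(x, dtype=float)
    a = (x[0::2] + x[1::2]) / np.sqrt(2.0)
    d = (x[0::2] - x[1::2]) / np.sqrt(2.0)
    return a, d

def haar_idwt(a, d):
    """Inverse of haar_dwt: interleaves the reconstructed even/odd samples."""
    x = np.empty(2 * len(a))
    x[0::2] = (a + d) / np.sqrt(2.0)
    x[1::2] = (a - d) / np.sqrt(2.0)
    return x

def embed(frame, payload, start=1):
    """Hide non-negative payload values in FFT magnitudes of the detail band.

    Hypothetical scheme: bins start .. start+len(payload)-1 get the payload
    as their magnitude; their phase is left unchanged. The DC and Nyquist
    bins must not be used (start >= 1, last bin below len(detail) // 2).
    """
    a, d = haar_dwt(frame)
    spec = np.fft.rfft(d)
    idx = slice(start, start + len(payload))
    phase = np.angle(spec[idx])
    spec[idx] = np.asarray(payload, dtype=float) * np.exp(1j * phase)
    d_marked = np.fft.irfft(spec, n=len(d))
    return haar_idwt(a, d_marked)

def extract(frame, count, start=1):
    """Recover `count` hidden values by reading the same FFT magnitudes."""
    _, d = haar_dwt(frame)
    return np.abs(np.fft.rfft(d))[start:start + count]

# Usage: a random even-length frame stands in for narrowband speech.
rng = np.random.default_rng(0)
frame = rng.standard_normal(64)
payload = np.array([0.5, 0.25, 0.125])
marked = embed(frame, payload, start=4)
recovered = extract(marked, len(payload), start=4)
```

In this sketch the approximation coefficients are untouched, so only the detail band of the frame is altered. A practical system would also have to handle quantization and channel noise, which the sketch ignores.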

Keywords

  • Public switched telephone network (PSTN)
  • Speech bandwidth extension
  • CELP
  • Data hiding
  • Speech quality
  • Spread spectrum

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  • Prasad Nizampatnam (1)
  • G. R. L. V. N. S. Raju (1)
  1. Department of Electronics and Communication Engineering, Shri Vishnu Engineering College for Women (Autonomous), Bhimavaram, West Godavari, India
