Speech Intelligibility Enhancement in Strong Mechanical Noise Based on Neural Networks

  • Conference paper
Advances in Multimedia Information Processing – PCM 2017 (PCM 2017)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10736)

Abstract

Speech intelligibility is a key factor in successful speech communication. Many methods have been proposed to enhance it, mainly by manipulating the speech signal itself, for example by increasing its amplitude or modifying its spectrum. However, their effect is limited when the background noise is extremely strong. In this paper, we propose a preprocessing noise cancellation model that enhances speech intelligibility by predicting a cancelling signal and superimposing it onto the speech signal. We build a deep neural network (DNN) model to improve the accuracy of the prediction. The effectiveness of the algorithm was verified by objective and subjective tests: across a variety of test cases, the average signal-to-noise ratio (SNR) improved by 4.5 dB, the average speech intelligibility index (SII) increased by 5.4%, and the average comparison mean opinion score (CMOS) rose by 1.16.
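
The superposition idea and the SNR measurement described in the abstract can be illustrated with a minimal sketch. This is not the authors' DNN model: the toy sine-wave signals, the sample rate, and the "90% accurate" stand-in for the predicted cancelling signal are all assumptions chosen only to show how a sign-inverted noise estimate, added to the speech before playback, lowers the residual noise and raises the SNR.

```python
import math


def snr_db(speech, noise):
    """Signal-to-noise ratio in dB for two equal-length sample sequences."""
    p_speech = sum(x * x for x in speech)
    p_noise = sum(x * x for x in noise)
    return 10.0 * math.log10(p_speech / p_noise)


def superimpose(speech, cancelling):
    """Add the predicted cancelling signal to the speech before playback."""
    return [s + c for s, c in zip(speech, cancelling)]


# Toy signals (hypothetical): a weak 200 Hz "speech" tone and a strong
# 50 Hz "mechanical" hum, 800 samples at an assumed 8 kHz sample rate.
fs = 8000
n = 800
speech = [0.1 * math.sin(2 * math.pi * 200 * t / fs) for t in range(n)]
noise = [1.0 * math.sin(2 * math.pi * 50 * t / fs) for t in range(n)]

# Stand-in for the model output (assumption): an imperfect noise
# prediction at 90% of the true amplitude, with its sign inverted.
cancelling = [-0.9 * x for x in noise]

# Signal that is played back, and what reaches the ear once the
# ambient noise mixes with it.
played = superimpose(speech, cancelling)
at_ear = [p + x for p, x in zip(played, noise)]

# Noise that remains after the cancelling signal meets the ambient noise.
residual_noise = [x + c for x, c in zip(noise, cancelling)]

snr_before = snr_db(speech, noise)
snr_after = snr_db(speech, residual_noise)
improvement = snr_after - snr_before
print(f"SNR before: {snr_before:.1f} dB, after: {snr_after:.1f} dB")
```

In this toy setting the residual noise has one tenth of the original amplitude, so the SNR gain is 20 dB. Real predictions are far less accurate than this stand-in, and the figures reported in the abstract come from the authors' own tests, not from a sketch like this one.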

X. Wang: This work is supported by the National Natural Science Foundation of China (No. 61231015, 61671335); the National High Technology Research and Development Program of China (863 Program) (No. 2015AA016306); and the Hubei Province Technological Innovation Major Project (No. 2016AAA015).

Corresponding author

Correspondence to Xiaochen Wang.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Cheng, F., Wang, X., Gang, L., Tu, W., Wang, J. (2018). Speech Intelligibility Enhancement in Strong Mechanical Noise Based on Neural Networks. In: Zeng, B., Huang, Q., El Saddik, A., Li, H., Jiang, S., Fan, X. (eds) Advances in Multimedia Information Processing – PCM 2017. PCM 2017. Lecture Notes in Computer Science, vol 10736. Springer, Cham. https://doi.org/10.1007/978-3-319-77383-4_69

  • DOI: https://doi.org/10.1007/978-3-319-77383-4_69

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-77382-7

  • Online ISBN: 978-3-319-77383-4

  • eBook Packages: Computer Science (R0)
