Frame-by-Frame Speech Signal Processing and Recognition for FPGA Devices

Nakayama, Masashi; Shigekawa, Naoki; Yokouchi, Takashi; Ishimitsu, Shunsuke

doi:10.1007/978-3-319-47319-2_5

Part of the book series: Smart Sensors, Measurement and Instrumentation (SSMI, volume 22)

Abstract

This paper presents frame-by-frame speech signal processing and recognition for Field Programmable Gate Array (FPGA) devices, together with supporting experiments. Target applications include voice conversion systems, which must process and recognize speech in real time at every frame. To meet this speed requirement, the authors propose algorithms that use an FPGA as a hardware processor for Voice Activity Detection (VAD) and as a speech recognition decoder. Because the gate-circuit resources of FPGA devices are limited, the algorithms must be customized to fit on the device: VAD uses a 2nd-order autocorrelation function, and speech recognition uses Euclidean distance. The methods are implemented on an FPGA emulator, which is used to demonstrate VAD on speech and noise sections and to run a speech recognition experiment that discriminates Japanese vowels.
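The two building blocks named in the abstract can be illustrated with a minimal software sketch. It is not the authors' FPGA design: it assumes that "2nd-order autocorrelation" means the frame autocorrelation at lag 2, normalized by the lag-0 energy, and that recognition is a nearest-template search over a small feature vector. The function names, the 0.5 threshold, the formant-style features, and the template values are all hypothetical placeholders chosen for illustration.

```python
import math
import random


def autocorr(frame, lag):
    """Unnormalized autocorrelation of one frame at the given lag."""
    return sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))


def is_speech(frame, threshold=0.5):
    """Hypothetical VAD rule: lag-2 autocorrelation, normalized by the
    lag-0 energy, must exceed a threshold (0.5 is an assumed value)."""
    energy = autocorr(frame, 0)
    if energy == 0.0:
        return False  # an all-zero frame carries no speech
    return autocorr(frame, 2) / energy > threshold


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(features, templates):
    """Return the label whose template is closest to the features."""
    return min(templates, key=lambda label: euclidean(features, templates[label]))


# Illustrative placeholder templates (two formant-like values in Hz).
# These are NOT measurements from the chapter.
TEMPLATES = {
    "a": (800.0, 1300.0),
    "i": (300.0, 2300.0),
    "u": (350.0, 1400.0),
    "e": (500.0, 1900.0),
    "o": (500.0, 900.0),
}

FRAME_LEN = 160  # assumed frame length in samples

# A slowly varying tone stands in for a voiced frame.
voiced = [math.sin(2 * math.pi * n / 40) for n in range(FRAME_LEN)]

# Seeded uniform noise stands in for a non-speech frame.
rng = random.Random(0)
noise = [rng.uniform(-1.0, 1.0) for _ in range(FRAME_LEN)]

voiced_flag = is_speech(voiced)
noise_flag = is_speech(noise)
vowel = classify((320.0, 2200.0), TEMPLATES)
```

Both steps reduce to multiply-accumulate and compare operations on a single frame, which is the kind of workload that suits a resource-limited hardware implementation. A fixed-point design could also compare squared distances and drop the square root, since that does not change which template is nearest.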

Author information

Authors and Affiliations

Graduate School of Information Sciences, Hiroshima City University, 3-4-1 Ozuka-higashi, Asaminami-ku, Hiroshima, 731-3194, Japan
Masashi Nakayama & Shunsuke Ishimitsu
Graduate School of Engineering, University of Fukui, 3-9-1 Bunkyo, Fukui, 910-8507, Japan
Naoki Shigekawa
National Institute of Technology, Kagawa College, 551 Khoda, Takuma, Mitoyo, Kagawa, 769-1192, Japan
Takashi Yokouchi

Corresponding author

Correspondence to Masashi Nakayama.

Editor information

Editors and Affiliations

Instituto de Telecomunicações and ISCTE-IUL, Lisbon, Portugal
Octavian Adrian Postolache
Department of Engineering, Faculty of Science and Engineering, Macquarie University, Sydney, New South Wales, Australia
Subhas Chandra Mukhopadhyay
Institute of Fundamental Sciences, Massey University, Palmerston North, New Zealand
Krishanthi P. Jayasundera
Department of Electrical and Computer Engineering, University of Auckland, Auckland, New Zealand
Akshya K. Swain

About this chapter

Cite this chapter

Nakayama, M., Shigekawa, N., Yokouchi, T., Ishimitsu, S. (2017). Frame-by-Frame Speech Signal Processing and Recognition for FPGA Devices. In: Postolache, O., Mukhopadhyay, S., Jayasundera, K., Swain, A. (eds) Sensors for Everyday Life. Smart Sensors, Measurement and Instrumentation, vol 22. Springer, Cham. https://doi.org/10.1007/978-3-319-47319-2_5

DOI: https://doi.org/10.1007/978-3-319-47319-2_5
Published: 28 October 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-47318-5
Online ISBN: 978-3-319-47319-2
eBook Packages: Engineering, Engineering (R0)
