Federated Acoustic Model Optimization for Automatic Speech Recognition

Tan, Conghui; Jiang, Di; Mo, Huaxiao; Peng, Jinhua; Tong, Yongxin; Zhao, Weiwei; Chen, Chaotao; Lian, Rongzhong; Song, Yuanfeng; Xu, Qian

doi:10.1007/978-3-030-59419-0_54

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12114)

Included in the following conference series:

International Conference on Database Systems for Advanced Applications

Abstract

Traditional Automatic Speech Recognition (ASR) systems are usually trained on speech records centralized on the ASR vendor’s machines. However, with data regulations such as the General Data Protection Regulation (GDPR) coming into force, sensitive data such as speech records may no longer be used in such a centralized way. In this demonstration, we propose and present a method of federated acoustic model optimization to solve this problem. The demonstration not only shows the underlying working mechanisms of the proposed method but also provides an interface through which the user can customize its hyperparameters. With it, the audience can experience the effect of federated learning interactively, and we hope it will inspire more research on GDPR-compliant ASR technologies.

A video accompanying this paper is available at https://youtu.be/H29PUN-xFxM.

Author information

Authors and Affiliations

AI Group, WeBank Co., Ltd., Shenzhen, China
Conghui Tan, Di Jiang, Huaxiao Mo, Jinhua Peng, Weiwei Zhao, Chaotao Chen, Rongzhong Lian, Yuanfeng Song & Qian Xu
BDBC, SKLSDE Lab and IRI, Beihang University, Beijing, China
Yongxin Tong

Corresponding author

Correspondence to Conghui Tan.

Editor information

Editors and Affiliations

Dankook University, Yongin, Korea (Republic of)
Yunmook Nah
Peking University, Haidian, China
Bin Cui
Sungkyunkwan University, Suwon, Korea (Republic of)
Sang-Won Lee
Department of Systems Engineering and En, The Chinese University of Hong Kong, Hong Kong, Hong Kong
Jeffrey Xu Yu
Kangwon National University, Chunchon, Korea (Republic of)
Yang-Sae Moon
Korea Advanced Institute of Science and, Daejeon, Korea (Republic of)
Steven Euijong Whang

About this paper

Cite this paper

Tan, C. et al. (2020). Federated Acoustic Model Optimization for Automatic Speech Recognition. In: Nah, Y., Cui, B., Lee, SW., Yu, J.X., Moon, YS., Whang, S.E. (eds) Database Systems for Advanced Applications. DASFAA 2020. Lecture Notes in Computer Science, vol 12114. Springer, Cham. https://doi.org/10.1007/978-3-030-59419-0_54

DOI: https://doi.org/10.1007/978-3-030-59419-0_54
Published: 22 September 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-59418-3
Online ISBN: 978-3-030-59419-0
eBook Packages: Computer Science, Computer Science (R0)
