New Era for Robust Speech Recognition

Exploiting Deep Learning

  • Shinji Watanabe
  • Marc Delcroix
  • Florian Metze
  • John R. Hershey

Table of contents

  1. Front Matter
    Pages i-xvii
  2. Introduction

    1. Front Matter
      Pages 1-1
    2. Shinji Watanabe, Marc Delcroix, Florian Metze, John R. Hershey
      Pages 3-17
  3. Approaches to Robust Automatic Speech Recognition

    1. Front Matter
      Pages 19-19
    2. Marc Delcroix, Takuya Yoshioka, Nobutaka Ito, Atsunori Ogawa, Keisuke Kinoshita, Masakiyo Fujimoto et al.
      Pages 21-49
    3. Michael I. Mandel, Jon P. Barker
      Pages 51-77
    4. Xiong Xiao, Shinji Watanabe, Hakan Erdogan, Michael Mandel, Liang Lu, John R. Hershey et al.
      Pages 79-104
    5. Tara N. Sainath, Ron J. Weiss, Kevin W. Wilson, Arun Narayanan, Michiel Bacchiani, Bo Li et al.
      Pages 105-133
    6. John R. Hershey, Jonathan Le Roux, Shinji Watanabe, Scott Wisdom, Zhuo Chen, Yusuf Isik
      Pages 135-164
    7. Hakan Erdogan, John R. Hershey, Shinji Watanabe, Jonathan Le Roux
      Pages 165-186
    8. Vikramjit Mitra, Horacio Franco, Richard M. Stern, Julien van Hout, Luciana Ferrer, Martin Graciarena et al.
      Pages 187-217
    9. Khe Chai Sim, Yanmin Qian, Gautam Mantena, Lahiru Samarakoon, Souvik Kundu, Tian Tan
      Pages 219-243
    10. Martin Karafiát, Karel Veselý, Kateřina Žmolíková, Marc Delcroix, Shinji Watanabe, Lukáš Burget et al.
      Pages 245-260
    11. Yu Zhang, Dong Yu, Guoguo Chen
      Pages 261-279
    12. Guoguo Chen, Yu Zhang, Dong Yu
      Pages 281-297
    13. Yajie Miao, Florian Metze
      Pages 299-323
  4. Resources

    1. Front Matter
      Pages 325-325
    2. Jon P. Barker, Ricard Marxer, Emmanuel Vincent, Shinji Watanabe
      Pages 327-344
    3. Keisuke Kinoshita, Marc Delcroix, Sharon Gannot, Emanuël A. P. Habets, Reinhold Haeb-Umbach, Walter Kellermann et al.
      Pages 345-354
    4. Steve Renals, Pawel Swietojanski
      Pages 355-368
    5. Shinji Watanabe, Takaaki Hori, Yajie Miao, Marc Delcroix, Florian Metze, John R. Hershey
      Pages 369-382
  5. Applications

    1. Front Matter
      Pages 383-383
    2. Michiel Bacchiani, Françoise Beaufays, Alexander Gruenstein, Pedro Moreno, Johan Schalkwyk, Trevor Strohman et al.
      Pages 385-399
    3. Yifan Gong, Yan Huang, Kshitiz Kumar, Jinyu Li, Chaojun Liu, Guoli Ye et al.
      Pages 401-417
    4. Yuuki Tachioka, Toshiyuki Hanazawa, Tomohiro Narita, Jun Ishii
      Pages 419-429
  6. Back Matter
    Pages 431-436

About this book

Introduction

This book covers the state of the art in deep neural-network-based methods for noise robustness in distant speech recognition applications. It provides insights into, and detailed descriptions of, new concepts and key technologies in the field, including novel architectures for speech enhancement, microphone arrays, robust features, acoustic model adaptation, training data augmentation, and training criteria. The contributed chapters also describe real-world applications, benchmark tools, and datasets widely used in the field.

This book is intended for researchers and practitioners working in the field of speech processing and recognition who are interested in the latest deep learning techniques for noise robustness. It will also be of interest to graduate students in electrical engineering or computer science, who will find it a useful guide to this field of research.


Keywords

  • Speech Recognition
  • Speech Processing
  • Natural Language Processing (NLP)
  • Automatic Speech Recognition (ASR)
  • Signal Processing
  • Deep Learning
  • Noise Robustness
  • Neural Networks (NNs)
  • Distant Speech
  • Acoustic Model Adaptation

Editors and affiliations

  • Shinji Watanabe (1)
  • Marc Delcroix (2)
  • Florian Metze (3)
  • John R. Hershey (4)

  1. Mitsubishi Electric Research Laboratories (MERL), Cambridge, USA
  2. NTT Communication Science Laboratories, NTT Corporation, Kyoto, Japan
  3. Language Technologies Institute, Carnegie Mellon University, Pittsburgh, USA
  4. Mitsubishi Electric Research Laboratories (MERL), Cambridge, USA

Bibliographic information

  • DOI https://doi.org/10.1007/978-3-319-64680-0
  • Copyright Information © Springer International Publishing AG 2017
  • Publisher Name Springer, Cham
  • eBook Packages Computer Science
  • Print ISBN 978-3-319-64679-4
  • Online ISBN 978-3-319-64680-0