Speech Separation by Humans and Machines

  • Pierre Divenyi

Table of contents

  1. Front Matter
    Pages i-xxiv
  2. Elyse S. Sussman
    Pages 5-12
  3. Claude Alain
    Pages 13-30
  4. Bhiksha Raj, Michael Seltzer, Manuel Jesus Reyes-Gomez
    Pages 65-82
  5. Toshio Irino, Roy D. Patterson, Hideki Kawahara
    Pages 155-165
  6. Malcolm Slaney
    Pages 199-211
  7. Nat Durlach, Steve Colburn, Gerald Kidd, Chris Mason, Barbara Shinn-Cunningham, Tanya Arbogast et al.
    Pages 221-243
  8. Alain de Cheveigné
    Pages 245-259
  9. Ziyou Xiong, Thomas S. Huang
    Pages 283-293
  10. Daniel P. W. Ellis
    Pages 295-304
  11. Back Matter
    Pages 315-319

About this book


The "cocktail-party effect" - the ability to focus on one voice in a sea of noises - is a highly sophisticated skill that is usually effortless to listeners but largely impossible for machines. Investigating and unraveling this capacity spans numerous fields including psychology, physiology, engineering, and computer science. All these perspectives are brought together in this volume which, for the first time, provides a comprehensive and authoritative discussion of our understanding of how humans separate speech, and the state of the art in approaching these abilities with machines.

This material is drawn from an October 2003 workshop on speech separation, sponsored by the National Science Foundation. Leading authorities from around the world were invited to present their perspectives and discuss the points of contact with other perspectives. The result is a clear and uniform overview of this problem, and a primer in what is emerging as an important, active and successful area for the development of new techniques and applications.

Chapters include historical and current summaries of relevant research in behavioral science, neuroscience and engineering, along with more in-depth descriptions of several of the most exciting current research projects and techniques. These include the latest experimental results illuminating how listeners organize the mixtures of sound they hear, and the most powerful and successful signal processing and machine learning techniques for separating real-world recordings of sound mixtures captured by one or more microphones.

There is no comparable collection that seeks to bring together the underlying experimental science and the wide variety of technical approaches to give an integrated picture of the problem of speech separation and its solutions. Those specializing in speech science, hearing science, neuroscience, or computer science, and engineers working on applications such as automatic speech recognition, cochlear implants, hands-free telephones, sound recording, and multimedia indexing and retrieval, will find Speech Separation by Humans and Machines a useful and inspiring read.


Information, Neuroscience, cognition, computer science, development, multimedia, quality, science, speech processing, speech recognition

Editors and affiliations

  • Pierre Divenyi
    • East Bay Institute for Research and Education, USA

Bibliographic information

  • Copyright Information: Springer Science + Business Media, Inc. 2005
  • Publisher Name: Springer, Boston, MA
  • eBook Packages: Engineering
  • Print ISBN: 978-1-4020-8001-2
  • Online ISBN: 978-0-387-22794-8