Audio plays an important role in our daily lives. From speech to music, from FM radio to podcast services, from lectures to audiobooks, audio is ubiquitous. Through audio, we sense the environment, acquire knowledge, exchange information, and enjoy melodies. Today, with the ease of audio creation, archiving, and distribution, the amount of audio data is unprecedented and far exceeds what any individual can consume. The value of audio data depends not only on its intrinsic merit but also on how easily it can be accessed. Very often, we want to search for a piece of audio that we have heard before, for example a specific song or a conference recording, or one that we are not yet aware of, for example a speech by President Kennedy or a piece of country music. Metadata, including the title of the audio clip, the name of the producer, the category, and a short summary, is extremely useful for searching audio. In many cases, however, the associated metadata is not sufficient, and we need an automatic mechanism to discover the audio content itself. Such a mechanism can also enhance metadata-based audio query approaches, because it can pinpoint the segment of interest within an audio clip more precisely. Content-based audio indexing is a promising approach. With the support of content-based audio indexing and retrieval services, locating the desired audio information within a vast amount of audio data is no longer a daunting task.
Keywords: Speech Recognition, Audio Signal, Automatic Speech Recognition, Speaker Recognition, Speaker Verification