International Journal of Speech Technology, Volume 16, Issue 2, pp 161–169

An efficient lattice-based phonetic search method for accelerating keyword spotting in large speech databases

  • Ella Tetariy
  • Michal Gishri
  • Baruch Har-Lev
  • Vered Aharonson
  • Ami Moyal
Abstract

This paper describes an algorithm for reducing computational complexity in phonetic-search keyword spotting (KWS). This reduction is particularly important when searching for keywords in very large speech databases where a rapid response time is required. The suggested algorithm consists of an anchor-based phoneme search that reduces the search space by generating hypotheses only around phonemes recognized with high reliability. Three databases were used for the evaluation: IBM Voicemail I and Voicemail II, consisting of long spontaneous utterances, and the Wall Street Journal portion of the MACROPHONE database, consisting of read speech utterances. The results indicated a reduction of nearly 90% in the computational complexity of the search while improving the false alarm rate, with only a small decrease in the detection rate in both databases. Search space reduction, as well as performance gain or loss, can be controlled according to user preferences via the suggested algorithm's parameters and thresholds.
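The anchor idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single best phoneme string rather than a lattice, uses a plain aligned mismatch count in place of a phonetic distance measure, and every name in it (`Phone`, `anchor_search`, `anchor_conf`, `max_mismatch`) is hypothetical. It only shows how restricting hypotheses to positions implied by high-confidence phonemes shrinks the number of candidate alignments that have to be scored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phone:
    label: str
    confidence: float  # assumed recognizer confidence in [0, 1]


def mismatches(window, keyword):
    """Count aligned positions where the recognized label differs from the keyword."""
    return sum(1 for p, k in zip(window, keyword) if p.label != k)


def exhaustive_search(phones, keyword, max_mismatch):
    """Baseline: score a hypothesis at every possible start position."""
    m = len(keyword)
    starts = range(len(phones) - m + 1)
    hits = [s for s in starts if mismatches(phones[s:s + m], keyword) <= max_mismatch]
    return hits, len(starts)


def anchor_search(phones, keyword, max_mismatch, anchor_conf):
    """Score hypotheses only at start positions implied by reliable phonemes (anchors)."""
    n, m = len(phones), len(keyword)
    candidates = set()
    for i, p in enumerate(phones):
        if p.confidence < anchor_conf:
            continue  # not reliable enough to serve as an anchor
        for j, k in enumerate(keyword):
            start = i - j  # keyword start if phoneme i is keyword position j
            if p.label == k and 0 <= start <= n - m:
                candidates.add(start)
    hits = sorted(s for s in candidates
                  if mismatches(phones[s:s + m], keyword) <= max_mismatch)
    return hits, len(candidates)


# Illustrative data only (labels and confidences are made up).
labels = ["s", "ih", "l", "ah", "k", "iy", "w", "er", "d", "z", "ah"]
confs = [0.4, 0.3, 0.5, 0.2, 0.9, 0.6, 0.95, 0.5, 0.7, 0.3, 0.4]
phones = [Phone(l, c) for l, c in zip(labels, confs)]
keyword = ["k", "iy", "w", "er", "d"]

print(exhaustive_search(phones, keyword, max_mismatch=1))               # ([4], 7)
print(anchor_search(phones, keyword, max_mismatch=1, anchor_conf=0.8))  # ([4], 1)
```

In this toy case both searches find the keyword at position 4, but the anchored search scores one hypothesis instead of seven. Raising `anchor_conf` too far (for example to 0.99) leaves no anchors and the keyword is missed, which mirrors the trade-off between search space reduction and detection rate that the abstract says is controlled through parameters and thresholds.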

Keywords

Keyword spotting · Phonetic search · Anchor-based search · Searching large speech databases · Efficient phonetic search

Copyright information

© Springer Science+Business Media, LLC 2012

Authors and Affiliations

  • Ella Tetariy (1)
  • Michal Gishri (1)
  • Baruch Har-Lev (1)
  • Vered Aharonson (1)
  • Ami Moyal (1)

  1. ACLP (Afeka Center for Language Processing), Afeka Academic College of Engineering, Tel Aviv, Israel
