Visual Speaker Identification with Spatiotemporal Directional Features

  • Conference paper
Image Analysis and Recognition (ICIAR 2013)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 7950)

Abstract

This paper proposes a novel local spatiotemporal directional descriptor for speaker identification from mouth movements. The descriptor codes directional local binary pattern features in three orthogonal planes. In addition to the sign features, magnitude information is used as a weight for bins that share the same sign value, which improves discriminative ability. Decorrelation is then applied to remove redundancy among the features. Experimental results on the challenging XM2VTS database show the effectiveness of the proposed representation for this problem.
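
The abstract does not give the exact coding scheme, so the following is only a rough sketch of the general idea, not the authors' descriptor. It assumes ordinary 8-neighbour local binary pattern sign codes (rather than the paper's directional coding) computed on the XY, XT and YT planes of a small video volume, and it weights each histogram bin by the mean absolute difference of the pixel's neighbourhood. The decorrelation step is omitted. The names `plane_histogram`, `descriptor` and `OFFSETS` are hypothetical.

```python
from typing import List

Volume = List[List[List[float]]]  # indexed as volume[t][y][x]

# Eight neighbour offsets (du, dv) at radius 1 on a 2-D plane (assumed layout).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def plane_histogram(volume: Volume, plane: str, weighted: bool = True) -> List[float]:
    """256-bin sign-code histogram over one plane: 'xy', 'xt' or 'yt'."""
    T, H, W = len(volume), len(volume[0]), len(volume[0][0])
    hist = [0.0] * 256
    # Skip border voxels so every neighbour index stays in range.
    for t in range(1, T - 1):
        for y in range(1, H - 1):
            for x in range(1, W - 1):
                center = volume[t][y][x]
                code, mag = 0, 0.0
                for bit, (du, dv) in enumerate(OFFSETS):
                    if plane == "xy":
                        n = volume[t][y + du][x + dv]
                    elif plane == "xt":
                        n = volume[t + du][y][x + dv]
                    else:  # "yt"
                        n = volume[t + du][y + dv][x]
                    diff = n - center
                    if diff >= 0:          # sign feature: one bit per neighbour
                        code |= 1 << bit
                    mag += abs(diff)       # magnitude information
                # Assumption: the bin weight is the mean absolute difference.
                hist[code] += mag / len(OFFSETS) if weighted else 1.0
    return hist


def descriptor(volume: Volume, weighted: bool = True) -> List[float]:
    """Concatenate the normalised histograms of the three orthogonal planes."""
    feats: List[float] = []
    for plane in ("xy", "xt", "yt"):
        h = plane_histogram(volume, plane, weighted)
        total = sum(h)
        feats.extend(v / total if total > 0 else 0.0 for v in h)
    return feats
```

Calling `descriptor(volume)` on a mouth-region volume of shape T x H x W returns a 768-dimensional vector (3 planes x 256 bins). With `weighted=False` the same function yields plain occurrence histograms, which makes it easy to compare the two variants on the same data.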

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zhao, G., Pietikäinen, M. (2013). Visual Speaker Identification with Spatiotemporal Directional Features. In: Kamel, M., Campilho, A. (eds) Image Analysis and Recognition. ICIAR 2013. Lecture Notes in Computer Science, vol 7950. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39094-4_1

  • DOI: https://doi.org/10.1007/978-3-642-39094-4_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-39093-7

  • Online ISBN: 978-3-642-39094-4

  • eBook Packages: Computer Science, Computer Science (R0)
