
One-Shot-Learning for Visual Lip-Based Biometric Authentication

  • Conference paper
  • Advances in Visual Computing (ISVC 2019)

Abstract

Lip-based biometric authentication is the process of verifying an individual’s identity based on visual information taken from the lips whilst speaking. To date, research in this area has relied on more traditional approaches, with inconsistent results that are difficult to compare. This work aims to push the field forward through the application of deep learning. A deep artificial neural network using spatiotemporal convolutional and bidirectional gated recurrent unit layers is trained end-to-end. For the first time, one-shot learning is applied to lip-based biometric authentication by implementing a siamese network architecture, meaning the model needs only a single prior example in order to authenticate new users. This approach sets a new state-of-the-art for lip-based biometric authentication on the XM2VTS dataset and Lausanne protocol, with an equal error rate of 0.93% on the evaluation set and a false acceptance rate of 1.07% at a 1% false rejection rate.
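The abstract reports an equal error rate (EER) and a false acceptance rate (FAR) at a fixed false rejection rate (FRR). The paper's own evaluation code is not shown here, but these metrics are conventionally computed by sweeping a decision threshold over the similarity scores a verification model assigns to genuine and impostor pairs. The sketch below illustrates that standard computation; the function name and the synthetic score lists are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def eer_and_far_at_frr(genuine, impostor, target_frr=0.01):
    """Compute the equal error rate (EER) and the false acceptance
    rate (FAR) at a target false rejection rate (FRR) by sweeping a
    decision threshold over similarity scores (higher = more similar).
    Illustrative sketch only; not the paper's evaluation code."""
    genuine = np.asarray(genuine, dtype=float)
    impostor = np.asarray(impostor, dtype=float)
    # Candidate thresholds: every observed score.
    thresholds = np.sort(np.concatenate([genuine, impostor]))
    # FRR: fraction of genuine pairs scored below the threshold (wrongly rejected).
    frr = np.array([(genuine < t).mean() for t in thresholds])
    # FAR: fraction of impostor pairs scored at or above the threshold (wrongly accepted).
    far = np.array([(impostor >= t).mean() for t in thresholds])
    # EER: operating point where FAR and FRR are closest.
    i = int(np.argmin(np.abs(far - frr)))
    eer = (far[i] + frr[i]) / 2.0
    # FAR at the largest threshold whose FRR does not exceed the target
    # (frr is non-decreasing in the threshold, so searchsorted applies).
    j = max(int(np.searchsorted(frr, target_frr, side="right")) - 1, 0)
    return eer, far[j]
```

On perfectly separated scores both metrics are zero; on overlapping score distributions the EER rises toward the crossover point of the two error curves, which is how figures such as "0.93% EER" and "1.07% FAR at 1% FRR" are read off.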


References

  1. Ahmed, E., Jones, M., Marks, T.K.: An improved deep learning architecture for person re-identification. In: 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3908–3916, June 2015. https://doi.org/10.1109/CVPR.2015.7299016

  2. Assael, Y.M., Shillingford, B., Whiteson, S., de Freitas, N.: Lipnet: sentence-level lipreading. CoRR abs/1611.01599 (2016). http://arxiv.org/abs/1611.01599

  3. Brand, J.: Visual speech for speaker recognition and robust face detection. Ph.D. thesis, University of Wales, Swansea, UK (2001)

  4. Cetingul, H.E., Yemez, Y., Erzin, E., Tekalp, A.M.: Discriminative analysis of lip motion features for speaker identification and speech-reading. Trans. Img. Proc. 15(10), 2879–2891 (2006). https://doi.org/10.1109/TIP.2006.877528

  5. Chung, J., Gulcehre, C., Cho, K., Bengio, Y.: Empirical evaluation of gated recurrent neural networks on sequence modeling. CoRR abs/1412.3555 (2014)

  6. Faraj, M., Bigun, J.: Motion features from lip movement for person authentication. In: 2006 18th International Conference on Pattern Recognition, ICPR 2006, vol. 3, pp. 1059–1062 (2006). https://doi.org/10.1109/ICPR.2006.814

  7. Graves, A., Schmidhuber, J.: Framewise phoneme classification with bidirectional LSTM networks. In: Proceedings 2005 IEEE International Joint Conference on Neural Networks, vol. 4, pp. 2047–2052, July 2005. https://doi.org/10.1109/IJCNN.2005.1556215

  8. Kazemi, V., Sullivan, J.: One millisecond face alignment with an ensemble of regression trees. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1867–1874, June 2014

  9. Kittler, J., Li, Y.P., Matas, J., Sánchez, M.U.R.: Lip-shape dependent face verification. In: Bigün, J., Chollet, G., Borgefors, G. (eds.) AVBPA 1997. LNCS, vol. 1206, pp. 61–68. Springer, Heidelberg (1997). https://doi.org/10.1007/BFb0015980

  10. Koch, G., Zemel, R., Salakhutdinov, R.: Siamese neural networks for one-shot image recognition. In: ICML deep learning workshop, vol. 2 (2015)

  11. Lu, L., et al.: Lip reading-based user authentication through acoustic sensing on smartphones. IEEE/ACM Trans. Networking 27(1), 447–460 (2019). https://doi.org/10.1109/TNET.2019.2891733

  12. Lu, Z., Wu, X., He, R.: Person identification from lip texture analysis. In: 2016 IEEE International Conference on Digital Signal Processing (DSP), pp. 472–476, October 2016. https://doi.org/10.1109/ICDSP.2016.7868602

  13. Lucey, S.: An evaluation of visual speech features for the tasks of speech and speaker recognition. In: Kittler, J., Nixon, M.S. (eds.) AVBPA 2003. LNCS, vol. 2688, pp. 260–267. Springer, Heidelberg (2003). https://doi.org/10.1007/3-540-44887-X_31

  14. Luettin, J., Maître, G.: Evaluation protocol for the extended M2VTS database (XM2VTSDB). Idiap-Com Idiap-Com-05-1998, IDIAP (1998)

  15. Messer, K., Matas, J., Kittler, J., Jonsson, K.: XM2VTSDB: the extended M2VTS database. In: Second International Conference on Audio and Video-based Biometric Person Authentication, pp. 72–77 (1999)

  16. Morikawa, S., Ito, S., Ito, M., Fukumi, M.: Personal authentication by lips EMG using dry electrode and CNN. In: 2018 IEEE International Conference on Internet of Things and Intelligence System (IOTAIS), pp. 180–183, November 2018. https://doi.org/10.1109/IOTAIS.2018.8600859

  17. Nakata, T., Kashima, M., Sato, K., Watanabe, M.: Lip-sync personal authentication system using movement feature of lip. In: 2013 International Conference on Biometrics and Kansei Engineering (ICBAKE), pp. 273–276, July 2013. https://doi.org/10.1109/ICBAKE.2013.53

  18. Sanchez, M.U.R.: Aspects of facial biometrics for verification of personal identity. Ph.D. thesis, University of Surrey, Guildford, UK (2000)

  19. Schroff, F., Kalenichenko, D., Philbin, J.: Facenet: a unified embedding for face recognition and clustering. CoRR abs/1503.03832 (2015). http://arxiv.org/abs/1503.03832

  20. Shang, D., Zhang, X., Xu, X.: Face and lip-reading authentication system based on android smart phones. In: 2018 Chinese Automation Congress (CAC), pp. 4178–4182, November 2018. https://doi.org/10.1109/CAC.2018.8623298

  21. Shi, X., Wang, S., Lai, J.: Visual speaker authentication by ensemble learning over static and dynamic lip details. In: 2016 IEEE International Conference on Image Processing (ICIP), pp. 3942–3946, September 2016. https://doi.org/10.1109/ICIP.2016.7533099

  22. Taigman, Y., Yang, M., Ranzato, M., Wolf, L.: Deepface: closing the gap to human-level performance in face verification. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2014

  23. Wright, C., Stewart, D., Miller, P., Campbell-West, F.: Investigation into DCT feature selection for visual lip-based biometric authentication. In: Dahyot, R., Lacey, G., Dawson-Howe, K., Pitié, F., Moloney, D. (eds.) Irish Machine Vision & Image Processing Conference Proceedings 2015, pp. 11–18. Irish Pattern Recognition & Classification Society, Dublin, Ireland (2015). Winner of the Best Student Paper Award

Author information

Corresponding author

Correspondence to Carrie Wright.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Wright, C., Stewart, D. (2019). One-Shot-Learning for Visual Lip-Based Biometric Authentication. In: Bebis, G., et al. (eds.) Advances in Visual Computing. ISVC 2019. Lecture Notes in Computer Science, vol 11844. Springer, Cham. https://doi.org/10.1007/978-3-030-33720-9_31

  • DOI: https://doi.org/10.1007/978-3-030-33720-9_31

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-33719-3

  • Online ISBN: 978-3-030-33720-9

  • eBook Packages: Computer Science; Computer Science (R0)
