Abstract
Lip-based biometric authentication verifies an individual's identity from visual information captured from the lips during speech. To date, research in this area has relied on traditional approaches and has produced inconsistent results that are difficult to compare. This work aims to push the field forward through the application of deep learning. A deep artificial neural network using spatiotemporal convolutional and bidirectional gated recurrent unit (GRU) layers is trained end-to-end. For the first time, one-shot learning is applied to lip-based biometric authentication by implementing a Siamese network architecture, so the model needs only a single prior example to authenticate new users. This approach sets a new state of the art for lip-based biometric authentication on the XM2VTS dataset under the Lausanne protocol, with an equal error rate of 0.93% on the evaluation set and a false acceptance rate of 1.07% at a 1% false rejection rate.
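To make the reported metrics concrete: the equal error rate (EER) is the operating point at which the false acceptance rate (FAR) equals the false rejection rate (FRR), found by sweeping a decision threshold over the match scores. The sketch below is a minimal, generic illustration of that computation on hypothetical distance scores (smaller distance = better match); it is not the paper's code and assumes nothing about its model outputs.

```python
def far_frr(genuine, impostor, threshold):
    """FAR and FRR at one threshold over distance scores (lower = closer match)."""
    # FAR: fraction of impostor attempts wrongly accepted (distance below threshold).
    far = sum(s < threshold for s in impostor) / len(impostor)
    # FRR: fraction of genuine attempts wrongly rejected (distance at/above threshold).
    frr = sum(s >= threshold for s in genuine) / len(genuine)
    return far, frr

def equal_error_rate(genuine, impostor):
    """Sweep candidate thresholds; the EER is where FAR and FRR are closest."""
    best_gap, best_eer = None, None
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = far_frr(genuine, impostor, t)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer

# Hypothetical, perfectly separable scores: EER is 0.
print(equal_error_rate([0.1, 0.2, 0.3], [0.7, 0.8, 0.9]))  # 0.0
```

In the same way, the paper's "FAR at 1% FRR" figure corresponds to reading off `far` from `far_frr` at whichever threshold yields a 1% false rejection rate.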
Copyright information
© 2019 Springer Nature Switzerland AG
Cite this paper
Wright, C., Stewart, D. (2019). One-Shot-Learning for Visual Lip-Based Biometric Authentication. In: Bebis, G., et al. (eds.) Advances in Visual Computing. ISVC 2019. Lecture Notes in Computer Science, vol. 11844. Springer, Cham. https://doi.org/10.1007/978-3-030-33720-9_31
Print ISBN: 978-3-030-33719-3
Online ISBN: 978-3-030-33720-9