Scalable Semi-Supervised Clustering for Face Recognition with Insufficient Labelled Samples

  • APPLICATION PROBLEMS
Pattern Recognition and Image Analysis

Abstract

Face recognition is effortless for humans, but it remains computationally challenging because it is difficult to build a computational model that recognizes faces. The task becomes harder still when only a few labeled examples of the face to be recognized are available. Moreover, face databases are growing rapidly, and most efficient face recognition systems fail to find similar faces in the database. This paper addresses the problem of performing effective face recognition when very few labeled examples of the target face are available. We treat the problem as semi-supervised classification based on the available labeled data. We use the holistic approach to face recognition, in which the entire face is taken into account, and represent facial images with a probabilistic Gaussian mixture model (GMM). We also propose an inexpensive scaling technique for applying the algorithm to large databases. The proposed method is evaluated experimentally on benchmark datasets. The results show the effectiveness of using semi-supervised learning to aid recognition.
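
The abstract does not spell out the clustering procedure, so the following is only a minimal sketch of the general idea of letting a few labeled faces guide the grouping of many unlabeled ones. It uses seeded k-means, a generic semi-supervised clustering technique swapped in for illustration, and it assumes each face has already been reduced to a fixed-length feature vector rather than to the GMM representation used in the paper. The function name `seeded_kmeans` and the toy identities are hypothetical.

```python
from math import dist


def seeded_kmeans(labeled, unlabeled, iterations=10):
    """Cluster `unlabeled` vectors using a few labeled seeds per identity.

    labeled:   dict mapping identity -> list of feature vectors (the seeds)
    unlabeled: list of feature vectors with unknown identity
    Returns a list with the predicted identity of each unlabeled vector.

    Illustrative assumption: plain Euclidean distance on feature vectors,
    not the GMM-based image representation described in the abstract.
    """
    def mean(vectors):
        return [sum(column) / len(vectors) for column in zip(*vectors)]

    # Initialise one centroid per identity from its labeled seeds.
    centroids = {identity: mean(seeds) for identity, seeds in labeled.items()}
    assignment = []
    for _ in range(iterations):
        # Assign each unlabeled vector to the nearest centroid.
        new_assignment = [
            min(centroids, key=lambda identity: dist(x, centroids[identity]))
            for x in unlabeled
        ]
        if new_assignment == assignment:
            break  # assignments stopped changing
        assignment = new_assignment
        # Recompute centroids; seeds always stay in their own cluster.
        for identity, seeds in labeled.items():
            members = seeds + [
                x for x, a in zip(unlabeled, assignment) if a == identity
            ]
            centroids[identity] = mean(members)
    return assignment
```

With one seed per identity, for example `{"alice": [[0.0, 0.0]], "bob": [[10.0, 10.0]]}`, each unlabeled vector is pulled toward the identity whose seed it lies closest to, and the centroids are then refined using the newly assigned vectors. Keeping the seeds fixed in their clusters is what makes the scheme semi-supervised rather than purely unsupervised.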


Author information

Correspondence to S. Nish Chandran or Durgaprasad Gangodkar.

Ethics declarations

COMPLIANCE WITH ETHICAL STANDARDS

This article is a completely original work of its authors; it has not been published before and will not be sent to other publications until the PRIA Editorial Board decides not to accept it for publication.

Conflict of Interest

The authors declare that they have no conflicts of interest.

Additional information

Durgaprasad Gangodkar received the B.E. degree in electronics and communication engineering from Karnatak University, Dharwad, India; the M.Tech. degree in computer network engineering from Visvesvaraya Technological University, Belgaum, Karnataka, India; and the Ph.D. degree in electronics and computer engineering from the Indian Institute of Technology (IIT), Roorkee, India. For around six years, he was with Siemens Ltd., Mumbai, India, where he worked on projects related to industrial automation, including engineering and software development for computerized process plants using distributed control systems and real-time operating systems. He has been teaching for about nine years. He is currently with the Department of Computer Science and Engineering, Graphic Era University, Dehradun, India. He is the author or a co-author of papers published in IEEE Transactions. His research interests include high-performance computing, computer vision, video analytics, and mobile agents.

Nisha Chandran S. received the Master of Computer Applications (MCA) degree from Amrita University, Coimbatore, India, and the M.Tech. and Ph.D. degrees in computer science and engineering from Graphic Era University, Dehradun, India. She is currently with the School of Computing, Graphic Era Hill University. She is the author or a co-author of research papers in journals and conference proceedings, many of them published by Springer or Elsevier. Her research interests include image and video processing, content-based retrieval, pattern recognition, machine learning, and high-performance computing.

Cite this article

Nish Chandran, S., Durgaprasad Gangodkar. Scalable Semi-Supervised Clustering for Face Recognition with Insufficient Labelled Samples. Pattern Recognit. Image Anal. 32, 373–383 (2022). https://doi.org/10.1134/S1054661822020055

  • DOI: https://doi.org/10.1134/S1054661822020055
