Sparse Illumination Learning and Transfer for Single-Sample Face Recognition with Image Corruption and Misalignment

International Journal of Computer Vision

Abstract

Single-sample face recognition is one of the most challenging problems in face recognition. We propose a novel algorithm to address this problem based on a sparse-representation-based classification (SRC) framework. The new algorithm is robust to image misalignment and pixel corruption, and is able to reduce the required gallery images to one sample per class. To compensate for the missing illumination information traditionally provided by multiple gallery images, a sparse illumination learning and transfer (SILT) technique is introduced. The illumination in SILT is learned by fitting illumination examples of auxiliary face images from one or more additional subjects with a sparsely-used illumination dictionary. By enforcing a sparse representation of the query image in the illumination dictionary, SILT can effectively recover and transfer the illumination and pose information from the alignment stage to the recognition stage. Our extensive experiments demonstrate that the new algorithms significantly outperform the state of the art in the single-sample regime, and with fewer restrictions. In particular, the single-sample face alignment accuracy is comparable to that of the well-known Deformable SRC algorithm using multiple gallery images per class. Furthermore, the face recognition accuracy exceeds those of the SRC and Extended SRC algorithms using hand-labeled alignment initialization.

Notes

  1. In this paper, we use the Viola-Jones face detector to initialize the face image location. As a result, we do not consider scenarios where the face may undergo a large 3D transformation or a large expression change. These more severe conditions can be addressed in the face detection stage using more sophisticated face models, as we previously mentioned.

  2. In our previous work (Zhuang et al. 2013), this simple extension was in fact used as the solution to transfer both the alignment and illumination information from the alignment stage to the recognition stage. However, the assumption was valid because the illumination dictionary used in Zhuang et al. (2013) was constructed by concatenating the auxiliary images themselves, namely, \(D\) in this paper. Therefore, the problem of warping a learned dictionary was mitigated.

  3. The training illuminations are {0,1,7,13,14,16,18} in Multi-PIE Session 1.

  4. The implementation of SVDL was provided by their authors at: http://www4.comp.polyu.edu.hk/~cslzhang/code/SVDL.zip.

References

  • Aharon, M., Elad, M., & Bruckstein, A. (2006). K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation. IEEE Transactions on Signal Processing, 54(11), 4311–4322.

  • Basri, R., & Jacobs, D. (2003). Lambertian reflectance and linear subspaces. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(2), 218–233.

  • Belhumeur, P., Hespanha, J., & Kriegman, D. (1997). Eigenfaces vs. fisherfaces: Recognition using class specific linear projection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(7), 711–720.

  • Boyd, S., Parikh, N., Chu, E., Peleato, B., & Eckstein, J. (2011). Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends in Machine Learning, 3(1), 1–122.

  • Chen, X., Chen, M., Jin, X., & Zhao, Q. (2011). Face illumination transfer through edge-preserving filters. In Proceedings of the IEEE international conference on computer vision and pattern recognition.

  • Cootes, T., Edwards, G., & Taylor, C. (1998). Active appearance models. In Proceedings of the European conference on computer vision.

  • Cootes, T., Taylor, C., & Graham, J. (1995). Active shape models—Their training and application. Computer Vision and Image Understanding, 61, 38–59.

  • Deng, W., Hu, J., & Guo, J. (2012). Extended SRC: Undersampled face recognition via intraclass variant dictionary. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34, 1864–1870.

  • Do, C., & Ng, A. (2005). Transfer learning for text classification. In Proceedings of NIPS.

  • Elhamifar, E., & Vidal, R. (2012). Block-sparse recovery via convex optimization. IEEE Transactions on Signal Processing, 60, 4094–4107.

  • Gabay, D., & Mercier, B. (1976). A dual algorithm for the solution of nonlinear variational problems via finite-element approximations. Computers and Mathematics with Applications, 2, 17–40.

  • Ganesh, A., Wagner, A., Wright, J., Yang, A., Zhou, Z., & Ma, Y. (2011). Face recognition by sparse representation. In Compressed Sensing: Theory and Applications. Cambridge University Press.

  • Gross, R., Matthews, I., Cohn, J., Kanade, T., & Baker, S. (2008). Multi-PIE. In Proceedings of the eighth IEEE international conference on automatic face and gesture recognition.

  • Gu, L., & Kanade, T. (2008). A generative shape regularization model for robust face alignment. In Proceedings of the European conference on computer vision.

  • Hager, G., & Belhumeur, P. (1998). Efficient region tracking with parametric models of geometry and illumination. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(10), 1025–1039.

  • Ho, J., Yang, M., Lim, J., Lee, K., & Kriegman, D. (2003). Clustering appearances of objects under varying illumination conditions. In Proceedings of the IEEE international conference on computer vision and pattern recognition.

  • Horn, R. A., & Johnson, C. R. (1985). Matrix analysis. New York: Cambridge University Press.

  • Huang, J., Huang, X., & Metaxas, D. (2008). Simultaneous image transformation and sparse representation recovery. In Proceedings of the IEEE international conference on computer vision and pattern recognition.

  • Lampert, C., Nickisch, H., & Harmeling, S. (2009). Learning to detect unseen object classes by between-class attribute transfer. In Proceedings of the IEEE international conference on computer vision and pattern recognition.

  • Lee, K., Ho, J., & Kriegman, D. (2005). Acquiring linear subspaces for face recognition under variable lighting. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(5), 684–698.

  • Liang, L., Xiao, R., Wen, F., & Sun, J. (2008). Face alignment via component-based discriminative search. In Proceedings of the European conference on computer vision.

  • Lucas, B., & Kanade, T. (1981). An iterative image registration technique with an application to stereo vision. In Proceedings of international joint conference on artificial intelligence.

  • Peers, P., Tamura, N., Matusik, W., & Debevec, P. (2007). Post-production facial performance relighting using reflectance transfer. In Proceedings of ACM SIGGRAPH.

  • Quattoni, A., Collins, M., & Darrell, T. (2008). Transfer learning for image classification with sparse prototype representations. In Proceedings of the IEEE international conference on computer vision and pattern recognition.

  • Saragih, J., Lucey, S., & Cohn, J. (2009). Face alignment through subspace constrained mean-shifts. In Proceedings of the IEEE international conference on computer vision.

  • Shashua, A., & Riklin-Raviv, T. (2001). The quotient image: Class-based re-rendering and recognition with varying illuminations. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(2), 129–139.

  • Spielman, D. A., Wang, H., & Wright, J. (2012). Exact recovery of sparsely-used dictionaries. Journal of Machine Learning Research, 23(18), 1–37.

  • Tseng, P. (1991). Applications of a splitting algorithm to decomposition in convex programming and variational inequalities. SIAM Journal on Control and Optimization, 29, 119–138.

  • Viola, P., & Jones, M. (2004). Robust real-time face detection. International Journal of Computer Vision, 57, 137–154.

  • Wagner, A., Wright, J., Ganesh, A., Zhou, Z., Mobahi, H., & Ma, Y. (2012). Toward a practical face recognition system: Robust alignment and illumination by sparse representation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(2), 372–386.

  • Wright, J., & Ma, Y. (2010). Dense error correction via \(\ell ^1\)-minimization. IEEE Transactions on Information Theory, 56(7), 3540–3560.

  • Wright, J., Yang, A., Ganesh, A., Sastry, S., & Ma, Y. (2009). Robust face recognition via sparse representation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(2), 210–227.

  • Yan, S., Liu, C., Li, S., Zhang, H., Shum, H., & Cheng, Q. (2003). Face alignment using texture-constrained active shape models. Image and Vision Computing, 21, 69–75.

  • Yan, S., Wang, H., Liu, J., Tang, X., & Huang, T. (2010). Misalignment-robust face recognition. IEEE Transactions on Image Processing, 19, 1087–1096.

  • Yang, A., Zhou, Z., Ganesh, A., Sastry, S., & Ma, Y. (2013). Fast \(\ell _1\)-minimization algorithms for robust face recognition. IEEE Transactions on Image Processing, 22(8), 3234–3246.

  • Yang, M., Gool, L.V., & Zhang, L. (2013). Sparse variation dictionary learning for face recognition with a single training sample per person. In Proceedings of the IEEE international conference on computer vision.

  • Yang, M., Zhang, L., & Zhang, D. (2012). Efficient misalignment-robust representation for real-time face recognition. In Proceedings of the European conference on computer vision.

  • Zhang, L., Feng, M. Y. X., Ma, Y., & Feng, X. (2012). Collaborative representation based classification for face recognition. Technical report, arXiv:1204.2358.

  • Zhao, W., Chellappa, R., Phillips, J., & Rosenfeld, A. (2003). Face recognition: A literature survey. ACM Computing Surveys, 35(4), 399–458.

  • Zhuang, L., Yang, A. Y., Zhang, Z., Sastry, S., & Ma, Y. (2013). Single-sample face recognition with image corruption and misalignment via sparse illumination transfer. In Proceedings of the IEEE international conference on computer vision and pattern recognition, Portland, Oregon.

Acknowledgments

The work was supported in part by ARO 63092-MA-II, DARPA FA8650-11-1-7153, ONR N00014-09-1-0230, NSF CCF09-64215, NSFC No. 61103134 and 61371192, and the Science Foundation for Outstanding Young Talent of Anhui Province (BJ2101020001).

Author information

Corresponding author

Correspondence to Allen Y. Yang.

Additional information

Communicated by Julien Mairal, Francis Bach, and Michael Elad.

A preliminary version of the results was published in Zhuang et al. (2013).

Appendix

We prove Theorem 1 in this appendix. First, we eliminate the variable \(E\) from problem (10) by substituting

$$\begin{aligned} E= {D} - \overline{{V}}\otimes {\mathbf {1}}^T - \overline{{C}}{H}. \end{aligned}$$
(21)

Problem (10) can then be equivalently written as

$$\begin{aligned} \min _{\overline{{V}},\overline{{C}}, {H}} \,\,\Vert {D} - \overline{{V}}\otimes {\mathbf {1}}^T - \overline{{C}}{H}\Vert _F^2,\,\mathrm{s.t.\,} \overline{{C}}^T\overline{{C}} = {I}. \end{aligned}$$
(22)

As a basic result in least squares (Horn and Johnson 1985), the optimal \({H}\) can be written as

$$\begin{aligned} {H}^* = \overline{{C}}^T({D} - \overline{{V}}\otimes {\mathbf {1}}^T), \end{aligned}$$
(23)

for any \(\overline{{V}}\in \mathbb {R}^{m\times p}\) and any \(\overline{{C}}\in \mathbb {R}^{m\times k}\) such that \(\overline{{C}}^T\overline{{C}} = {I}\). Substituting \({H}^*\) into (22) yields

$$\begin{aligned} \min _{\overline{{V}},\overline{{C}}} \,\,\Vert {P}_{\overline{{C}}}^\perp ({D} - \overline{{V}}\otimes {\mathbf {1}}^T)\Vert _F^2,\,\mathrm{s.t.\,} \overline{{C}}^T\overline{{C}} = {I}, \end{aligned}$$
(24)

where \({P}_{\overline{{C}}}^\perp = {I} - \overline{{C}}\overline{{C}}^T\) denotes the projector onto the orthogonal complement of the range of \(\overline{{C}}\). It is also easy to show from (24) that one solution for \(\overline{{V}}\) is

$$\begin{aligned}{}[\overline{{V}}^*]_i = \frac{1}{n} {D}_i{\mathbf {1}},\,i=1,\ldots ,p. \end{aligned}$$
(25)
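
A short verification of (25) is sketched here, assuming (as the form of (25) suggests) that \({D} = [{D}_1, \ldots , {D}_p]\) with each \({D}_i \in \mathbb {R}^{m\times n}\), and writing \(\varvec{v}_i = [\overline{{V}}]_i\) for the \(i\)-th column of \(\overline{{V}}\); the symbol \(\varvec{v}_i\) is introduced only for this illustration. For a fixed feasible \(\overline{{C}}\), the objective in (24) decouples over subjects, and setting its gradient with respect to \(\varvec{v}_i\) to zero gives

$$\begin{aligned} \Vert {P}_{\overline{{C}}}^\perp ({D} - \overline{{V}}\otimes {\mathbf {1}}^T)\Vert _F^2&= \sum _{i=1}^{p} \Vert {P}_{\overline{{C}}}^\perp ({D}_i - \varvec{v}_i{\mathbf {1}}^T)\Vert _F^2, \\ -2\,{P}_{\overline{{C}}}^\perp ({D}_i - \varvec{v}_i{\mathbf {1}}^T){\mathbf {1}}&= -2\,{P}_{\overline{{C}}}^\perp ({D}_i{\mathbf {1}} - n\,\varvec{v}_i) = {\mathbf {0}}. \end{aligned}$$

The choice \(\varvec{v}_i = \frac{1}{n}{D}_i{\mathbf {1}}\) satisfies this condition for every feasible \(\overline{{C}}\), and the objective is convex in \(\varvec{v}_i\), so this choice is a minimizer. It is not the only one: adding any vector in the range of \(\overline{{C}}\) to \(\varvec{v}_i\) leaves the objective unchanged.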

Note that the solution \([\overline{{V}}^*]_i\) is the mean vector of the data matrix \({D}_i\) corresponding to subject \(i\). Furthermore, by letting \({U} = {D} - \overline{{V}}^*\otimes {\mathbf {1}}^T\), problem (24) becomes \(\min _{\overline{{C}}^T\overline{{C}} = {I}} \mathrm{trace}({U}^T{P}_{\overline{{C}}}^\perp {U})\), which is equivalent to

$$\begin{aligned} \overline{{C}}^* = \mathrm{arg}\max _{\overline{{C}}^T\overline{{C}} = {I}} \mathrm{trace}(\overline{{C}}^T{U}{U}^T\overline{{C}}). \end{aligned}$$
(26)
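
The equivalence with (26) can be checked in one line, using only \({P}_{\overline{{C}}}^\perp = {I} - \overline{{C}}\overline{{C}}^T\) and the cyclic property of the trace:

$$\begin{aligned} \mathrm{trace}({U}^T{P}_{\overline{{C}}}^\perp {U}) = \mathrm{trace}({U}^T{U}) - \mathrm{trace}({U}^T\overline{{C}}\,\overline{{C}}^T{U}) = \Vert {U}\Vert _F^2 - \mathrm{trace}(\overline{{C}}^T{U}{U}^T\overline{{C}}). \end{aligned}$$

Since \(\Vert {U}\Vert _F^2\) does not depend on \(\overline{{C}}\), minimizing the left-hand side over \(\overline{{C}}^T\overline{{C}} = {I}\) is the same as maximizing \(\mathrm{trace}(\overline{{C}}^T{U}{U}^T\overline{{C}})\).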

By Horn and Johnson (1985), an optimal solution \(\overline{{C}}^*\) is known to be the matrix whose columns are the \(k\) principal eigenvectors of \({U}{U}^T\); i.e.,

$$\begin{aligned} \overline{{C}}^* = [~ \varvec{q}_1( {U}{U}^T ), \varvec{q}_2( {U}{U}^T ), \ldots , \varvec{q}_{k}( {U}{U}^T ) ~]. \end{aligned}$$
(27)

Hence, the problem solution (13) simply follows from (21), (23), (25), and (27). \(\square \)
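
As a numerical illustration of the closed form derived in this appendix, the following NumPy sketch evaluates (25), (27), (23), and (21) in turn. It is a sketch under stated assumptions rather than the authors' implementation: the function name `closed_form_solution`, the equal number \(n\) of samples per subject, and the ordering of the columns of \(D\) (grouped by subject) are assumptions made here for illustration. The eigenvectors of \({U}{U}^T\) are obtained from the left singular vectors of \({U}\), which is an equivalent and numerically convenient route.

```python
import numpy as np


def closed_form_solution(D, p, k):
    """Sketch of the closed-form solution from the appendix.

    D : (m, p*n) array; columns are assumed to be grouped by subject,
        with the same number n of samples for each of the p subjects.
    k : number of principal directions kept in C (k <= min(m, p*n)).
    Returns V (m, p), C (m, k), H (k, p*n), E (m, p*n).
    """
    m, total = D.shape
    if total % p != 0:
        raise ValueError("number of columns must be a multiple of p")
    n = total // p

    # (25): each column of V is the mean image of one subject.
    V = D.reshape(m, p, n).mean(axis=2)

    # U = D - V (kron) 1^T: subtract each subject's mean from its samples.
    U = D - np.kron(V, np.ones((1, n)))

    # (27): the k principal eigenvectors of U U^T are the first k
    # left singular vectors of U (singular values are sorted descending).
    left, _, _ = np.linalg.svd(U, full_matrices=False)
    C = left[:, :k]

    # (23): least-squares coefficients for an orthonormal C.
    H = C.T @ U

    # (21): residual that is not explained by V and C H.
    E = U - C @ H
    return V, C, H, E
```

With this construction, the squared Frobenius norm of `E` should equal the sum of the squared singular values of `U` beyond the first `k`, which gives a quick sanity check on any data matrix.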

Cite this article

Zhuang, L., Chan, TH., Yang, A.Y. et al. Sparse Illumination Learning and Transfer for Single-Sample Face Recognition with Image Corruption and Misalignment. Int J Comput Vis 114, 272–287 (2015). https://doi.org/10.1007/s11263-014-0749-x

  • DOI: https://doi.org/10.1007/s11263-014-0749-x
