Probabilistic robust regression with adaptive weights — a case study on face recognition

Li, Jin; Chen, Quan; Leng, Jingwen; Zhang, Weinan; Guo, Minyi

doi:10.1007/s11704-019-9097-x

Research Article
Published: 20 January 2020

Frontiers of Computer Science
Volume 14, article number 145314 (2020)

Abstract

Robust regression plays an important role in many machine learning problems. A primary approach relies on the Huber loss and an iteratively reweighted ℓ₂ method. However, because the Huber loss is not smooth and its corresponding distribution cannot be represented as a Gaussian scale mixture, such an approach is extremely difficult to handle within a probabilistic framework. To address these limitations, this paper proposes two novel losses and their corresponding probability functions. The first, Soft Huber, is well suited for modeling non-Gaussian noise. The second, Nonconvex Huber, can help produce much sparser results when imposed as a prior on the regression vector. Through their tuning parameters, both can represent any ℓ_q loss (1/2 ⩽ q < 2), which makes the regression model more robust. We also show that both distributions have an elegant form: a Gaussian scale mixture with a generalized inverse Gaussian mixing density. This enables us to devise an expectation maximization (EM) algorithm for solving the regression model. EM yields an adaptive weight, which is useful for removing noisy data or irrelevant features in regression problems. We apply our model to the face recognition problem and show that it not only reduces the impact of noisy pixels but also removes more irrelevant face images. Our experiments demonstrate promising results on two datasets.
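To make the baseline mentioned in the abstract concrete, the following is a minimal sketch of classical Huber regression solved by iteratively reweighted ℓ₂ (least squares). It is not the paper's Soft Huber or Nonconvex Huber model, and it does not use EM; the function names (`huber_weights`, `irls_huber`), the threshold `delta`, and the synthetic data are assumptions made purely for illustration. It does show the general idea of an adaptive weight: observations with large residuals receive small weights and therefore contribute little to the fit.

```python
import numpy as np

def huber_weights(residuals, delta):
    """Per-observation weights for the Huber loss (hypothetical helper).

    Weight is 1 where |r| <= delta and delta / |r| otherwise, so large
    residuals are down-weighted.
    """
    abs_r = np.abs(residuals)
    return np.where(abs_r <= delta, 1.0, delta / np.maximum(abs_r, 1e-12))

def irls_huber(X, y, delta=1.0, n_iter=50, tol=1e-8):
    """Huber regression by iteratively reweighted least squares (sketch)."""
    # Start from the ordinary least-squares solution.
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    for _ in range(n_iter):
        w = huber_weights(y - X @ beta, delta)
        Xw = X * w[:, None]                      # rows of X scaled by weights
        # Solve the weighted normal equations X^T W X beta = X^T W y.
        beta_new = np.linalg.solve(X.T @ Xw, Xw.T @ y)
        converged = np.linalg.norm(beta_new - beta) < tol
        beta = beta_new
        if converged:
            break
    return beta, huber_weights(y - X @ beta, delta)

# Synthetic example (assumed data): a line with 10% gross outliers.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(200), rng.normal(size=200)])
true_beta = np.array([1.0, 2.0])
y = X @ true_beta + 0.1 * rng.normal(size=200)
y[:20] += 20.0                                   # corrupt the first 20 targets

beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
beta_hub, weights = irls_huber(X, y, delta=0.5)

print("OLS error:  ", np.linalg.norm(beta_ols - true_beta))
print("Huber error:", np.linalg.norm(beta_hub - true_beta))
print("largest weight among corrupted points:", weights[:20].max())
```

In this sketch the corrupted observations end up with weights far below 1, while most clean observations keep weight 1. The abstract describes obtaining adaptive weights through EM under a Gaussian scale mixture, which is a different derivation from the fixed Huber weighting rule used here.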

Author information

Authors and Affiliations

Department of Computer Science, Shanghai Jiao Tong University, Shanghai, 200240, China
Jin Li, Quan Chen, Jingwen Leng, Weinan Zhang & Minyi Guo

Corresponding author

Correspondence to Minyi Guo.

Additional information

Jin Li is now a PhD student in the Department of Computer Science at Shanghai Jiao Tong University, China. He received the BS degree in computer science from East China University of Science and Technology, China. His research interests include machine learning and data mining, particularly statistical methods and deep learning techniques for real-world applications such as face recognition, software auto-tuning, and recommender systems.

Quan Chen received the BS degree in computer science from Tongji University, China, and the MS and PhD degrees in computer science from Shanghai Jiao Tong University, China, in 2007, 2009, and 2014, respectively. From 2014 to 2016, he was a postdoctoral researcher in the Department of Computer Science, University of Michigan, Ann Arbor, USA. He is now a tenure-track associate professor in the Department of Computer Science and Engineering, Shanghai Jiao Tong University, China. His research interests include parallel and distributed processing, task scheduling, cloud computing, datacenter management, and accelerator management.

Jingwen Leng is an assistant professor in the John Hopcroft Computer Science Center and the Computer Science & Engineering Department at Shanghai Jiao Tong University, China. He received his PhD from the University of Texas at Austin, USA, where he focused on improving the efficiency and resiliency of general-purpose GPUs. He is currently interested in the interaction among system, architecture, and deep learning.

Weinan Zhang is now a tenure-track assistant professor at Shanghai Jiao Tong University, China. His research interests include machine learning and big data mining, particularly deep learning and reinforcement learning techniques for real-world data mining scenarios such as computational advertising, recommender systems, text mining, web search, and knowledge graphs. He has published over 70 papers in international conferences and journals, including KDD, WWW, SIGIR, IJCAI, AAAI, ICML, ICLR, JMLR, and TKDE.

Minyi Guo received the BS and ME degrees in computer science from Nanjing University, China, and the PhD degree in information science from the University of Tsukuba, Japan, in 1982, 1986, and 1998, respectively. From 1998 to 2000, he was a research associate at NEC Soft, Ltd., Japan. He was a visiting professor in the Department of Computer Science, Georgia Institute of Technology, USA. He was also a full professor at the University of Aizu, Japan, and is head of the Department of Computer Science and Engineering, Shanghai Jiao Tong University, China. He is a fellow of the IEEE and has published more than 200 papers in well-known conferences and journals. His main interests include automatic parallelization and data-parallel languages, bioinformatics, compiler optimization, and high-performance computing.
