
Regularized Matrix-Pattern-Oriented Classification Machine with Universum

Neural Processing Letters

Abstract

Regularization can effectively improve generalization performance because it controls model complexity through a priori knowledge. Matrixized learning, one kind of regularization method, can improve classification accuracy and reduce computational complexity when dealing with matrix data; this success is attributed to its exploitation of the structural knowledge in matrix data. This paper generalizes matrixized learning by taking advantage of Universum data, i.e., data that do not belong to any class of interest in the classification problem. The generalized method not only keeps the structural knowledge of the matrix data themselves, but also acquires a priori domain knowledge from the whole data distribution. In implementation, we incorporate the Universum strategy into the previous matrixized work MatMHKS and develop a novel regularized matrix-pattern-oriented classification machine named UMatMHKS. Subsequent experiments validate the effectiveness of the proposed UMatMHKS: on UCI benchmark datasets it improves classification accuracy by 1.52% over MatMHKS and by 3.20% over USVM, and it also runs faster, averaging 0.41 s against 0.71 s for MatMHKS. Three main characteristics of UMatMHKS are: (1) making full use of the domain knowledge of the whole data distribution while inheriting the advantages of matrixized learning; (2) applying Universum learning within the matrixized learning framework; (3) owning a tighter generalization risk bound.
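To make the idea concrete, the following is a minimal sketch, not the authors' implementation: it builds Universum samples by averaging random pairs drawn from opposite classes (a common Universum construction) and fits a bilinear decision function g(A) = u^T A v + b on matrix patterns by alternating ridge regressions, with Universum outputs regularized toward 0 (the decision boundary). The function names, the plain least-squares loss (a stand-in for the Ho-Kashyap criterion used by MatMHKS), and the squared Universum penalty (a stand-in for the epsilon-insensitive Universum loss) are all illustrative assumptions.

```python
import numpy as np

def make_universum(X_pos, X_neg, n_samples, seed=0):
    """Build Universum matrices by averaging random pairs drawn from
    the two classes (the common averaging construction)."""
    rng = np.random.default_rng(seed)
    i = rng.integers(0, len(X_pos), n_samples)
    j = rng.integers(0, len(X_neg), n_samples)
    return 0.5 * (X_pos[i] + X_neg[j])

def fit_bilinear_universum(X, y, X_u, c_u=0.1, lam=1.0, n_iter=50):
    """Alternating ridge regressions for a bilinear decision function
    g(A) = u^T A v + b on matrix patterns A of shape (d1, d2).
    Labeled samples are regressed toward their labels y in {-1, +1};
    Universum samples are regressed toward 0 (the decision boundary).
    NOTE: illustrative stand-in for the Ho-Kashyap training of MatMHKS."""
    n, d1, d2 = X.shape
    u, v, b = np.ones(d1), np.ones(d2) / d2, 0.0
    for _ in range(n_iter):
        # Fix v: each matrix collapses to A @ v; solve ridge for (u, b).
        Z, Z_u = X @ v, X_u @ v                      # (n, d1), (m, d1)
        Zb = np.hstack([Z, np.ones((n, 1))])
        Zub = np.hstack([Z_u, np.ones((len(Z_u), 1))])
        G = Zb.T @ Zb + c_u * Zub.T @ Zub + lam * np.eye(d1 + 1)
        w = np.linalg.solve(G, Zb.T @ y)             # Universum target is 0
        u, b = w[:-1], w[-1]
        # Fix u: symmetric update for v with u^T A as the collapsed pattern.
        Z = np.einsum('i,nij->nj', u, X)
        Z_u = np.einsum('i,mij->mj', u, X_u)
        G = Z.T @ Z + c_u * Z_u.T @ Z_u + lam * np.eye(d2)
        v = np.linalg.solve(G, Z.T @ (y - b))
    return u, v, b

def predict(u, v, b, X):
    """Classify matrix patterns with the learned bilinear form."""
    return np.sign(np.einsum('i,nij,j->n', u, X, v) + b)
```

Two design points carry over from the paper's setting: the bilinear form u^T A v is what makes the learner matrix-pattern-oriented (far fewer parameters than vectorizing A), and pushing Universum outputs toward zero is how the prior "these samples belong to no class of interest" enters the objective.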


Notes

  1. http://sun16.cecs.missouri.edu/pgader/CECS477/NNdigits.zip

  2. http://www.cam-orl.co.uk

  3. http://yann.lecun.com/exdb/mnist/

  4. http://www.cs.columbia.edu/CAVE/coil-20.html


Acknowledgements

This work was partially supported by the Natural Science Foundation of China under Grant Nos. 61672227 and 61272198, the 863 Plan of the China Ministry of Science and Technology under Grant No. 2015AA020107, and the Fundamental Research Funds for the Central Universities.

Author information

Correspondence to Zhe Wang or Daqi Gao.


Cite this article

Li, D., Zhu, Y., Wang, Z. et al. Regularized Matrix-Pattern-Oriented Classification Machine with Universum. Neural Process Lett 45, 1077–1098 (2017). https://doi.org/10.1007/s11063-016-9567-1
