DC programming and DCA for sparse Fisher linear discriminant analysis

Le Thi, Hoai An; Phan, Duy Nhat

doi:10.1007/s00521-016-2216-9

Original Article
Published: 11 February 2016

Volume 28, pages 2809–2822, (2017)
Neural Computing and Applications

Hoai An Le Thi¹,² & Duy Nhat Phan²

Abstract

We consider supervised pattern classification in the high-dimensional setting, in which the number of features is much larger than the number of observations. We present a novel approach to the sparse Fisher linear discriminant problem using the \(\ell _0\)-norm. The resulting optimization problem is nonconvex, discontinuous and very hard to solve. We overcome the discontinuity by using appropriate approximations of the \(\ell _0\)-norm, so that the resulting problems can be formulated as difference of convex functions (DC) programs, for which DC programming and DC Algorithms (DCA) are investigated. Experimental results on both simulated and real datasets demonstrate the efficiency of the proposed algorithms compared with some state-of-the-art methods.
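
As a rough illustration of the idea in the abstract, the sketch below replaces the \(\ell _0\) indicator with the capped-\(\ell _1\) function \(\min (1, \theta |x|)\), which splits into a difference of two convex functions, \(\theta |x| - \max (0, \theta |x| - 1)\). The abstract does not say which approximations the authors use, so the capped-\(\ell _1\) choice, the toy separable least-squares objective (which is not the Fisher discriminant problem), and all names (`dca_denoise`, `lam`, `theta`) are assumptions made for illustration only. Each DCA iteration linearizes the subtracted convex part at the current point and solves the remaining convex subproblem, which here has a closed form.

```python
def capped_l1(x, theta):
    """Capped-l1 surrogate for the l0 indicator: min(1, theta*|x|)."""
    return min(1.0, theta * abs(x))


def g_part(x, theta):
    """First convex component: g(x) = theta*|x|."""
    return theta * abs(x)


def h_part(x, theta):
    """Second convex component: h(x) = max(0, theta*|x| - 1)."""
    return max(0.0, theta * abs(x) - 1.0)


def soft_threshold(v, t):
    """Minimizer of 0.5*(x - v)**2 + t*|x| over x."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def dca_denoise(a, lam, theta, iters=50):
    """DCA sketch for the toy problem
    min_x sum_i 0.5*(x_i - a_i)**2 + lam*min(1, theta*|x_i|).
    This objective is an illustrative assumption, not the paper's model.
    """
    x = list(a)
    for _ in range(iters):
        # Step 1: pick a subgradient y of lam*h at the current iterate.
        y = [
            lam * theta * (1.0 if xi > 0 else -1.0) if theta * abs(xi) > 1.0 else 0.0
            for xi in x
        ]
        # Step 2: solve the convex subproblem
        # min_x 0.5*(x - a)**2 + lam*theta*|x| - y*x, coordinate by coordinate.
        x = [soft_threshold(ai + yi, lam * theta) for ai, yi in zip(a, y)]
    return x


if __name__ == "__main__":
    print(dca_denoise([3.0, 0.1, -2.0, 0.05], lam=0.5, theta=2.0))
```

In this toy run the two large entries are kept and the two small ones are set exactly to zero, which is the kind of sparsity an \(\ell _0\)-type penalty is meant to produce.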

Notes

http://cran.r-project.org/.

Acknowledgments

This research is funded by the Foundation for Science and Technology Development of Ton Duc Thang University (FOSTECT), website: http://fostect.tdt.edu.vn, under Grant FOSTECT.2015.BR.15. The authors would like to thank the referees for their valuable comments, which helped to improve the manuscript.

Author information

Authors and Affiliations

Department for Management of Science and Technology Development, Faculty of Mathematics and Statistics, Ton Duc Thang University, Ho Chi Minh City, Vietnam
Hoai An Le Thi
Laboratory of Theoretical and Applied Computer Science, University of Lorraine, Ile du Saulcy, 57045, Metz, France
Hoai An Le Thi & Duy Nhat Phan

Corresponding author

Correspondence to Hoai An Le Thi.

About this article

Cite this article

Le Thi, H.A., Phan, D.N. DC programming and DCA for sparse Fisher linear discriminant analysis. Neural Comput & Applic 28, 2809–2822 (2017). https://doi.org/10.1007/s00521-016-2216-9

Received: 12 June 2015
Accepted: 19 January 2016
Published: 11 February 2016
Issue Date: September 2017
DOI: https://doi.org/10.1007/s00521-016-2216-9
