Computational Optimization and Applications, Volume 58, Issue 2, pp. 409–421

A unified algorithm for mixed \(l_{2,p}\)-minimizations and its application in feature selection

  • Liping Wang
  • Songcan Chen
  • Yuanping Wang


Recently, the matrix norm \(l_{2,1}\) has been widely applied to feature selection in areas such as computer vision, pattern recognition, and computational biology. As an extension of the \(l_1\) norm, the \(l_{2,1}\) matrix norm is often used to find jointly sparse solutions. Computational studies have shown that the solution of \(l_p\)-minimization (\(0<p<1\)) is sparser than that of \(l_1\)-minimization, so the generalized \(l_{2,p}\)-minimization (\(p\in(0,1]\)) is naturally expected to yield better sparsity than \(l_{2,1}\)-minimization. This paper presents a class of models based on the \(l_{2,p}\) (\(p\in(0,1]\)) matrix norm, which leads to a non-convex and non-Lipschitz-continuous optimization problem when \(p\) is fractional (\(0<p<1\)). For all \(p\in(0,1]\), a unified algorithm is proposed to solve the \(l_{2,p}\)-minimization problem, and its convergence is demonstrated uniformly. In the practical implementation of the algorithm, a gradient projection technique is utilized to reduce the computational cost. Finally, \(l_{2,p}\)-minimizations with different values of \(p\in(0,1]\) are applied to feature selection in computational biology.
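The quantity driving these models sums the \(p\)-th powers of the row-wise Euclidean norms, \(\|W\|_{2,p}^p=\sum_i \|w_i\|_2^p\), so minimizing it pushes entire rows of the weight matrix to zero (joint sparsity across features). A minimal sketch of this quantity, assuming that standard definition with \(w_i\) the rows of \(W\) (the helper name `l2p_norm_p` is illustrative, not from the paper):

```python
import math

def l2p_norm_p(W, p):
    """Return ||W||_{2,p}^p = sum_i (||w_i||_2)^p over the rows w_i of W.

    W is given as a list of rows (lists of floats) and 0 < p <= 1.
    For p = 1 this is the convex l_{2,1} norm; for 0 < p < 1 it is the
    non-convex variant that favours row-sparse (jointly sparse) solutions.
    """
    # Row-wise 2-norms, each raised to the p-th power, then summed.
    return sum(math.sqrt(sum(x * x for x in row)) ** p for row in W)

# Example: rows with 2-norms 5, 0, and 1.
W = [[3.0, 4.0], [0.0, 0.0], [1.0, 0.0]]
print(l2p_norm_p(W, 1.0))  # l_{2,1} norm: 5 + 0 + 1 = 6.0
print(l2p_norm_p(W, 0.5))  # sqrt(5) + 0 + 1
```

For \(p=1\) the objective is convex; for \(0<p<1\) it is non-convex and non-Lipschitz at rows that vanish, which is exactly the regime the paper's unified algorithm is designed to handle.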


Mixed matrix norm · Non-Lipschitz continuous · Unified algorithm · Gradient projection



The first author thanks Dr. Zhang Hongchao for his helpful suggestions on this paper.



Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  1. Department of Mathematics, Nanjing University of Aeronautics and Astronautics, Nanjing, China
  2. Department of Computer Science and Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing, China
