An Unbiased Penalty for Sparse Classification with Application to Neuroimaging Data

  • Li Zhang
  • Dana Cobzas
  • Alan Wilman
  • Linglong Kong
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10435)


We present a novel formulation for discriminative anatomy detection in high-dimensional neuroimaging data. While most studies solve this problem using mass univariate approaches, recent works show better accuracy and variable selection using a sparse classification model. Such methods typically use an \(l_1\) penalty to impose sparseness and a graph net (GN) or total variation (TV) penalty to ensure spatial continuity and interpretability of the results. However, the \(l_1\) and TV penalties are known to have an inherent bias that leads to less stable region detection and less accurate prediction. To overcome these limitations, we propose a novel variable selection method in the context of classification, based on the Smoothly Clipped Absolute Deviation (SCAD) penalty. We experimentally show the superiority of three models based on the SCAD and SCADTV penalties over the classical \(l_1\) and TV penalties on both simulated data and real MRI data from a multiple sclerosis study.
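To illustrate the bias argument, the following is a minimal sketch of the SCAD penalty in its standard piecewise form (Fan and Li, 2001), next to the \(l_1\) penalty. The function names and the defaults `lam=1.0` and `a=3.7` are illustrative assumptions, not taken from the paper, and the formulation used in the paper itself may differ in detail.

```python
def l1_penalty(theta, lam=1.0):
    """l1 penalty: grows linearly without bound, so large coefficients keep being shrunk."""
    return lam * abs(theta)


def scad_penalty(theta, lam=1.0, a=3.7):
    """Standard SCAD penalty for a single coefficient (requires a > 2).

    Illustrative sketch only; a = 3.7 is the commonly used default.
    """
    t = abs(theta)
    if t <= lam:
        # Small coefficients: identical to the l1 penalty.
        return lam * t
    if t <= a * lam:
        # Intermediate coefficients: quadratic transition that flattens out.
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Large coefficients: constant penalty, so no further shrinkage is applied.
    return lam * lam * (a + 1) / 2


if __name__ == "__main__":
    for theta in (0.5, 2.0, 5.0, 10.0):
        print(theta, l1_penalty(theta), scad_penalty(theta))
```

For small coefficients the two penalties coincide, while for coefficients larger than `a * lam` the SCAD penalty stays constant. This is the property that removes the shrinkage bias on large effects.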


Sparse classification; Variable selection; Localized statistics; \(l_1\) optimization; SCAD penalty



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Li Zhang (1)
  • Dana Cobzas (1)
  • Alan Wilman (1)
  • Linglong Kong (1)
  1. University of Alberta, Edmonton, Canada
