Brain Imaging and Behavior, Volume 13, Issue 1, pp 138–153

Robust multi-label transfer feature learning for early diagnosis of Alzheimer’s disease

  • Bo Cheng
  • Mingxia Liu
  • Daoqiang Zhang
  • Dinggang Shen
  • Alzheimer’s Disease Neuroimaging Initiative


Abstract

Transfer learning has been successfully used in the early diagnosis of Alzheimer’s disease (AD). In these methods, data from one or more related source domains are employed to aid the learning task in the target domain. However, most existing methods use data from all source domains, ignoring the fact that unrelated source domains may degrade learning performance. Moreover, previous studies assume that the class labels of all subjects are reliable, without considering the label ambiguity caused by the slight differences between early AD patients and normal control subjects. To address these issues, we propose to transform the original binary class label of each subject into a multi-bit label coding vector with the aid of multiple source domains. We further develop a robust multi-label transfer feature learning (rMLTFL) model that simultaneously captures a common set of features across different domains (the target domain and all source domains) and identifies the unrelated source domains. We evaluate our method on 406 subjects from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database with baseline magnetic resonance imaging (MRI) and cerebrospinal fluid (CSF) data. The experimental results show that the proposed rMLTFL method improves the performance of AD diagnosis compared with several state-of-the-art methods.
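The abstract does not state the rMLTFL objective, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes a squared loss, a subject-by-feature matrix X, and a label-code matrix Y with one column per label bit (one bit per domain). The weight matrix is split as W = P + Q: a row-sparse P that selects features shared by all label bits, and a column-sparse Q whose non-zero columns flag label bits (and hence source domains) that do not fit the shared pattern. The function names, the loss, and the plain proximal-gradient solver are all assumptions made for illustration.

```python
import numpy as np


def prox_group_rows(M, t):
    """Group soft-thresholding: shrink the l2 norm of every row of M by t."""
    norms = np.linalg.norm(M, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - t / np.maximum(norms, 1e-12))
    return M * scale


def objective(X, Y, P, Q, lam_p, lam_q):
    """0.5*||X(P+Q) - Y||_F^2 + lam_p*(sum of row norms of P)
    + lam_q*(sum of column norms of Q)."""
    loss = 0.5 * np.linalg.norm(X @ (P + Q) - Y) ** 2
    row_penalty = lam_p * np.linalg.norm(P, axis=1).sum()
    col_penalty = lam_q * np.linalg.norm(Q, axis=0).sum()
    return loss + row_penalty + col_penalty


def fit_row_plus_column_sparse(X, Y, lam_p, lam_q, n_iter=500):
    """Proximal gradient (ISTA) on the objective above.

    Assumption: this decomposition stands in for the model described in
    the abstract; it is a generic robust multi-task style formulation.
    """
    d, k = X.shape[1], Y.shape[1]
    P = np.zeros((d, k))
    Q = np.zeros((d, k))
    # The smooth part depends on P + Q, so its gradient with respect to
    # the stacked pair (P, Q) has Lipschitz constant 2 * ||X||_2^2.
    step = 1.0 / (2.0 * np.linalg.norm(X, 2) ** 2)
    for _ in range(n_iter):
        G = X.T @ (X @ (P + Q) - Y)  # same gradient for P and for Q
        P = prox_group_rows(P - step * G, step * lam_p)        # zero out rows
        Q = prox_group_rows((Q - step * G).T, step * lam_q).T  # zero out columns
    return P, Q
```

Under these assumptions, features whose rows in P stay non-zero are the shared ones, and `np.linalg.norm(Q, axis=0)` gives one score per label bit; a clearly non-zero score suggests that the corresponding domain is being fitted separately rather than through the shared features. How large a score counts as "unrelated" depends on the two penalty weights, which would need to be tuned by cross-validation.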


Keywords

Transfer learning · Multi-label learning · Feature learning · Alzheimer’s disease (AD)



Acknowledgements

Data collection and sharing for this project was funded by the Alzheimer’s Disease Neuroimaging Initiative (ADNI) (National Institutes of Health Grant U01 AG024904). ADNI is funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, and through generous contributions from the following: Abbott, AstraZeneca AB, Bayer Schering Pharma AG, Bristol-Myers Squibb, Eisai Global Clinical Development, Elan Corporation, Genentech, GE Healthcare, GlaxoSmithKline, Innogenetics, Johnson and Johnson, Eli Lilly and Co., Medpace, Inc., Merck and Co., Inc., Novartis AG, Pfizer Inc, F. Hoffman-La Roche, Schering-Plough, Synarc, Inc., as well as non-profit partners the Alzheimer’s Association and Alzheimer’s Drug Discovery Foundation, with participation from the U.S. Food and Drug Administration. Private sector contributions to ADNI are facilitated by the Foundation for the National Institutes of Health. The grantee organization is the Northern California Institute for Research and Education, and the study is coordinated by the Alzheimer’s Disease Cooperative Study at the University of California, San Diego. ADNI data are disseminated by the Laboratory for Neuro Imaging at the University of California, Los Angeles.

This work was supported by the National Natural Science Foundation of China (Nos. 61602072, 61573023, 61732006, and 61473149), the Chongqing Cutting-edge and Applied Foundation Research Program (Grant No. cstc2016jcyjA0063), the Scientific and Technological Research Program of Chongqing Municipal Education Commission (Grant Nos. KJ1401010, KJ1601003, KJ1601015, KJ1710248, KJ1710257), NIH grants (AG041721, AG049371, AG042599, AG053867), the Key Laboratory of Chongqing Municipal Institutions of Higher Education (Grant No. [2017]3), and the Program of Chongqing Development and Reform Commission (Grant No. 2017[1007]).



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  • Bo Cheng (1, 2)
  • Mingxia Liu (3)
  • Daoqiang Zhang (4), corresponding author
  • Dinggang Shen (3, 5), corresponding author
  • Alzheimer’s Disease Neuroimaging Initiative

  1. Key Laboratory of Intelligent Information Processing and Control of Chongqing Municipal Institutions of Higher Education, Chongqing Three Gorges University, Chongqing, China
  2. Chongqing Engineering Research Center of Internet of Things and Intelligent Control Technology, Chongqing Three Gorges University, Chongqing, China
  3. Department of Radiology and BRIC, University of North Carolina, Chapel Hill, USA
  4. Department of Computer Science and Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing, China
  5. Department of Brain and Cognitive Engineering, Korea University, Seoul, Republic of Korea
