Cognitive Assessment Prediction in Alzheimer’s Disease by Multi-Layer Multi-Target Regression

Wang, Xiaoqian; Zhen, Xiantong; Li, Quanzheng; Shen, Dinggang; Huang, Heng

doi:10.1007/s12021-018-9381-1

Original Article
Published: 25 May 2018

Volume 16, pages 285–294 (2018)

Neuroinformatics

Xiaoqian Wang¹,
Xiantong Zhen¹,
Quanzheng Li²,
Dinggang Shen³ &
…
Heng Huang¹

Abstract

Accurate and automatic prediction of cognitive assessment scores from multiple neuroimaging biomarkers is crucial for early detection of Alzheimer’s disease. The major challenges arise from the nonlinear relationship between biomarkers and assessment scores and from the inter-correlations among the scores, neither of which has been well addressed. In this paper, we propose multi-layer multi-target regression (MMR), which simultaneously models intrinsic inter-target correlations and nonlinear input-output relationships in a general compositional framework. Specifically, by kernelized dictionary learning, the MMR can effectively handle the highly nonlinear relationship between biomarkers and assessment scores; by robust low-rank linear learning via matrix elastic nets, the MMR can explicitly encode inter-correlations among multiple assessment scores; moreover, the MMR is flexible and can work with the non-smooth ℓ_{2,1}-norm loss function, which enables calibration of multiple targets with disparate noise levels for more robust parameter estimation. The MMR can be efficiently solved by an alternating optimization algorithm via gradient descent with guaranteed convergence. The MMR has been evaluated by extensive experiments on the ADNI database with MRI data and achieved accuracy surpassing previous regression models, which demonstrates its effectiveness as a new multi-target regression model for clinical multivariate prediction.
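As a rough illustration of two ingredients named in the abstract, the sketch below computes an ℓ_{2,1}-norm loss and a matrix elastic net penalty on toy data. It is not the paper's objective, which is not reproduced in this excerpt. The following are assumptions made for illustration: the ℓ_{2,1} loss is taken as the sum over targets (columns) of the ℓ_2 norm of each residual column, the matrix elastic net is taken in its common form (nuclear norm plus squared Frobenius norm), and the names `l21_loss`, `matrix_elastic_net`, `alpha`, and `beta` are hypothetical.

```python
import numpy as np

def l21_loss(Y, Y_hat):
    """Sum over targets (columns) of the l2 norm of each residual column.

    Assumed convention: one l2 norm per target, so a noisy target
    contributes linearly rather than quadratically to the loss.
    """
    R = Y - Y_hat
    return float(np.sum(np.linalg.norm(R, axis=0)))

def matrix_elastic_net(S, alpha=1.0, beta=1.0):
    """alpha * nuclear norm + beta * squared Frobenius norm (assumed weights)."""
    nuclear = np.linalg.norm(S, ord="nuc")
    frob_sq = np.linalg.norm(S, ord="fro") ** 2
    return float(alpha * nuclear + beta * frob_sq)

# Toy data: 4 subjects, 2 assessment scores (made-up numbers).
Y = np.array([[1.0, 2.0], [0.0, 1.0], [3.0, 0.0], [1.0, 1.0]])
Y_hat = np.zeros_like(Y)
S = np.diag([3.0, 4.0])

loss = l21_loss(Y, Y_hat)                              # sqrt(11) + sqrt(6)
penalty = matrix_elastic_net(S, alpha=1.0, beta=0.5)   # 7 + 0.5 * 25 = 19.5
print(loss, penalty)
```

The nuclear norm term encourages a low-rank S, while the squared Frobenius term keeps the penalty strongly convex; how these terms are weighted and combined in the MMR is left to the full paper.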

Acknowledgements

This work was partially supported by the following grants: NSF-DBI 1356628, NSF-IIS 1633753, NIH R01 AG049371.

Author information

Authors and Affiliations

Department of Electrical and Computer Engineering, University of Pittsburgh, Pennsylvania, PA, 15263, USA
Xiaoqian Wang, Xiantong Zhen & Heng Huang
Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, 02114, USA
Quanzheng Li
Radiology and BRIC, UNC-CH School of Medicine, 130 Mason Farm Road, Chapel Hill, NC, 27599, USA
Dinggang Shen

Corresponding author

Correspondence to Heng Huang.

Appendix

Proof

By the definition of the nuclear norm and the singular value decomposition S = UΣV^⊤, we can rewrite the norm in terms of traces as follows

$$\begin{array}{@{}rcl@{}} \|S\|_{*} & =& tr(\sqrt{S^{\top} S}) = tr(\sqrt{(U{\Sigma} V^{\top})^{\top}(U{\Sigma} V^{\top})})\\ & =& tr(\sqrt{V{\Sigma}^{\top} U^{\top} U {\Sigma} V^{\top}}) = tr(\sqrt{V{\Sigma}^{\top} {\Sigma} V^{\top}})\\ & =& tr(\sqrt{V{\Sigma} V^{\top} V{\Sigma} V^{\top}})\\ & =& tr(V {\Sigma} V^{\top})\\ & =& tr({\Sigma}) \end{array} $$

(20)
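The chain of equalities in (20) can be checked numerically. The following is a minimal sketch on a random matrix (the size and the seed are arbitrary choices, not from the paper); it compares the sum of singular values, the trace of the matrix square root of S^⊤S, and NumPy's built-in nuclear norm.

```python
import numpy as np

rng = np.random.default_rng(0)
S = rng.standard_normal((5, 3))  # arbitrary test matrix

# tr(Sigma): the sum of the singular values of S.
sigma = np.linalg.svd(S, compute_uv=False)
sum_singular = sigma.sum()

# tr(sqrt(S^T S)): S^T S is symmetric positive semidefinite, so the trace
# of its square root is the sum of the square roots of its eigenvalues.
eigvals = np.linalg.eigvalsh(S.T @ S)
trace_sqrt = np.sqrt(np.clip(eigvals, 0.0, None)).sum()

# NumPy's built-in nuclear norm, for reference.
nuclear = np.linalg.norm(S, ord="nuc")

print(sum_singular, trace_sqrt, nuclear)
```

The three printed values should agree up to floating-point error; the `np.clip` call only guards against tiny negative eigenvalues produced by round-off.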

Therefore, the nuclear norm of S can also be defined as the sum of the singular values of S. From (13), we have

$$ \partial S=\partial U{\Sigma} V^{\top}+U\partial{\Sigma} V^{\top}+U{\Sigma}\partial V^{\top}, $$

(21)

which gives rise to

$$ U\partial{\Sigma} V^{\top}=\partial S-\partial U{\Sigma} V^{\top}-U{\Sigma}\partial V^{\top}. $$

(22)

Multiplying both sides of (22) by U^⊤ on the left and by V on the right, we have

$$ U^{\top} U\partial{\Sigma} V^{\top} V =U^{\top}\partial SV-U^{\top}\partial U{\Sigma} V^{\top} V -U^{\top} U{\Sigma}\partial V^{\top} V $$

(23)

Since U^⊤U = I and V^⊤V = I, we obtain

$$ \partial{\Sigma} =U^{\top}\partial SV-U^{\top}\partial U{\Sigma} - {\Sigma}\partial V^{\top} V. $$

(24)

Note that we have the fact that

$$ 0 = \partial I = \partial (U^{\top} U) = \partial U^{\top} U + U^{\top} \partial U, $$

(25)

where I is the identity matrix; hence (U^⊤∂U)^⊤ = −U^⊤∂U, i.e., U^⊤∂U is an antisymmetric matrix. Since Σ is diagonal, and thus Σ^⊤ = Σ, we have

$$\begin{array}{@{}rcl@{}} tr(U^{\top} \partial U {\Sigma}) & =& tr((U^{\top} \partial U {\Sigma})^{\top}) = tr({\Sigma}^{\top} \partial U^{\top} U)\\ &=& - tr({\Sigma} U^{\top} \partial U) = - tr(U^{\top} \partial U {\Sigma}) \end{array} $$

(26)

which indicates that tr(U^⊤∂UΣ) = 0. Similarly, we also have tr(Σ∂V^⊤V) = 0. Therefore, taking the trace of both sides of (24), we obtain

$$ tr(\partial{\Sigma}) =tr(U^{\top}\partial SV) $$

(27)
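The vanishing trace terms used to reach (27) rest on a simple fact: an antisymmetric matrix A has a zero diagonal, so for a diagonal Σ we get tr(AΣ) = Σ_i A_ii σ_i = 0. A minimal numerical sketch, with a randomly generated antisymmetric matrix standing in for U^⊤∂U (the construction below is an assumption for illustration only):

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((4, 4))
A = M - M.T                                  # antisymmetric: A^T = -A
Sigma = np.diag(rng.uniform(0.5, 2.0, size=4))  # positive diagonal matrix

# tr(A Sigma) only involves the diagonal of A, which is identically zero.
trace_value = np.trace(A @ Sigma)
print(np.diag(A), trace_value)
```

The same argument applies to the term involving ∂V^⊤V, which is antisymmetric because V^⊤V = I.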

By taking the derivative of ||S||_∗ w.r.t. S, we obtain

$$ \frac{\partial \|S\|_{*}}{\partial S} =\frac{ tr(\partial{\Sigma})}{\partial S}=\frac{ tr(U^{\top}\partial SV)}{\partial S}= U V^{\top} $$

(28)

which completes the proof. □
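The result in (28) can be sanity-checked with finite differences. The sketch below is an illustration under stated assumptions, not part of the original proof: it uses a random 5 × 3 matrix (full column rank with probability one, so the nuclear norm is differentiable there), the thin SVD from NumPy, and a hypothetical helper `nuclear_norm`. When S is rank-deficient, UV^⊤ is only one element of the subdifferential and this check does not apply.

```python
import numpy as np

def nuclear_norm(S):
    """Sum of the singular values of S."""
    return np.linalg.svd(S, compute_uv=False).sum()

rng = np.random.default_rng(2)
S = rng.standard_normal((5, 3))  # arbitrary full-rank test matrix

# Analytic gradient from (28), using the thin SVD S = U diag(s) V^T.
U, s, Vt = np.linalg.svd(S, full_matrices=False)
grad_analytic = U @ Vt

# Central finite differences, perturbing one entry of S at a time.
eps = 1e-6
grad_numeric = np.zeros_like(S)
for i in range(S.shape[0]):
    for j in range(S.shape[1]):
        E = np.zeros_like(S)
        E[i, j] = eps
        grad_numeric[i, j] = (nuclear_norm(S + E) - nuclear_norm(S - E)) / (2 * eps)

max_abs_diff = np.max(np.abs(grad_analytic - grad_numeric))
print(max_abs_diff)
```

A small `max_abs_diff` indicates that the entrywise numerical derivative matches UV^⊤ on this example.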

About this article

Cite this article

Wang, X., Zhen, X., Li, Q. et al. Cognitive Assessment Prediction in Alzheimer’s Disease by Multi-Layer Multi-Target Regression. Neuroinform 16, 285–294 (2018). https://doi.org/10.1007/s12021-018-9381-1

Published: 25 May 2018
Issue Date: October 2018
DOI: https://doi.org/10.1007/s12021-018-9381-1
