A recursive partitioning approach for subgroup identification in brain–behaviour correlation analysis

  • Doowon Choi
  • Lin Li
  • Hanli Liu
  • Li Zeng
Theoretical advances


In neural correlates studies, the goal is to understand the brain–behaviour relationship, characterized by the correlation between brain activation responses and human behaviour measures. This correlation depends on subject-related covariates such as age and gender, so it is necessary to identify subgroups within the population that have different brain–behaviour correlations. In current practice, subgroups are specified manually, which is inefficient and may overlook covariates whose effects have not been reported in the literature. This study proposes a recursive partitioning approach, called the correlation tree, for automatic subgroup identification in brain–behaviour correlation analysis. In constructing a correlation tree, the split variable at each node is selected through an unbiased variable selection method based on a partial correlation test; the optimal cutpoint of the selected split variable is then determined through exhaustive search under an objective function. Three types of meaningful objective functions are considered to meet various practical needs. Results from simulation and from an application to real optical brain imaging data demonstrate the effectiveness of the proposed approach.


Subgroup identification; Recursive partitioning; Brain–behaviour correlation; Partial correlation; Unbiased variable selection



The authors acknowledge Dr. Mary Cazzell at Cook Children’s Medical Center for her help on data collection.



Copyright information

© Springer-Verlag London Ltd., part of Springer Nature 2019

Authors and Affiliations

  1. Department of Industrial and Systems Engineering, Texas A&M University, College Station, USA
  2. Department of Neurology, David Geffen School of Medicine, University of California at Los Angeles, Los Angeles, USA
  3. Department of Bioengineering, University of Texas at Arlington, Arlington, USA
