
Topological Data Analysis Captures Task-Driven fMRI Profiles in Individual Participants: A Classification Pipeline Based on Persistence


Abstract

BOLD-based fMRI is the most widely used method for studying brain function. The BOLD signal, while valuable, is beset with unique vulnerabilities, most notably a modest signal-to-noise ratio and relatively low temporal and spatial resolution. However, the high-dimensional complexity of the BOLD signal also presents unique opportunities for functional discovery. Topological Data Analysis (TDA), a branch of mathematics optimized to search for specific classes of structure within high-dimensional data, may provide particularly valuable applications. In this investigation, we acquired fMRI data in the anterior cingulate cortex (ACC) using a basic motor control paradigm. Then, for each participant and each of three task conditions, fMRI signals in the ACC were summarized using two methods: (a) TDA-based methods of persistent homology and persistence landscapes, and (b) non-TDA-based methods using a standard vectorization scheme. Finally, using machine learning (with support vector classifiers), the classification accuracy of the TDA and non-TDA vectorized data was tested across participants. In each participant, TDA-based classification outperformed the non-TDA-based counterpart, suggesting that our TDA analytic pipeline better characterized task- and condition-induced structure in fMRI data in the ACC. Our results emphasize the value of TDA in characterizing task- and condition-induced structure in regional fMRI signals. In addition to providing our analytical tools for other users to emulate, we also discuss the unique role that TDA-based methods can play in the study of individual differences in the structure of functional brain signals in the healthy and the clinical brain.


Data Availability

The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.

Notes

  1. https://github.com/catanzaromj/PL_fMRI


Funding

This work was supported by the National Institute of Mental Health (MH059299), the Children’s Foundation of Michigan, Cohen Neuroscience Endowment, the Prechter World Bipolar Foundation, the Lykacki-Young Fund from the State of Michigan, the Miriam Hamburger Endowed Chair of Psychiatry, the Paul and Anita Strauss Endowment, the Donald and Mary Kosch Foundation, the Elliott Luby Endowed Professorship, and the Detroit Wayne County Authority. Additional support was provided by the Southeast Center for Mathematics and Biology, an NSF-Simons Research Center for Mathematics of Complex Biological Systems, under National Science Foundation grant DMS-1764406, a Simons Foundation grant (594594), and the Army Research Laboratory and the Army Research Office (W911NF-18-1-0307). The funding agencies played no role in the analyses or reporting of the data.

Author information


Contributions

MC, SR, and AC performed the data preparation and analysis. MC, SR, PB and VAD led the conceptual design of the analyses. MC, SR, and VAD wrote the main manuscript text and prepared the figures. VAD and DRR secured the research support for the study and designed the experiments. All authors reviewed the manuscript.

Corresponding author

Correspondence to Michael J. Catanzaro.

Ethics declarations

Competing Interests

The authors have declared that no competing interests exist.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary file1 (HTML 3412 KB)

Supplementary file2 (HTML 3856 KB)

Appendix

In this appendix, we describe the mathematical details of our methodology and place persistent homology on a formal footing. Persistent homology calculations are frequently performed with binary arithmetic to increase the speed of calculations and we do the same throughout. In particular, our vector spaces have binary coefficients; so, 1 + 1 = 0.

Homology. Let X denote a cubical complex, e.g., a single stage Fs of a filtered cubical complex as described in the main text. The vector space of k-chains of X, denoted Ck(X), is the vector space with basis given by the set of k-cubes. Thus, C0(X) is the vector space with basis given by the vertices of X, and C1(X) is the vector space spanned by the edges of X. To connect adjacent degrees, we use an algebraic notion of boundary inspired by the geometric version. The boundary of a k-chain is the sum of the (k-1)-dimensional faces of each of its k-cubes. By definition, we set the boundary of a 0-chain to be 0. For example, the boundary of a 1-chain is the sum of the endpoints of each 1-cube within the 1-chain. Hence, the boundary map links together k-chains and (k-1)-chains for all k.

We say a k-chain is a k-cycle if its boundary is 0. Note that a sequence of edges forming a geometric cycle is a 1-cycle because each vertex is a face of two (consecutive) edges and, in our binary arithmetic, 1 + 1 = 0. The set of k-cycles is denoted Zk(X). We say a k-chain is a k-boundary if it is the boundary of a (k + 1)-chain. The set of k-boundaries is denoted Bk(X). A fundamental observation of algebraic topology is that every boundary is also a cycle. That is, the boundary map applied to a k-boundary is zero. Thus, the vector space of boundaries is a sub-vector space of the cycles, and we can form the vector space quotient

$$H_k(X) = Z_k(X) \,/\, B_k(X).$$

The quotient vector space Hk(X) is known as the cubical homology of X in degree k. We provide examples of these definitions in the following paragraph.

For example, a 0-dimensional cycle can simply be a point or vertex; a generic 0-dimensional cycle is a formal algebraic sum of such vertices. In fact, every 0-chain is a 0-cycle, since the boundary of a 0-chain is defined to be 0. If two vertices are the endpoints of a line segment, then the sum of those two vertices is a 0-boundary, and those two vertices correspond to the same element in the quotient H0(X). Generically, vertices may be connected by a sequence of consecutive 1-cubes rather than a single one. Two vertices are homologous, or equivalent, if there is a path in the cubical complex connecting them. The set of all vertices homologous to one another forms a homology class, and every vertex lies in exactly one homology class. Elements of 0-dimensional homology are therefore not vertices themselves, but rather collections of vertices which are homologous to one another. If any two points in the complex can be connected by a path, then every point is homologous to every other, and there is only one such collection of vertices. In this case, the zeroth Betti number b0 = 1, reflecting the fact that there is just one connected component. Alternatively, if the cubical complex consists of N disjoint pieces, then b0 = N. Thus, 0-dimensional homology measures the intuitive notion of being “connected”, meaning the complex consists of a single piece, and the Betti number b0 counts the number of components or disjoint pieces.
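As a concrete illustration of these definitions (a minimal sketch we add here, not part of the published pipeline), b0 of a small complex can be computed directly from its boundary matrix over GF(2); the complex below, a hollow triangle plus an isolated vertex, is hypothetical:

```python
import numpy as np

def rank_mod2(M):
    """Rank of a binary matrix over GF(2) via Gaussian elimination (1 + 1 = 0)."""
    M = M.copy() % 2
    rank = 0
    for col in range(M.shape[1]):
        pivot = next((r for r in range(rank, M.shape[0]) if M[r, col]), None)
        if pivot is None:
            continue
        M[[rank, pivot]] = M[[pivot, rank]]   # move the pivot row into place
        for r in range(M.shape[0]):
            if r != rank and M[r, col]:
                M[r] = (M[r] + M[rank]) % 2   # eliminate using binary arithmetic
        rank += 1
    return rank

# Hypothetical complex: vertices v0..v3, edges (v0,v1), (v1,v2), (v0,v2); v3 is isolated.
# boundary_1[i, j] = 1 iff vertex i is an endpoint of edge j.
boundary_1 = np.array([[1, 0, 1],
                       [1, 1, 0],
                       [0, 1, 1],
                       [0, 0, 0]])

# Every 0-chain is a 0-cycle, so Z0 = C0; B0 is the image of the boundary map on edges.
# Hence b0 = dim Z0 - dim B0 = (#vertices) - rank(boundary_1).
b0 = boundary_1.shape[0] - rank_mod2(boundary_1)
print(b0)  # 2: the triangle forms one component, the isolated vertex another
```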

Stepping up one dimension, we have 1-dimensional cycles, 1-dimensional boundaries, and their quotient, 1-dimensional homology. Examples of 1-dimensional cycles include paths of 1-cubes which start and end at the same 0-cube, and formal linear combinations of such paths. Two 1-dimensional cycles are homologous if their sum is a 1-boundary. If a 1-cycle surrounds a filled-in region, so that it is the boundary of that region, then it is a 1-boundary and is said to be trivial in H1. The Betti number b1 counts the number of non-trivial equivalence classes of 1-cycles.

In Fig. 6, two 1-cycles are highlighted in gray and blue. The gray cycle encloses only voxels which are active within the region. The cycle is therefore the boundary of these voxels, and thus the gray loop is trivial. Since the cubical complex does not contain the omitted (white) squares, the blue cycle is fundamentally different from the gray one: it is non-trivial, since it cannot be expressed as the boundary of a collection of 2-cubes. Any attempt to express it as such a 2-boundary would require the “missing” voxels. The blue cycle is equivalent to many other loops, including the boundary of the two omitted squares. We see this by noting that the difference between the original blue cycle and the boundary of the omitted squares is precisely the set of 2-cubes enclosed between the two cycles. These two cycles are equivalent from a homological perspective, and persistent homology counts them as one and the same, since they both enclose the same topological hole. Thus b1 = 1 for the region displayed in Fig. 6, since there is a single non-trivial cycle. For completeness, b0 = 1, since the region is connected.

Fig. 6

The figure displays a filtered cubical complex. The two-dimensional cells (square voxels in this case) are colored by their signal amplitude, with darker red corresponding to higher signal amplitude and yellow corresponding to lower signal amplitude. White regions correspond to voxels whose signal amplitude does not satisfy the threshold constraint and thus are not within the region. Two loops are highlighted. The gray loop on the left bounds voxels completely contained within the region, so it is a trivial loop and does not contribute to b1. In contrast, the blue loop bounds voxels which are not within the region (the excluded white voxels) and contributes to b1.
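To connect these notions to voxel data, the following sketch (our illustration, using a hypothetical binary mask shaped like the region in Fig. 6) counts connected components with scipy and recovers b1 from the Euler characteristic, which for a two-dimensional cubical complex satisfies chi = V - E + F = b0 - b1:

```python
import numpy as np
from scipy import ndimage

# Hypothetical binary mask mimicking Fig. 6: 1 = voxel in the region, 0 = excluded (white).
mask = np.array([[1, 1, 1, 1, 1],
                 [1, 1, 0, 0, 1],
                 [1, 1, 1, 1, 1]], dtype=int)

# b0: number of connected components of the active voxels (4-connectivity here).
_, b0 = ndimage.label(mask)

# Euler characteristic chi = V - E + F of the cubical complex spanned by the active squares.
F = int(mask.sum())
verts, edges = set(), set()
for i, j in zip(*np.nonzero(mask)):
    c = [(i, j), (i + 1, j), (i, j + 1), (i + 1, j + 1)]   # the four corners of this 2-cube
    verts.update(c)
    edges.update({frozenset({c[0], c[1]}), frozenset({c[0], c[2]}),
                  frozenset({c[1], c[3]}), frozenset({c[2], c[3]})})
chi = len(verts) - len(edges) + F

# For a two-dimensional cubical complex, chi = b0 - b1.
b1 = b0 - chi
print(b0, b1)  # 1 connected component, 1 hole (the two excluded squares)
```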

An Example. We illustrate our topological constructions with an example. A two-dimensional filtered cubical complex is displayed in the interactive display of Online Resource 1, where the color of a square indicates its signal amplitude (filtration) value. Online Resource 1 provides a slider below the figure which allows the user to vary the threshold interactively and see the resulting cubical complex. We compute the persistence diagram for b0 and b1 of this small example. The initial cubical complex, with threshold parameter Bmax = 10, has four connected components and no non-trivial loops; hence, b0 = 4 and b1 = 0. Decreasing the signal amplitude threshold to 9 introduces three additional squares, which in turn changes the resulting topology: two of the four components from the previous threshold are now merged, and a loop has formed in the bottom right. Thus, b0 = 3 and b1 = 1 for a signal threshold of 9. As we continue to decrease the signal threshold, the topology of the cubical complex continues to change. At a signal threshold of 5, the cubical complex becomes connected, so b0 = 1. The loop formed at a threshold of 9, which surrounds a low-amplitude square, lives until the threshold reaches the minimum Bmin = 1, a relatively long duration. This is indicative of a square (or, more generally, a region) of low activation being surrounded by an area of high activation. Such features are highlighted by persistent homology and appear in the persistence diagram as points far from the line y = x.

Our topological analysis of fMRI signals follows the same flow as in the example of Online Resource 1. We begin with a maximum threshold determined by the largest signal amplitude. As we reduce the signal amplitude threshold, we track how the topology of the resulting cubical complex changes. An interactive visualization for the fMRI signal in the ACC is shown in Online Resource 2. Starting from the maximum threshold Bmax = 1900, there are three connected components. As the signal threshold decreases, new connected components are formed and merged together. The persistence diagram summarizes these changes, which are then reflected in the persistence landscape.
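This decreasing-threshold (superlevel-set) filtration can be reproduced with standard TDA software. The sketch below uses GUDHI's cubical complex, an assumption on our part since any cubical persistence implementation would do; the signal is negated because GUDHI filters by sublevel sets, and the array of amplitudes is hypothetical:

```python
import numpy as np
import gudhi  # assumed available; any cubical persistence library could be substituted

# Hypothetical 2D array of signal amplitudes (one value per voxel / 2-cube):
# a low-amplitude center surrounded by high-amplitude voxels, as in the example above.
signal = np.array([[10.,  9.,  2.],
                   [ 9.,  1.,  9.],
                   [ 8.,  9., 10.]])

# The text filters by decreasing the amplitude threshold (superlevel sets).
# GUDHI's CubicalComplex builds sublevel-set filtrations, so we negate the signal.
cc = gudhi.CubicalComplex(top_dimensional_cells=-signal)
diagram = cc.persistence()  # list of (dimension, (birth, death)) pairs

# Convert back to amplitude thresholds by undoing the negation.
# The component that never dies has death = inf (it survives to the minimum threshold).
for dim, (birth, death) in diagram:
    print(f"H{dim}: appears at threshold {-birth:g}, disappears at threshold {-death:g}")
```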

Persistence Landscapes. The persistence diagram provides a useful visualization of the geometric properties of a filtered cubical complex. However, standard statistical analysis applied directly to persistence diagrams is not always straightforward or interpretable. To remedy this, vectorization schemes were developed, allowing persistence diagrams to be embedded into vector spaces where standard statistical tests can be applied. One such vectorization scheme is the persistence landscape. Persistence landscapes provide a convenient, non-parametric vectorization of the diagram with no additional parameters to tune. They are essentially a repackaging of the information in the persistence diagram into a form amenable to arithmetic manipulation. Importantly, they provide a stable avenue for applying the techniques of machine and statistical learning to the output of persistent homology.

A persistence landscape constructs a sequence of piecewise-linear functions λk from a given diagram. Explicitly, for each point (b, d) in the diagram, first define the auxiliary function

$$f_{(b,d)}(t) = \max\left\{0, \min\{t - b,\; d - t\}\right\}.$$

The graph of each f(b,d) is an isosceles triangle whose base is precisely the interval (b, d) and whose height is (d - b)/2. Finally, the persistence landscape is the sequence of functions {λk}, where each λk is a real-valued function defined so that λk(t) is the kth-largest value of the collection of auxiliary functions {f(b,d)(t)}. Therefore, λ1(t) is the largest value of {f(b,d)(t)} for each t, λ2(t) is the second-largest value, and so on.
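The construction just described can be sketched in a few lines of NumPy (our illustration with a hypothetical toy diagram; the published pipeline instead uses the persim package, as noted below): each diagram point contributes a tent function, λk(t) is the kth-largest tent value at t, and sampling each λk on a grid and concatenating the samples yields a single feature vector.

```python
import numpy as np

def persistence_landscape(diagram, grid, k_max=3):
    """Sample the first k_max landscape functions of a diagram on a grid of t-values.

    diagram: iterable of (birth, death) pairs; grid: 1D array of t values.
    Returns an array of shape (k_max, len(grid)) whose row k is lambda_{k+1}.
    """
    diagram = np.asarray(diagram, dtype=float)
    births, deaths = diagram[:, 0][:, None], diagram[:, 1][:, None]
    # Tent functions f_(b,d)(t) = max(0, min(t - b, d - t)), one row per diagram point.
    tents = np.maximum(0.0, np.minimum(grid[None, :] - births, deaths - grid[None, :]))
    # Sort each column in descending order; row k is then the (k+1)-th largest value at t.
    tents_sorted = -np.sort(-tents, axis=0)
    n_pts = tents_sorted.shape[0]
    landscapes = np.zeros((k_max, grid.size))     # lambda_k = 0 when k exceeds the number of points
    landscapes[:min(k_max, n_pts)] = tents_sorted[:min(k_max, n_pts)]
    return landscapes

# Toy diagram with two features and a grid of roughly 1000 sample points.
diagram = [(1.0, 5.0), (2.0, 3.0)]
grid = np.linspace(0.0, 6.0, 1000)
lam = persistence_landscape(diagram, grid, k_max=2)

# Concatenating the sampled landscape functions yields one feature vector per diagram.
feature_vector = lam.ravel()
print(feature_vector.shape)  # (2000,)
```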

As functions can be added, subtracted, and multiplied by real scalars, they naturally exhibit a vector space structure. Moreover, it is easy to construct the mean persistence landscape of a given set of persistence landscapes (whereas a mean persistence diagram might fail to exist). This allows us to easily perform statistical tests (e.g., permutation tests) on a set of labeled persistence landscapes which would be much more difficult on the underlying diagrams.

The conversion of a persistence diagram to a persistence landscape just described is shown in Fig. 2. Starting at Fig. 2a, left, we see a simple persistence diagram. The vectorization procedure begins in Fig. 2a, second from left, by drawing horizontal and vertical lines from each point in the diagram to the diagonal line y = x. The entire picture is rotated in Fig. 2a, middle. We take outer envelopes of the dashed piecewise-linear lines of Fig. 2a, middle, to construct a sequence of functions λk, k ≥ 1. Thus, λ1 consists of the outer-most envelope, λ2 consists of the second outer-most envelope, and so forth (Fig. 2a, right). The entire sequence of functions λk is the persistence landscape. In practice, each of these functions is sampled at a large number (roughly 1000) of points, and the samples are concatenated to construct a single vector. We use the persim package of scikit-tda to compute persistence landscapes. The procedure is the same for actual experimental data, albeit the output is more complicated (Fig. 2b).

Statistical Testing

We now describe in detail our labelled permutation test and support vector classifier (SVC) which comprise the main statistical tests. In addition, we also performed LASSO tests on the data.

Permutation Tests. We perform a total of six labelled permutation tests, each specified by a pair of task conditions from (Periodic, Random, Rest) and a homological degree (0 or 1). The first step in each test is to construct the baseline significance. This is done by computing the average landscape of each of the selected conditions individually, remembering that landscapes can be treated as vectors in a vector space where arithmetic operations are well-defined. The baseline significance is the supremum norm of the difference of the averages. The second step is to perform a stratified shuffle of the labelling on the original landscapes, so that the correct ratio is maintained between the labels. For each shuffle, the above procedure is repeated: compute the average landscapes with respect to the shuffled labelling and compute the supremum norm of their difference. If the supremum norm from a shuffled labelling is greater than the baseline significance, we deem the shuffled labelling to be significant. After 1500 shuffles, the p-value of the test is the number of significant shuffles divided by 1500. These p-values are reported in Fig. 3.
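A minimal NumPy sketch of this labelled permutation test (our illustration with synthetic landscape vectors; the variable names are hypothetical) is:

```python
import numpy as np

rng = np.random.default_rng(0)

def permutation_test(landscapes, labels, n_shuffles=1500):
    """Labelled permutation test on landscape vectors, following the procedure above.

    landscapes: array of shape (n_samples, n_features); labels: binary array (two conditions).
    Returns the permutation p-value.
    """
    landscapes = np.asarray(landscapes, dtype=float)
    labels = np.asarray(labels)

    def sup_norm_of_mean_difference(lab):
        mean_a = landscapes[lab == 0].mean(axis=0)
        mean_b = landscapes[lab == 1].mean(axis=0)
        return np.max(np.abs(mean_a - mean_b))   # supremum norm of the difference of averages

    baseline = sup_norm_of_mean_difference(labels)
    count = 0
    for _ in range(n_shuffles):
        shuffled = rng.permutation(labels)       # permuting labels preserves the label ratio
        if sup_norm_of_mean_difference(shuffled) > baseline:
            count += 1
    return count / n_shuffles

# Toy data: 20 landscape vectors per condition, 50 features each.
X = np.vstack([rng.normal(0.0, 1.0, (20, 50)), rng.normal(0.5, 1.0, (20, 50))])
y = np.array([0] * 20 + [1] * 20)
print(permutation_test(X, y))
```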

Support Vector Classifiers. The support vector classifiers also depend on a pair of task conditions; however, for this test the landscapes are concatenated across homological degrees. In particular, if λ0 and λ1 are the landscape vectors in degrees 0 and 1, respectively, then the vector λtot = (λ0, λ1) is used for construction of the SVC. Note that since λtot contains both λ0 and λ1, there is no loss of topological information in the concatenation. Each SVC is validated using tenfold cross-validation, a commonly used technique in machine learning. For each pair of test conditions, only 90% of the labelled data is used in the construction of the SVC. The remaining 10% is used to test the validity of the SVC in terms of its classification accuracy: we count how many of the remaining 10% of the data points were correctly classified. This process is repeated over 10 folds, so that at the end, the SVC has been tested against every data point exactly once. The classification accuracies from the 10 folds are averaged, and the results are displayed in Figs. 4 and 5.
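A sketch of this validation using scikit-learn (synthetic data and placeholder hyper-parameters; the actual values were chosen by the grid search described next) might look like:

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Hypothetical concatenated landscape vectors lambda_tot = (lambda_0, lambda_1)
# for two task conditions, 30 samples per condition.
lambda_0 = rng.normal(size=(60, 1000))
lambda_1 = rng.normal(size=(60, 1000))
X = np.hstack([lambda_0, lambda_1])   # concatenation across homological degrees
y = np.array([0] * 30 + [1] * 30)     # condition labels

# Tenfold cross-validation: train on 90% of the data, test on the held-out 10%,
# repeated so every sample is tested exactly once; report the mean accuracy.
clf = SVC(C=1.0)                      # placeholder hyper-parameters
scores = cross_val_score(clf, X, y, cv=10, scoring="accuracy")
print(scores.mean())
```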

The hyper-parameters of each SVC were determined by performing a standard grid search with fivefold cross-validation over the regularization parameter (typically denoted ‘C’) and the kernel type (‘linear’, polynomial ‘poly’, or ‘sigmoid’). For the polynomial kernel, degrees less than 4 were also included in the search.
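In scikit-learn terms, such a search could be set up as follows (the specific values of C and the polynomial degrees searched are our assumptions, not the ones reported here):

```python
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Hypothetical grid mirroring the description above: regularization strength C,
# kernel type, and polynomial degrees below 4.
param_grid = [
    {"kernel": ["linear", "sigmoid"], "C": [0.1, 1.0, 10.0]},
    {"kernel": ["poly"], "degree": [1, 2, 3], "C": [0.1, 1.0, 10.0]},
]

search = GridSearchCV(SVC(), param_grid, cv=5, scoring="accuracy")
# search.fit(X, y)            # X, y as in the previous sketch
# print(search.best_params_)  # hyper-parameters selected by fivefold cross-validation
```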

LASSO. We also tested our topological approach through the lens of a LASSO classifier. LASSO (least absolute shrinkage and selection operator) regression is a statistical method that performs variable selection by enforcing an L1 regularization penalty in its optimization routine. This leads to enhanced interpretability of the results and an improved summary of the data; in particular, LASSO provides built-in feature selection, which is useful for analyzing high-dimensional data. Figure 7 displays the classification accuracies of a LASSO classifier applied to our persistence landscape (TDA-based) summary versus our non-TDA vectorized approach. The TDA-based approach outperforms the non-TDA vectorization in almost every case. More generally, however, LASSO underperforms the SVC pipeline (TDA and vectorization; see Fig. 4). This is unsurprising, because a LASSO classifier is a linear classifier and thus a simpler statistical model than an SVC with a non-linear kernel.
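The implementation of the LASSO classifier is not specified here; one common way to obtain a LASSO-style classifier is an L1-penalized logistic regression, sketched below with synthetic data as an illustration:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Hypothetical landscape feature vectors and condition labels, as in the SVC sketch.
X = rng.normal(size=(60, 2000))
y = np.array([0] * 30 + [1] * 30)

# Logistic regression with an L1 penalty shrinks most coefficients to exactly zero,
# giving the built-in feature selection associated with LASSO.
lasso_clf = LogisticRegression(penalty="l1", solver="liblinear", C=1.0, max_iter=1000)
scores = cross_val_score(lasso_clf, X, y, cv=10, scoring="accuracy")
print(scores.mean())

# The surviving (non-zero) coefficients indicate which landscape features drive the decision.
lasso_clf.fit(X, y)
print(np.count_nonzero(lasso_clf.coef_))
```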

Fig. 7

Heat maps of LASSO classification accuracies for the TDA and non-TDA vectorized approaches (darker red indicates greater accuracy). a The non-TDA vectorization LASSO classification accuracies were above average (median accuracy = 59%). b The LASSO accuracies resulting from the TDA pipeline were higher (median accuracy = 62%) but still underperformed our SVC results. There were no discernible trends in comparing the multiclass, three-way classifier in either case. Comparing with Fig. 4, we find that SVCs provide a stronger predictor than LASSO.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Catanzaro, M.J., Rizzo, S., Kopchick, J. et al. Topological Data Analysis Captures Task-Driven fMRI Profiles in Individual Participants: A Classification Pipeline Based on Persistence. Neuroinform 22, 45–62 (2024). https://doi.org/10.1007/s12021-023-09645-3

