Brain Imaging and Behavior, Volume 11, Issue 1, pp 253–263

Decoding power-spectral profiles from FMRI brain activities during naturalistic auditory experience

  • Xintao Hu
  • Lei Guo
  • Junwei Han
  • Tianming Liu
Original Research


Recent studies have demonstrated a close relationship between computational acoustic features and brain activity, and have substantially advanced our understanding of auditory information processing in the human brain. Along this line, we conducted a multidisciplinary study to examine whether power spectral density (PSD) profiles can be decoded from brain activity during naturalistic auditory experience. The study was performed on a high-resolution functional magnetic resonance imaging (fMRI) dataset acquired while participants freely listened to the audio description of the movie “Forrest Gump”. Representative PSD profiles in the audio movie were identified by clustering the audio samples according to their PSD descriptors. Support vector machine (SVM) classifiers were then trained to differentiate the representative PSD profiles from the corresponding fMRI brain activity. Based on the PSD profile decoding, we explored how neural decodability correlates with power intensity deviants and frequency deviants. Our experimental results demonstrate that PSD profiles can be reliably decoded from brain activity. They also suggest a sigmoidal relationship between neural decodability and the power intensity deviants of PSD profiles. In addition, our study substantiates the feasibility and advantage of the naturalistic paradigm for studying the neural encoding of complex auditory information.
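As a rough illustration of what a PSD descriptor for one audio sample might look like, the sketch below averages Hann-windowed periodograms over overlapping segments (a Welch-style estimate). The abstract does not specify the estimator, segment length, or overlap used in the study, so the function names `welch_psd` and `demo_tone`, the parameters `seg_len` and `overlap`, and the synthetic test tone are all assumptions made for this example, not the authors' implementation.

```python
import cmath
import math


def welch_psd(samples, seg_len=64, overlap=0.5):
    """Welch-style PSD estimate: average Hann-windowed periodograms.

    seg_len and overlap are illustrative defaults, not values from the study.
    """
    step = max(1, int(seg_len * (1.0 - overlap)))
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (seg_len - 1))
              for n in range(seg_len)]
    win_power = sum(w * w for w in window)
    n_bins = seg_len // 2 + 1  # non-negative frequencies only
    psd = [0.0] * n_bins
    n_segments = 0
    for start in range(0, len(samples) - seg_len + 1, step):
        seg = [samples[start + n] * window[n] for n in range(seg_len)]
        for k in range(n_bins):
            # Direct DFT of the windowed segment at frequency bin k.
            coeff = sum(seg[n] * cmath.exp(-2j * math.pi * k * n / seg_len)
                        for n in range(seg_len))
            psd[k] += abs(coeff) ** 2 / win_power
        n_segments += 1
    if n_segments == 0:
        raise ValueError("signal is shorter than one segment")
    return [p / n_segments for p in psd]


def demo_tone(freq_bin=8, seg_len=64, n_samples=512):
    """Hypothetical test signal: a sinusoid centred on one frequency bin."""
    return [math.sin(2 * math.pi * freq_bin * n / seg_len)
            for n in range(n_samples)]


profile = welch_psd(demo_tone())
peak_bin = max(range(len(profile)), key=profile.__getitem__)
```

Here `profile` is a 33-element power vector and `peak_bin` identifies the dominant frequency bin of the test tone. In a pipeline like the one the abstract outlines, vectors of this kind would be the inputs to the clustering step, and the resulting cluster labels would be the targets for the SVM classifiers trained on fMRI activity.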


Power-spectral profile; fMRI brain decoding; Auditory intensity-encoding; Frequency-encoding; Naturalistic paradigm


Compliance with ethical standards


This study was funded by the National Natural Science Foundation of China (NSFC; grants 61103061, 61333017, 61473234 and 61522207) and the Fundamental Research Funds for the Central Universities (grant 3102014JCQ01065).

Conflict of Interest

All co-authors have seen and agreed with the contents of the manuscript. We have no relevant conflicts of interest.

Ethical approval

This article does not contain any studies with human participants performed by any of the authors.



Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  1. School of Automation, Northwestern Polytechnical University, Xi’an, China
  2. Cortical Architecture Imaging and Discovery Lab, Department of Computer Science and Bioimaging Research Center, The University of Georgia, Athens, USA
