Text to Brain: Predicting the Spatial Distribution of Neuroimaging Observations from Text Reports

  • Jérôme Dockès
  • Demian Wassermann
  • Russell Poldrack
  • Fabian Suchanek
  • Bertrand Thirion
  • Gaël Varoquaux
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11072)


Despite the digital nature of magnetic resonance imaging, the resulting observations are most frequently reported and stored in text documents. There is a trove of information untapped in medical health records, case reports, and medical publications. In this paper, we propose to mine brain medical publications to learn the spatial distribution associated with anatomical terms. The problem is formulated as the minimization of a risk on distributions, which leads to a least-deviation cost function. An efficient algorithm in the dual then learns the mapping from documents to brain structures. Empirical results using coordinates extracted from the brain-imaging literature show that (i) models must adapt to semantic variation in the terms used to describe a given anatomical structure, (ii) voxel-wise parameterization leads to a higher likelihood of locations reported in unseen documents, and (iii) a least-deviation cost outperforms a least-squares cost. As a proof of concept for our method, we use our model of spatial distributions to predict the distribution of specific neurological conditions from text-only reports.
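To make the contrast between a least-deviation cost and a least-squares cost concrete, here is a minimal sketch on a toy four-voxel grid. It assumes the risk on distributions is measured as the total variation distance between discretized distributions, which is half their absolute (L1) deviation. That choice is an assumption for illustration, since the abstract only states that the risk leads to a least-deviation cost. The function names and the toy values are hypothetical and are not taken from the paper.

```python
import numpy as np


def total_variation(p, q):
    """Total variation distance between two discrete distributions
    defined on the same voxel grid: half the sum of absolute deviations."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return 0.5 * np.abs(p - q).sum()


def squared_error(p, q):
    """Sum of squared deviations, shown for comparison with the L1-based cost."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return ((p - q) ** 2).sum()


# Toy data (hypothetical): probability mass of reported locations over 4 voxels.
observed = np.array([0.7, 0.2, 0.1, 0.0])
# Two candidate predictions: one close to the observation, one uniform.
pred_close = np.array([0.6, 0.3, 0.1, 0.0])
pred_uniform = np.array([0.25, 0.25, 0.25, 0.25])

tv_close = total_variation(observed, pred_close)
tv_uniform = total_variation(observed, pred_uniform)
se_close = squared_error(observed, pred_close)
se_uniform = squared_error(observed, pred_uniform)

print("least-deviation cost:", tv_close, tv_uniform)
print("least-squares cost:  ", se_close, se_uniform)
```

Both costs rank the closer prediction first on this toy example. The sketch only shows how each cost is computed; it does not reproduce the paper's comparison on real data or its dual optimization algorithm.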



This project received funding from: the European Union’s H2020 Research Programme under Grant Agreement No. 785907 (HBP SGA2), the Metacog Digiteo project, the MetaMRI associate team, and ERC NeuroLang.

Supplementary material

Supplementary material 1: 473976_1_En_67_MOESM1_ESM.pdf (PDF, 362 KB)



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Jérôme Dockès (1)
  • Demian Wassermann (1)
  • Russell Poldrack (2)
  • Fabian Suchanek (3)
  • Bertrand Thirion (1)
  • Gaël Varoquaux (1)

  1. INRIA, CEA, Université Paris-Saclay, Paris, France
  2. Stanford University, Stanford, USA
  3. Télécom ParisTech, Paris, France
