Text to Brain: Predicting the Spatial Distribution of Neuroimaging Observations from Text Reports
Despite the digital nature of magnetic resonance imaging, the resulting observations are most frequently reported and stored in text documents. There is a trove of untapped information in health records, case reports, and medical publications. In this paper, we propose to mine brain-imaging publications to learn the spatial distribution associated with anatomical terms. The problem is formulated as minimization of a risk on distributions, which leads to a least-deviation cost function. An efficient algorithm in the dual then learns the mapping from documents to brain structures. Empirical results using coordinates extracted from the brain-imaging literature show that (i) models must adapt to semantic variation in the terms used to describe a given anatomical structure, (ii) voxel-wise parameterization leads to higher likelihood of locations reported in unseen documents, and (iii) the least-deviation cost outperforms least squares. As a proof of concept for our method, we use our model of spatial distributions to predict the distribution of specific neurological conditions from text-only reports.
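To illustrate the role of the least-deviation cost, the following is a minimal, hypothetical sketch: it fits a linear map from bag-of-words document features to 3D coordinates by subgradient descent on the L1 objective. All variable names and the synthetic data are illustrative assumptions; the paper itself solves the problem with an efficient algorithm in the dual, which this toy loop does not reproduce.

```python
import numpy as np

# Synthetic stand-in for the real data: term counts per document (X)
# and reported stereotactic coordinates (Y). Laplace noise makes the
# L1 (least-deviation) objective a natural fit.
rng = np.random.default_rng(0)
n_docs, n_terms = 200, 30
X = rng.poisson(1.0, size=(n_docs, n_terms)).astype(float)  # term counts
W_true = rng.normal(size=(n_terms, 3))                      # underlying mapping
Y = X @ W_true + rng.laplace(scale=0.5, size=(n_docs, 3))   # noisy (x, y, z)

def l1_loss(W):
    """Mean absolute deviation between predicted and reported coordinates."""
    return np.abs(X @ W - Y).mean()

# Subgradient descent on the least-deviation objective.
W = np.zeros((n_terms, 3))
lr = 1e-3
for _ in range(2000):
    grad = X.T @ np.sign(X @ W - Y) / n_docs
    W -= lr * grad
```

Unlike least squares, the L1 cost is robust to the heavy-tailed coordinate errors that occur when a term is occasionally reported far from its typical location.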
This project received funding from: the European Union’s H2020 Research Programme under Grant Agreement No. 785907 (HBP SGA2), the Metacog Digiteo project, the MetaMRI associate team, and ERC NeuroLang.