Mapping Heterogeneous Textual Data: A Multidimensional Approach Based on Spatiality and Theme
In this paper, we propose a multidimensional mapping approach for heterogeneous textual data that exploits first the spatial dimension and then the thematic dimension. Building on the Spatial Textual Representation (STR) and the Geodict geographic database, our contribution integrates the thematic dimension of documents. To support this proposal, we evaluate the different aspects of the mapping process on two real corpora, one of which is highly heterogeneous.
Keywords: Text mining, Spatial and thematic dimensions, Heterogeneous data