Mapping Heterogeneous Textual Data: A Multidimensional Approach Based on Spatiality and Theme

  • Jacques Fize
  • Mathieu Roche
  • Maguelonne Teisseire
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11938)

Abstract

In this paper, we propose a multidimensional mapping approach for heterogeneous textual data that exploits first the spatial dimension and then the thematic dimension. Building on the Spatial Textual Representation (STR) and the Geodict geographic database, our contribution integrates the thematic dimension of documents. To support this proposal, we evaluate the different aspects of the process on two real corpora, one of which is highly heterogeneous.

Keywords

Text mining, Spatial and thematic dimensions, Heterogeneous data

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Jacques Fize (1, 2), corresponding author
  • Mathieu Roche (1, 2)
  • Maguelonne Teisseire (2)
  1. CIRAD, UMR TETIS, Montpellier, France
  2. TETIS, Univ Montpellier, AgroParisTech, CIRAD, CNRS, IRSTEA, Montpellier, France
