Soft Textual Cartography Based on Topic Modeling and Clustering of Irregular, Multivariate Marked Networks

  • Conference paper
Complex Networks & Their Applications VI (COMPLEX NETWORKS 2017)

Part of the book series: Studies in Computational Intelligence ((SCI,volume 689))

Abstract

Soft textual cartography is an original approach for studying communities on spatially embedded, textually defined, complex weighted networks. The approach integrates topic modeling and soft clustering procedures. The two can be combined through topic distances and weighted undirected networks representing the spatial configuration; their synergy is promising for topic interpretation and geographical information retrieval. This paper proposes a unified formalism underlining the compatibility of the two aspects, illustrated on the textual descriptions of the municipalities of the canton of Vaud, Switzerland. It also points to possible extensions and applications of the method, potentially useful for dealing with the ever-growing amount of georeferenced textual content.
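
As a rough illustration of how topic distances and a weighted, undirected spatial network might be combined, the sketch below compares the edge-weighted mean topic distance between neighbouring regions with the mean topic distance over all pairs of regions. The region names, topic distributions, and edge weights are invented toy values, and both the squared Euclidean distance and the local/global ratio are assumptions made for this example, not the formalism of the paper.

```python
from itertools import combinations

# Hypothetical topic distributions (3 topics) for four regions; each row sums to 1.
topics = {
    "A": [0.7, 0.2, 0.1],
    "B": [0.6, 0.3, 0.1],
    "C": [0.1, 0.2, 0.7],
    "D": [0.2, 0.1, 0.7],
}

# Hypothetical symmetric edge weights of an undirected spatial network.
edges = {("A", "B"): 1.0, ("B", "C"): 0.5, ("C", "D"): 1.0}


def topic_distance(p, r):
    """Squared Euclidean distance between two topic distributions (an assumed choice)."""
    return sum((a - b) ** 2 for a, b in zip(p, r))


def local_mean_distance(topics, edges):
    """Mean topic distance over neighbouring pairs, weighted by edge weight."""
    total_weight = sum(edges.values())
    weighted_sum = sum(
        w * topic_distance(topics[i], topics[j]) for (i, j), w in edges.items()
    )
    return weighted_sum / total_weight


def global_mean_distance(topics):
    """Unweighted mean topic distance over all distinct pairs of regions."""
    pairs = list(combinations(topics, 2))
    return sum(topic_distance(topics[i], topics[j]) for i, j in pairs) / len(pairs)


local_d = local_mean_distance(topics, edges)
global_d = global_mean_distance(topics)

# A ratio below 1 means neighbours are, on average, closer in topic space
# than arbitrary pairs of regions.
ratio = local_d / global_d
print(f"local={local_d:.3f} global={global_d:.3f} ratio={ratio:.3f}")
```

In this toy configuration the ratio is below 1, meaning that neighbouring regions tend to share topics. A soft clustering procedure could exploit that kind of signal, but the measure used here is only a stand-in.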

Notes

  1. The value of q has been selected after computing several topic models with q varying from 9 to 36. With \(q=12\) we had a relatively small number of interesting topics apt to illustrate our method.

Author information

Correspondence to Mattia Egloff or Raphaël Ceré.

Copyright information

© 2018 Springer International Publishing AG

About this paper

Cite this paper

Egloff, M., Ceré, R. (2018). Soft Textual Cartography Based on Topic Modeling and Clustering of Irregular, Multivariate Marked Networks. In: Cherifi, C., Cherifi, H., Karsai, M., Musolesi, M. (eds) Complex Networks & Their Applications VI. COMPLEX NETWORKS 2017. Studies in Computational Intelligence, vol 689. Springer, Cham. https://doi.org/10.1007/978-3-319-72150-7_59

  • DOI: https://doi.org/10.1007/978-3-319-72150-7_59

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-72149-1

  • Online ISBN: 978-3-319-72150-7

  • eBook Packages: Engineering, Engineering (R0)
