Visualising a Text with a Tree Cloud

  • Philippe Gambette
  • Jean Véronis
Conference paper
Part of the Studies in Classification, Data Analysis, and Knowledge Organization book series (STUDIES CLASS)


Tag clouds have gained popularity on the internet as a way to provide a quick overview of the content of a website or a text. We introduce a new visualisation which displays more information: the tree cloud. Like a word cloud, it shows the most frequent words of the text, with the size of each word reflecting its frequency, but the words are arranged on a tree to reflect their semantic proximity according to the text. Such tree clouds help identify the main topics of a document, and can even be used for text analysis. We also provide methods to evaluate the quality of the obtained tree cloud, and some key steps of its construction. Our algorithms are implemented in the free software TreeCloud available at
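As an illustration of the first two ingredients described above (frequent words and their proximity in the text), here is a minimal Python sketch. It is not the TreeCloud implementation: the function names, the sliding-window size, and the Jaccard-style distance are assumptions chosen for illustration only. The resulting distance table is the kind of input a tree reconstruction method would then turn into a tree.

```python
from collections import Counter
from itertools import combinations
import re


def frequent_words(text, top_n=5, stopwords=frozenset()):
    """Tokenise the text and return (tokens, top_n most frequent words with counts)."""
    tokens = [w for w in re.findall(r"[a-z']+", text.lower()) if w not in stopwords]
    return tokens, Counter(tokens).most_common(top_n)


def cooccurrence_distances(tokens, words, window=10):
    """Hypothetical distance: 1 - Jaccard index of the sliding windows containing each word."""
    words = set(words)
    occ = Counter()    # number of windows containing each word
    joint = Counter()  # number of windows containing both words of a pair
    n_windows = max(1, len(tokens) - window + 1)
    for start in range(n_windows):
        present = sorted(words.intersection(tokens[start:start + window]))
        occ.update(present)
        joint.update(combinations(present, 2))
    dist = {}
    for a, b in combinations(sorted(words), 2):
        union = occ[a] + occ[b] - joint[(a, b)]
        dist[(a, b)] = 1.0 - (joint[(a, b)] / union if union else 0.0)
    return dist


if __name__ == "__main__":
    sample = "tree cloud tree cloud word"
    tokens, top = frequent_words(sample, top_n=2)
    print(top)  # word counts, which would drive the font sizes
    print(cooccurrence_distances(tokens, [w for w, _ in top], window=2))
```

In this sketch the counts would set the font size of each word, and the pairwise distances (keyed by alphabetically ordered word pairs) would determine how close two words end up in the tree.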


Frequent Word, Semantic Distance, Tree Distance, Word Cloud, Semantic Proximity





Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  1. L.I.R.M.M., UMR CNRS 5506, Université Montpellier 2, Montpellier, France
