Visualizing Document Content

Introduction to Text Visualization

Part of the book series: Atlantis Briefs in Artificial Intelligence (ABAI, volume 1)

Abstract

Text consists primarily of words and is meant to convey information. Content analysis is the earliest established method of text analysis (Holsti et al., The handbook of social psychology, vol 2, pp 596–692, 1968 [55]). Although linguists have studied text extensively and systematically, the related disciplines can be roughly divided into two categories, structure and substance, according to their subjects of study (Ansari, Dimensions in discourse: elementary to essentials. Xlibris Corporation, Bloomington, 2013 [9]). Structure concerns the surface characteristics that are visible in a valid text, such as word co-occurrence, text reuse, and grammatical structure. Substance, on the other hand, is the umbrella term for all information that must be inferred from text, such as fingerprinting, topics, and events. Various techniques have been proposed to analyze these aspects. In this chapter, we briefly review these techniques and the corresponding visualization systems.

Notes

  1. https://www.flickr.com/photos/tags.

References

  1. Picapica. www.picapica.org (2014). [Online; accessed Jan. 2016]

  2. Abbasi, A., Chen, H.: Applying authorship analysis to extremist-group web forum messages. Intelligent Systems, IEEE 20(5), 67–75 (2005)

  3. Abbasi, A., Chen, H.: Visualizing authorship for identification. In: Intelligence and Security Informatics, pp. 60–71. Springer (2006)

  4. Aggarwal, C.C., Zhai, C.: A survey of text clustering algorithms. In: Mining Text Data, pp. 77–128. Springer (2012)

  5. Aiden, E.L., Michel, J.B.: What we learned from 5 million books. https://www.ted.com/talks/what_we_learned_from_5_million_books (2011). [Online; accessed Jan. 2016]

  6. Allan, J.: Topic detection and tracking: event-based information organization, vol. 12. Springer Science & Business Media (2012)

  7. Allan, J., Carbonell, J.G., Doddington, G., Yamron, J., Yang, Y.: Topic detection and tracking pilot study final report (1998)

  8. Allan, J., Papka, R., Lavrenko, V.: On-line new event detection and tracking. In: Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval, pp. 37–45. ACM (1998)

  9. Ansari, T.: Dimensions in Discourse: Elementary to Essentials. Xlibris Corporation (2013)

  10. Asuncion, A., Welling, M., Smyth, P., Teh, Y.W.: On smoothing and inference for topic models. In: Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence, pp. 27–34. AUAI Press (2009)

  11. Avidan, S., Shamir, A.: Seam carving for content-aware image resizing. In: ACM Transactions on graphics (TOG), vol. 26, p. 10. ACM (2007)

  12. Bateman, S., Gutwin, C., Nacenta, M.: Seeing things in the clouds: the effect of visual features on tag cloud selections. Proceedings of the nineteenth ACM conference on Hypertext and hypermedia 4250, 193–202 (2008). doi:10.1145/1379092.1379130. URL http://portal.acm.org/citation.cfm?id=1379130

  13. Blei, D.M., Lafferty, J.D.: Dynamic topic models. In: Proceedings of the 23rd international conference on Machine learning, pp. 113–120. ACM (2006)

  14. Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent Dirichlet allocation. Journal of Machine Learning Research 3, 993–1022 (2003)

  15. Burch, M., Lohmann, S., Beck, F., Rodriguez, N., Di Silvestro, L., Weiskopf, D.: Radcloud: Visualizing multiple texts with merged word clouds. In: Information Visualisation (IV), 2014 18th International Conference on, pp. 108–113. IEEE (2014)

  16. Byron, L., Wattenberg, M.: Stacked graphs-geometry & aesthetics. Visualization and Computer Graphics, IEEE Transactions on 14(6), 1245–1252 (2008)

  17. Cai, D., He, X., Han, J.: Locally consistent concept factorization for document clustering. Knowledge and Data Engineering, IEEE Transactions on 23(6), 902–913 (2011)

  18. Cao, N., Gotz, D., Sun, J., Lin, Y.R., Qu, H.: SolarMap: Multifaceted Visual Analytics for Topic Exploration. 2011 IEEE 11th International Conference on Data Mining pp. 101–110 (2011). doi:10.1109/ICDM.2011.135. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=6137214

  19. Cao, N., Sun, J., Lin, Y.R., Gotz, D., Liu, S., Qu, H.: FacetAtlas: Multifaceted visualization for rich text corpora. IEEE Transactions on Visualization and Computer Graphics 16(6), 1172–1181 (2010). doi:10.1109/TVCG.2010.154

  20. Cavnar, W.B., Trenkle, J.M., et al.: N-gram-based text categorization. Ann Arbor MI 48113(2), 161–175 (1994)

  21. Chakrabarti, D., Kumar, R., Tomkins, A.: Evolutionary clustering. In: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 554–560. ACM (2006)

  22. Chi, M.T., Lin, S.S., Chen, S.Y., Lin, C.H., Lee, T.Y.: Morphable word clouds for time-varying text data visualization. Visualization and Computer Graphics, IEEE Transactions on 21(12), 1415–1426 (2015)

  23. Chi, Y., Song, X., Zhou, D., Hino, K., Tseng, B.L.: Evolutionary spectral clustering by incorporating temporal smoothness. In: Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 153–162. ACM (2007)

  24. Choo, J., Lee, C., Reddy, C.K., Park, H.: UTOPIAN: User-driven topic modeling based on interactive nonnegative matrix factorization. IEEE Transactions on Visualization and Computer Graphics 19(12), 1992–2001 (2013). doi:10.1109/TVCG.2013.212

  25. Chuang, J., Manning, C.D., Heer, J.: Termite: Visualization techniques for assessing textual topic models. In: Proceedings of the International Working Conference on Advanced Visual Interfaces, pp. 74–77. ACM (2012)

  26. Collins, C., Viegas, F.B., Wattenberg, M.: Parallel tag clouds to explore and analyze faceted text corpora. In: Visual Analytics Science and Technology, 2009. VAST 2009. IEEE Symposium on, pp. 91–98. IEEE (2009)

  27. Covington, M.A.: A fundamental algorithm for dependency parsing. In: Proceedings of the 39th annual ACM southeast conference, pp. 95–102. Citeseer (2001)

  28. Cox, T.F., Cox, M.A.: Multidimensional scaling. CRC press (2000)

  29. Cui, W., Liu, S., Tan, L., Shi, C., Song, Y., Gao, Z., Tong, X., Qu, H.: Textflow: Towards better understanding of evolving topics in text. IEEE Transactions on Visualization and Computer Graphics 17(12), 2412–2421 (2011). doi:10.1109/TVCG.2011.239

  30. Cui, W., Liu, S., Tan, L., Shi, C., Song, Y., Gao, Z.J., Qu, H., Tong, X.: Textflow: Towards better understanding of evolving topics in text. Visualization and Computer Graphics, IEEE Transactions on 17(12), 2412–2421 (2011)

  31. Cui, W., Liu, S., Wu, Z., Wei, H.: How Hierarchical Topics Evolve in Large Text Corpora. IEEE Transactions on Visualization and Computer Graphics 20(12), 2281–2290 (2014). doi:10.1109/TVCG.2014.2346433. URL http://research.microsoft.com/en-us/um/people/weiweicu/images/roseriver.pdf

  32. Cui, W., Wu, Y., Liu, S., Wei, F., Zhou, M., Qu, H.: Context-preserving, dynamic word cloud visualization. IEEE Computer Graphics and Applications 30(6), 42–53 (2010). doi:10.1109/MCG.2010.102

  33. Culy, C., Lyding, V.: Double tree: an advanced kwic visualization for expert users. In: Information Visualisation (IV), 2010 14th International Conference, pp. 98–103. IEEE (2010)

  34. Daud, A., Li, J., Zhou, L., Muhammad, F.: Knowledge discovery through directed probabilistic topic models: a survey. Frontiers of computer science in China 4(2), 280–301 (2010)

  35. De Vel, O., Anderson, A., Corney, M., Mohay, G.: Mining e-mail content for author identification forensics. ACM Sigmod Record 30(4), 55–64 (2001)

  36. Deerwester, S., Dumais, S.T., Furnas, G.W., Landauer, T.K., Harshman, R.: Indexing by latent semantic analysis. Journal of the American society for information science 41(6), 391 (1990)

  37. Don, A., Zheleva, E., Gregory, M., Tarkan, S., Auvil, L., Clement, T., Shneiderman, B., Plaisant, C.: Discovering interesting usage patterns in text collections: integrating text mining with visualization. Main pp. 213–221 (2007). doi:10.1145/1321440.1321473. URL http://portal.acm.org/citation.cfm?id=1321473

  38. Dörk, M., Gruen, D., Williamson, C., Carpendale, S.: A visual backchannel for large-scale events. Visualization and Computer Graphics, IEEE Transactions on 16(6), 1129–1138 (2010)

  39. Dou, W., Wang, X., Skau, D., Ribarsky, W., Zhou, M.X.: LeadLine: Interactive visual analysis of text data through event identification and exploration. Visual Analytics Science and Technology (VAST), 2012 IEEE Conference on pp. 93–102 (2012). doi:10.1109/VAST.2012.6400485. URL http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6400485

  40. Dou, W., Yu, L., Wang, X., Ma, Z., Ribarsky, W.: Hierarchicaltopics: Visually exploring large text collections using topic hierarchies. Visualization and Computer Graphics, IEEE Transactions on 19(12), 2002–2011 (2013)

  41. Firth, J.R.: A synopsis of linguistic theory, 1930-1955 (1957)

  42. Forsati, R., Mahdavi, M., Shamsfard, M., Meybodi, M.R.: Efficient stochastic algorithms for document clustering. Information Sciences 220, 269–291 (2013)

  43. Friendly, M., Denis, D.J.: Milestones in the history of thematic cartography, statistical graphics, and data visualization. URL http://www.datavis.ca/milestones (2001)

  44. Fruchterman, T.M., Reingold, E.M.: Graph drawing by force-directed placement. Software: Practice and experience 21(11), 1129–1164 (1991)

  45. Gaifman, H.: Dependency systems and phrase-structure systems. Information and control 8(3), 304–337 (1965)

  46. Gretarsson, B., Odonovan, J., Bostandjiev, S., Höllerer, T., Asuncion, A., Newman, D., Smyth, P.: Topicnets: Visual analysis of large text corpora with topic modeling. ACM Transactions on Intelligent Systems and Technology (TIST) 3(2), 23 (2012)

  47. Griffiths, T.L., Steyvers, M.: Finding scientific topics. Proceedings of the National Academy of Sciences 101(suppl 1), 5228–5235 (2004)

  48. Harris, J.: Word clouds considered harmful. www.niemanlab.org/2011/10/word-clouds-considered-harmful/ (2011). [Online; accessed Jan. 2016]

  49. Harris, Z.S.: Distributional structure. Word 10(23), 146–162 (1954)

  50. Havre, S., Hetzler, E., Whitney, P., Nowell, L.: Themeriver: Visualizing thematic changes in large document collections. Visualization and Computer Graphics, IEEE Transactions on 8(1), 9–20 (2002)

  51. Hays, D.G.: Dependency theory: A formalism and some observations. Language pp. 511–525 (1964)

  52. Heintze, N., et al.: Scalable document fingerprinting. In: 1996 USENIX workshop on electronic commerce, vol. 3 (1996)

  53. Hilpert, M.: Dynamic visualizations of language change: Motion charts on the basis of bivariate and multivariate data from diachronic corpora. International Journal of Corpus Linguistics 16(4), 435–461 (2011)

  54. Hoad, T.C., Zobel, J.: Methods for Identifying Versioned and Plagiarised Documents. Journal of the ASIS&T 54, 203–215 (2003). doi:10.1002/asi.10170

  55. Holsti, O.R., et al.: Content analysis. The handbook of social psychology 2, 596–692 (1968)

  56. Houvardas, J., Stamatatos, E.: N-gram feature selection for authorship identification. In: Artificial Intelligence: Methodology, Systems, and Applications, pp. 77–86. Springer (2006)

  57. Jaffe, A., Naaman, M., Tassa, T., Davis, M.: Generating summaries and visualization for large collections of geo-referenced photographs. In: Proceedings of the 8th ACM international workshop on Multimedia information retrieval, pp. 89–98. ACM (2006)

  58. Jankowska, M., Keselj, V., Milios, E.: Relative n-gram signatures: Document visualization at the level of character n-grams. In: Visual Analytics Science and Technology (VAST), 2012 IEEE Conference on, pp. 103–112. IEEE (2012)

  59. Keim, D., Oelke, D., et al.: Literature fingerprinting: A new method for visual literary analysis. In: Visual Analytics Science and Technology, 2007. VAST 2007. IEEE Symposium on, pp. 115–122. IEEE (2007)

  60. Kim, J.: Causation, nomic subsumption, and the concept of event. The Journal of Philosophy pp. 217–236 (1973)

  61. Koh, K., Lee, B., Kim, B., Seo, J.: Maniwordle: Providing flexible control over wordle. Visualization and Computer Graphics, IEEE Transactions on 16(6), 1190–1197 (2010)

  62. Krstajić, M., Bertini, E., Keim, D.A.: Cloudlines: Compact display of event episodes in multiple time-series. IEEE Transactions on Visualization and Computer Graphics 17(12), 2432–2439 (2011). doi:10.1109/TVCG.2011.179

  63. Kurby, C.A., Zacks, J.M.: Segmentation in the perception and memory of events. Trends in cognitive sciences 12(2), 72–79 (2008)

  64. Landauer, T.K., Foltz, P.W., Laham, D.: An introduction to latent semantic analysis. Discourse processes 25(2–3), 259–284 (1998)

  65. Lasswell, H.D.: Describing the contents of communication. Propaganda, communication and public opinion pp. 74–94 (1946)

  66. Leavenworth, R.S., Grant, E.L.: Statistical quality control. Tata McGraw-Hill Education (2000)

  67. Lee, B., Riche, N.H., Karlson, A.K., Carpendale, S.: Sparkclouds: Visualizing trends in tag clouds. Visualization and Computer Graphics, IEEE Transactions on 16(6), 1182–1189 (2010)

  68. Lee, H., Kihm, J., Choo, J., Stasko, J., Park, H.: ivisclustering: An interactive visual document clustering via topic modeling. In: Computer Graphics Forum, vol. 31, pp. 1155–1164. Wiley Online Library (2012)

  69. Li, J., Zheng, R., Chen, H.: From fingerprint to writeprint. Communications of the ACM 49(4), 76–82 (2006)

  70. Liu, S., Wang, X., Chen, J., Zhu, J., Guo, B.: Topicpanorama: a full picture of relevant topics. In: Visual Analytics Science and Technology (VAST), 2014 IEEE Conference on, pp. 183–192. IEEE (2014)

  71. Lotman, I.: The structure of the artistic text

  72. Lu, Y., Mei, Q., Zhai, C.: Investigating task performance of probabilistic topic models: an empirical study of plsa and lda. Information Retrieval 14(2), 178–203 (2011)

  73. Luo, D., Yang, J., Krstajic, M., Ribarsky, W., Keim, D.: Event river: Visually exploring text collections with temporal references. IEEE Transactions on Visualization and Computer Graphics 18(1), 93–105 (2012). doi:10.1109/TVCG.2010.225. URL http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5611507

  74. Luo, D., Yang, J., Krstajic, M., Ribarsky, W., Keim, D.: Eventriver: Visually exploring text collections with temporal references. Visualization and Computer Graphics, IEEE Transactions on 18(1), 93–105 (2012)

  75. Manber, U.: Finding similar files in a large file system. In: 1994 Winter USENIX Technical Conference, vol. 94, pp. 1–10 (1994)

  76. Manning, C.D., Raghavan, P., Schütze, H., et al.: Introduction to information retrieval, vol. 1. Cambridge university press Cambridge (2008)

  77. Mates, B.: Stoic logic (1953)

  78. Mei, Q., Zhai, C.: Discovering evolutionary theme patterns from text: an exploration of temporal text mining. In: Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining, pp. 198–207. ACM (2005)

  79. Milgram, S.: Psychological maps of paris, the individual in a social world (1977)

  80. Mukhopadhyay, A., Maulik, U., Bandyopadhyay, S.: A survey of multiobjective evolutionary clustering. ACM Computing Surveys (CSUR) 47(4), 61 (2015)

  81. Ng, A.Y., Jordan, M.I., Weiss, Y., et al.: On spectral clustering: Analysis and an algorithm. Advances in neural information processing systems 2, 849–856 (2002)

  82. Oelke, D., Kokkinakis, D., Keim, D.A.: Fingerprint matrices: Uncovering the dynamics of social networks in prose literature 32(3pt4), 371–380 (2013)

  83. Oelke, D., Spretke, D., Stoffel, A., Keim, D.A.: Visual readability analysis: How to make your writings easier to read. Visualization and Computer Graphics, IEEE Transactions on 18(5), 662–674 (2012)

  84. Pirolli, P., Schank, P., Hearst, M., Diehl, C.: Scatter/gather browsing communicates the topic structure of a very large text collection. In: Proceedings of the SIGCHI conference on Human factors in computing systems, pp. 213–220. ACM (1996)

  85. Playfair, W.: Commercial and political atlas and statistical breviary (1786)

  86. Pylyshyn, Z.W., Storm, R.W.: Tracking multiple independent targets: Evidence for a parallel tracking mechanism*. Spatial vision 3(3), 179–197 (1988)

  87. Ribler, R.L., Abrams, M.: Using visualization to detect plagiarism in computer science classes. In: Proceedings of the IEEE Symposium on Information Vizualization, p. 173. IEEE Computer Society (2000)

  88. Riehmann, P., Potthast, M., Stein, B., Froehlich, B.: Visual Assessment of Alleged Plagiarism Cases. Computer Graphics Forum 34(3), 61–70 (2015). doi:10.1111/cgf.12618. URL http://doi.wiley.com/10.1111/cgf.12618

  89. Rivadeneira, A.W., Gruen, D.M., Muller, M.J., Millen, D.R.: Getting Our Head in the Clouds: Toward Evaluation Studies of Tagclouds. 25th SIGCHI Conference on Human Factors in Computing Systems, CHI 2007 pp. 995–998 (2007). doi:10.1145/1240624.1240775

  90. Rivadeneira, A.W., Gruen, D.M., Muller, M.J., Millen, D.R.: Getting our head in the clouds: toward evaluation studies of tagclouds. In: Proceedings of the SIGCHI conference on Human factors in computing systems, pp. 995–998. ACM (2007)

  91. Robertson, G., Fernandez, R., Fisher, D., Lee, B., Stasko, J.: Effectiveness of animation in trend visualization. Visualization and Computer Graphics, IEEE Transactions on 14(6), 1325–1332 (2008)

  92. Rohrer, R.M., Ebert, D.S., Sibert, J.L.: The shape of shakespeare: visualizing text using implicit surfaces. In: Information Visualization, 1998. Proceedings. IEEE Symposium on, pp. 121–129. IEEE (1998)

  93. Seifert, C., Ulbrich, E., Granitzer, M.: Word clouds for efficient document labeling. In: Discovery Science, pp. 292–306. Springer (2011)

  94. Sgall, P.: Dependency-based formal description of language. The Encyclopedia of Language and Linguistics 2, 867–872 (1994)

  95. Shivakumar, N., Garcia-Molina, H.: Finding near-replicas of documents on the web. In: The World Wide Web and Databases, pp. 204–212. Springer (1998)

  96. Sinclair, J.: Corpus, concordance, collocation. Oxford University Press (1991)

  97. Slingsby, A., Dykes, J., Wood, J., Clarke, K.: Interactive tag maps and tag clouds for the multiscale exploration of large spatio-temporal datasets. In: Information Visualization, 2007. IV’07. 11th International Conference, pp. 497–504. IEEE (2007)

  98. Smith, A.E., Humphreys, M.S.: Evaluation of unsupervised semantic mapping of natural language with leximancer concept mapping. Behavior Research Methods 38(2), 262–279 (2006)

  99. Steyvers, M., Griffiths, T.: Probabilistic topic models. Handbook of latent semantic analysis 427(7), 424–440 (2007)

  100. Subašić, I., Berendt, B.: Web Mining for Understanding Stories through Graph Visualisation. 2008 Eighth IEEE International Conference on Data Mining pp. 570–579 (2008). doi:10.1109/ICDM.2008.138. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=4781152

  101. Sun, G., Wu, Y., Liu, S., Peng, T.Q., Zhu, J.J.H., Liang, R.: EvoRiver: Visual Analysis of Topic Coopetition on Social Media. Visualization and Computer Graphics, IEEE Transactions on PP(99), 1 (2014). doi:10.1109/TVCG.2014.2346919

  102. Svartvik, J.: The Evans statements. University of Goteburg (1968)

  103. Teh, Y.W., Jordan, M.I., Beal, M.J., Blei, D.M.: Hierarchical dirichlet processes. Journal of the american statistical association (2012)

  104. Tufte, E.R.: Envisioning information. Optometry & Vision Science 68(4), 322–324 (1991)

  105. Tufte, E.R.: Beautiful evidence. New York (2006)

  106. Van Ham, F., Wattenberg, M., Viégas, F.B.: Mapping text with phrase nets. IEEE Transactions on Visualization & Computer Graphics 6, 1169–1176 (2009)

  107. Viégas, F.B., Wattenberg, M., Feinberg, J.: Participatory visualization with wordle. IEEE Transactions on Visualization and Computer Graphics 15(6), 1137–1144 (2009). doi:10.1109/TVCG.2009.171

  108. Vuillemot, R., Clement, T., Plaisant, C., Kumar, A.: What’s being said near martha? exploring name entities in literary text collections. In: Visual Analytics Science and Technology, 2009. VAST 2009. IEEE Symposium on, pp. 107–114. IEEE (2009)

  109. Wang, C., Blei, D., Heckerman, D.: Continuous time dynamic topic models. arXiv preprint arXiv:1206.3298 (2012)

  110. Wang, X., McCallum, A.: Topics over time: a non-markov continuous-time model of topical trends. In: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 424–433. ACM (2006)

  111. Wattenberg, M.: Arc diagrams: visualizing structure in strings. Information Visualization Proceedings 2002(2002), 110–116 (2002). doi:10.1109/INFVIS.2002.1173155

  112. Wattenberg, M., Viégas, F.B.: The word tree, an interactive visual concordance. IEEE Transactions on Visualization and Computer Graphics 14(6), 1221–1228 (2008). doi:10.1109/TVCG.2008.172

  113. Wei, F., Liu, S., Song, Y., Pan, S., Zhou, M.X., Qian, W., Shi, L., Tan, L., Zhang, Q.: Tiara: a visual exploratory text analytic system. In: Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 153–162. ACM (2010)

  114. Werlich, E.: A text grammar of English. Quelle & Meyer (1976)

  115. Wu, Y., Provan, T., Wei, F., Liu, S., Ma, K.L.: Semantic-Preserving Word Clouds by Seam Carving. Computer Graphics Forum 30(3), 741–750 (2011). doi:10.1111/j.1467-8659.2011.01923.x. URL http://doi.wiley.com/10.1111/j.1467-8659.2011.01923.x

  116. Xu, P., Wu, Y., Wei, E., Peng, T.Q., Liu, S., Zhu, J.J.H., Qu, H.: Visual analysis of topic competition on social media. IEEE Transactions on Visualization and Computer Graphics 19(12), 2012–2021 (2013). doi:10.1109/TVCG.2013.221

  117. Xu, T., Zhang, Z., Yu, P.S., Long, B.: Evolutionary clustering by hierarchical dirichlet process with hidden markov state. In: Data Mining, 2008. ICDM’08. Eighth IEEE International Conference on, pp. 658–667. IEEE (2008)

  118. Xu, W., Gong, Y.: Document clustering by concept factorization. In: Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval, pp. 202–209. ACM (2004)

  119. Xu, W., Liu, X., Gong, Y.: Document clustering based on non-negative matrix factorization. In: Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval, pp. 267–273. ACM (2003)

  120. Zacks, J.M., Tversky, B.: Event structure in perception and conception. Psychological bulletin 127(1), 3 (2001)

  121. Zhang, J., Ghahramani, Z., Yang, Y.: A probabilistic model for online document clustering with application to novelty detection. In: Advances in Neural Information Processing Systems, pp. 1617–1624 (2004)

  122. Zhao, Q., Mitra, P.: Event Detection and Visualization for Social Text Streams. Event London pp. 26–28 (2007). URL http://www.icwsm.org/papers/3--Zhao-Mitra.pdf

  123. Zheng, R., Li, J., Chen, H., Huang, Z.: A framework for authorship identification of online messages: Writing-style features and classification techniques. Journal of the American Society for Information Science and Technology 57(3), 378–393 (2006)

Author information

Corresponding author

Correspondence to Nan Cao.

Copyright information

© 2016 Atlantis Press and the author(s)

About this chapter

Cite this chapter

Cao, N., Cui, W. (2016). Visualizing Document Content. In: Introduction to Text Visualization. Atlantis Briefs in Artificial Intelligence, vol 1. Atlantis Press, Paris. https://doi.org/10.2991/978-94-6239-186-4_5

  • DOI: https://doi.org/10.2991/978-94-6239-186-4_5

  • Publisher Name: Atlantis Press, Paris

  • Print ISBN: 978-94-6239-185-7

  • Online ISBN: 978-94-6239-186-4

  • eBook Packages: Computer Science (R0)
