Visualizing Document Content

Cao, Nan; Cui, Weiwei

doi:10.2991/978-94-6239-186-4_5

Part of the book series: Atlantis Briefs in Artificial Intelligence (ABAI, volume 1)

Abstract

Text is made primarily of words and is meant to carry content for information delivery. Content analysis is the earliest established method of text analysis (Holsti et al., The handbook of social psychology, vol 2, pp 596–692, 1968 [55]). Although linguists have studied text content extensively and systematically, the related disciplines can be roughly divided into two categories, structure and substance, according to their subjects of study (Ansari, Dimensions in discourse: elementary to essentials. Xlibris Corporation, Bloomington, 2013 [9]). Structure concerns the surface characteristics that are visible in a valid text, such as word co-occurrence, text reuse, and grammatical structure. Substance, on the other hand, is the umbrella term for all information that must be inferred from text, such as fingerprints, topics, and events. Various techniques have been proposed to analyze these aspects. In this chapter, we briefly review these techniques and the corresponding visualization systems.

Notes

1. https://www.flickr.com/photos/tags

References

Picapica. www.picapica.org (2014). [Online; accessed Jan. 2016]
Abbasi, A., Chen, H.: Applying authorship analysis to extremist-group web forum messages. Intelligent Systems, IEEE 20(5), 67–75 (2005)
Abbasi, A., Chen, H.: Visualizing authorship for identification. In: Intelligence and Security Informatics, pp. 60–71. Springer (2006)
Aggarwal, C.C., Zhai, C.: A survey of text clustering algorithms. In: Mining Text Data, pp. 77–128. Springer (2012)
Aiden, E.L., Michel, J.B.: What we learned from 5 million books. https://www.ted.com/talks/what_we_learned_from_5_million_books (2011). [Online; accessed Jan. 2016]
Allan, J.: Topic detection and tracking: event-based information organization, vol. 12. Springer Science & Business Media (2012)
Allan, J., Carbonell, J.G., Doddington, G., Yamron, J., Yang, Y.: Topic detection and tracking pilot study final report (1998)
Allan, J., Papka, R., Lavrenko, V.: On-line new event detection and tracking. In: Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval, pp. 37–45. ACM (1998)
Ansari, T.: Dimensions in Discourse: Elementary to Essentials. Xlibris Corporation (2013)
Asuncion, A., Welling, M., Smyth, P., Teh, Y.W.: On smoothing and inference for topic models. In: Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence, pp. 27–34. AUAI Press (2009)
Avidan, S., Shamir, A.: Seam carving for content-aware image resizing. In: ACM Transactions on graphics (TOG), vol. 26, p. 10. ACM (2007)
Bateman, S., Gutwin, C., Nacenta, M.: Seeing things in the clouds: the effect of visual features on tag cloud selections. Proceedings of the nineteenth ACM conference on Hypertext and hypermedia 4250, 193–202 (2008). doi:10.1145/1379092.1379130. URL http://portal.acm.org/citation.cfm?id=1379130
Blei, D.M., Lafferty, J.D.: Dynamic topic models. In: Proceedings of the 23rd international conference on Machine learning, pp. 113–120. ACM (2006)
Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent Dirichlet allocation. Journal of Machine Learning Research 3, 993–1022 (2003)
Burch, M., Lohmann, S., Beck, F., Rodriguez, N., Di Silvestro, L., Weiskopf, D.: Radcloud: Visualizing multiple texts with merged word clouds. In: Information Visualisation (IV), 2014 18th International Conference on, pp. 108–113. IEEE (2014)
Byron, L., Wattenberg, M.: Stacked graphs-geometry & aesthetics. Visualization and Computer Graphics, IEEE Transactions on 14(6), 1245–1252 (2008)
Cai, D., He, X., Han, J.: Locally consistent concept factorization for document clustering. Knowledge and Data Engineering, IEEE Transactions on 23(6), 902–913 (2011)
Cao, N., Gotz, D., Sun, J., Lin, Y.R., Qu, H.: SolarMap: Multifaceted Visual Analytics for Topic Exploration. 2011 IEEE 11th International Conference on Data Mining pp. 101–110 (2011). doi:10.1109/ICDM.2011.135. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=6137214
Cao, N., Sun, J., Lin, Y.R., Gotz, D., Liu, S., Qu, H.: FacetAtlas: Multifaceted visualization for rich text corpora. IEEE Transactions on Visualization and Computer Graphics 16(6), 1172–1181 (2010). doi:10.1109/TVCG.2010.154
Cavnar, W.B., Trenkle, J.M., et al.: N-gram-based text categorization. Ann Arbor MI 48113(2), 161–175 (1994)
Chakrabarti, D., Kumar, R., Tomkins, A.: Evolutionary clustering. In: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 554–560. ACM (2006)
Chi, M.T., Lin, S.S., Chen, S.Y., Lin, C.H., Lee, T.Y.: Morphable word clouds for time-varying text data visualization. Visualization and Computer Graphics, IEEE Transactions on 21(12), 1415–1426 (2015)
Chi, Y., Song, X., Zhou, D., Hino, K., Tseng, B.L.: Evolutionary spectral clustering by incorporating temporal smoothness. In: Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 153–162. ACM (2007)
Choo, J., Lee, C., Reddy, C.K., Park, H.: UTOPIAN: User-driven topic modeling based on interactive nonnegative matrix factorization. IEEE Transactions on Visualization and Computer Graphics 19(12), 1992–2001 (2013). doi:10.1109/TVCG.2013.212
Chuang, J., Manning, C.D., Heer, J.: Termite: Visualization techniques for assessing textual topic models. In: Proceedings of the International Working Conference on Advanced Visual Interfaces, pp. 74–77. ACM (2012)
Collins, C., Viegas, F.B., Wattenberg, M.: Parallel tag clouds to explore and analyze faceted text corpora. In: Visual Analytics Science and Technology, 2009. VAST 2009. IEEE Symposium on, pp. 91–98. IEEE (2009)
Covington, M.A.: A fundamental algorithm for dependency parsing. In: Proceedings of the 39th annual ACM southeast conference, pp. 95–102. Citeseer (2001)
Cox, T.F., Cox, M.A.: Multidimensional scaling. CRC press (2000)
Cui, W., Liu, S., Tan, L., Shi, C., Song, Y., Gao, Z., Tong, X., Qu, H.: Textflow: Towards better understanding of evolving topics in text. IEEE Transactions on Visualization and Computer Graphics 17(12), 2412–2421 (2011). doi:10.1109/TVCG.2011.239
Cui, W., Liu, S., Tan, L., Shi, C., Song, Y., Gao, Z.J., Qu, H., Tong, X.: Textflow: Towards better understanding of evolving topics in text. Visualization and Computer Graphics, IEEE Transactions on 17(12), 2412–2421 (2011)
Cui, W., Liu, S., Wu, Z., Wei, H.: How Hierarchical Topics Evolve in Large Text Corpora. IEEE Transactions on Visualization and Computer Graphics 20(12), 2281–2290 (2014). doi:10.1109/TVCG.2014.2346433. URL http://research.microsoft.com/en-us/um/people/weiweicu/images/roseriver.pdf
Cui, W., Wu, Y., Liu, S., Wei, F., Zhou, M., Qu, H.: Context-preserving, dynamic word cloud visualization. IEEE Computer Graphics and Applications 30(6), 42–53 (2010). doi:10.1109/MCG.2010.102
Culy, C., Lyding, V.: Double tree: an advanced kwic visualization for expert users. In: Information Visualisation (IV), 2010 14th International Conference, pp. 98–103. IEEE (2010)
Daud, A., Li, J., Zhou, L., Muhammad, F.: Knowledge discovery through directed probabilistic topic models: a survey. Frontiers of computer science in China 4(2), 280–301 (2010)
De Vel, O., Anderson, A., Corney, M., Mohay, G.: Mining e-mail content for author identification forensics. ACM Sigmod Record 30(4), 55–64 (2001)
Deerwester, S., Dumais, S.T., Furnas, G.W., Landauer, T.K., Harshman, R.: Indexing by latent semantic analysis. Journal of the American society for information science 41(6), 391 (1990)
Don, A., Zheleva, E., Gregory, M., Tarkan, S., Auvil, L., Clement, T., Shneiderman, B., Plaisant, C.: Discovering interesting usage patterns in text collections: integrating text mining with visualization. Main pp. 213–221 (2007). doi:10.1145/1321440.1321473. URL http://portal.acm.org/citation.cfm?id=1321473
Dörk, M., Gruen, D., Williamson, C., Carpendale, S.: A visual backchannel for large-scale events. Visualization and Computer Graphics, IEEE Transactions on 16(6), 1129–1138 (2010)
Dou, W., Wang, X., Skau, D., Ribarsky, W., Zhou, M.X.: LeadLine: Interactive visual analysis of text data through event identification and exploration. Visual Analytics Science and Technology (VAST), 2012 IEEE Conference on pp. 93–102 (2012). doi:10.1109/VAST.2012.6400485. URL http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6400485
Dou, W., Yu, L., Wang, X., Ma, Z., Ribarsky, W.: Hierarchicaltopics: Visually exploring large text collections using topic hierarchies. Visualization and Computer Graphics, IEEE Transactions on 19(12), 2002–2011 (2013)
Firth, J.R.: A synopsis of linguistic theory, 1930-1955 (1957)
Forsati, R., Mahdavi, M., Shamsfard, M., Meybodi, M.R.: Efficient stochastic algorithms for document clustering. Information Sciences 220, 269–291 (2013)
Friendly, M., Denis, D.J.: Milestones in the history of thematic cartography, statistical graphics, and data visualization. URL http://www.datavis.ca/milestones (2001)
Fruchterman, T.M., Reingold, E.M.: Graph drawing by force-directed placement. Software: Practice and experience 21(11), 1129–1164 (1991)
Gaifman, H.: Dependency systems and phrase-structure systems. Information and control 8(3), 304–337 (1965)
Gretarsson, B., O'Donovan, J., Bostandjiev, S., Höllerer, T., Asuncion, A., Newman, D., Smyth, P.: Topicnets: Visual analysis of large text corpora with topic modeling. ACM Transactions on Intelligent Systems and Technology (TIST) 3(2), 23 (2012)
Griffiths, T.L., Steyvers, M.: Finding scientific topics. Proceedings of the National Academy of Sciences 101(suppl 1), 5228–5235 (2004)
Harris, J.: Word clouds considered harmful. www.niemanlab.org/2011/10/word-clouds-considered-harmful/ (2011). [Online; accessed Jan. 2016]
Harris, Z.S.: Distributional structure. Word 10(23), 146–162 (1954)
Havre, S., Hetzler, E., Whitney, P., Nowell, L.: Themeriver: Visualizing thematic changes in large document collections. Visualization and Computer Graphics, IEEE Transactions on 8(1), 9–20 (2002)
Hays, D.G.: Dependency theory: A formalism and some observations. Language pp. 511–525 (1964)
Heintze, N., et al.: Scalable document fingerprinting. In: 1996 USENIX workshop on electronic commerce, vol. 3 (1996)
Hilpert, M.: Dynamic visualizations of language change: Motion charts on the basis of bivariate and multivariate data from diachronic corpora. International Journal of Corpus Linguistics 16(4), 435–461 (2011)
Hoad, T.C., Zobel, J.: Methods for Identifying Versioned and Plagiarised Documents. Journal of the ASIS&T 54, 203–215 (2003). doi:10.1002/asi.10170
Holsti, O.R., et al.: Content analysis. The handbook of social psychology 2, 596–692 (1968)
Houvardas, J., Stamatatos, E.: N-gram feature selection for authorship identification. In: Artificial Intelligence: Methodology, Systems, and Applications, pp. 77–86. Springer (2006)
Jaffe, A., Naaman, M., Tassa, T., Davis, M.: Generating summaries and visualization for large collections of geo-referenced photographs. In: Proceedings of the 8th ACM international workshop on Multimedia information retrieval, pp. 89–98. ACM (2006)
Jankowska, M., Keselj, V., Milios, E.: Relative n-gram signatures: Document visualization at the level of character n-grams. In: Visual Analytics Science and Technology (VAST), 2012 IEEE Conference on, pp. 103–112. IEEE (2012)
Keim, D., Oelke, D., et al.: Literature fingerprinting: A new method for visual literary analysis. In: Visual Analytics Science and Technology, 2007. VAST 2007. IEEE Symposium on, pp. 115–122. IEEE (2007)
Kim, J.: Causation, nomic subsumption, and the concept of event. The Journal of Philosophy pp. 217–236 (1973)
Koh, K., Lee, B., Kim, B., Seo, J.: Maniwordle: Providing flexible control over wordle. Visualization and Computer Graphics, IEEE Transactions on 16(6), 1190–1197 (2010)
Krstajić, M., Bertini, E., Keim, D.A.: Cloudlines: Compact display of event episodes in multiple time-series. IEEE Transactions on Visualization and Computer Graphics 17(12), 2432–2439 (2011). doi:10.1109/TVCG.2011.179
Kurby, C.A., Zacks, J.M.: Segmentation in the perception and memory of events. Trends in cognitive sciences 12(2), 72–79 (2008)
Landauer, T.K., Foltz, P.W., Laham, D.: An introduction to latent semantic analysis. Discourse processes 25(2–3), 259–284 (1998)
Lasswell, H.D.: Describing the contents of communication. Propaganda, communication and public opinion pp. 74–94 (1946)
Leavenworth, R.S., Grant, E.L.: Statistical quality control. Tata McGraw-Hill Education (2000)
Lee, B., Riche, N.H., Karlson, A.K., Carpendale, S.: Sparkclouds: Visualizing trends in tag clouds. Visualization and Computer Graphics, IEEE Transactions on 16(6), 1182–1189 (2010)
Lee, H., Kihm, J., Choo, J., Stasko, J., Park, H.: ivisclustering: An interactive visual document clustering via topic modeling. In: Computer Graphics Forum, vol. 31, pp. 1155–1164. Wiley Online Library (2012)
Li, J., Zheng, R., Chen, H.: From fingerprint to writeprint. Communications of the ACM 49(4), 76–82 (2006)
Liu, S., Wang, X., Chen, J., Zhu, J., Guo, B.: Topicpanorama: a full picture of relevant topics. In: Visual Analytics Science and Technology (VAST), 2014 IEEE Conference on, pp. 183–192. IEEE (2014)
Lotman, I.: The structure of the artistic text
Lu, Y., Mei, Q., Zhai, C.: Investigating task performance of probabilistic topic models: an empirical study of plsa and lda. Information Retrieval 14(2), 178–203 (2011)
Luo, D., Yang, J., Krstajic, M., Ribarsky, W., Keim, D.: Event river: Visually exploring text collections with temporal references. IEEE Transactions on Visualization and Computer Graphics 18(1), 93–105 (2012). doi:10.1109/TVCG.2010.225. URL http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5611507
Luo, D., Yang, J., Krstajic, M., Ribarsky, W., Keim, D.: Eventriver: Visually exploring text collections with temporal references. Visualization and Computer Graphics, IEEE Transactions on 18(1), 93–105 (2012)
Manber, U.: Finding similar files in a large file system. In: 1994 Winter USENIX Technical Conference, vol. 94, pp. 1–10 (1994)
Manning, C.D., Raghavan, P., Schütze, H., et al.: Introduction to information retrieval, vol. 1. Cambridge university press Cambridge (2008)
Mates, B.: Stoic logic (1953)
Mei, Q., Zhai, C.: Discovering evolutionary theme patterns from text: an exploration of temporal text mining. In: Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining, pp. 198–207. ACM (2005)
Milgram, S.: Psychological maps of paris, the individual in a social world (1977)
Mukhopadhyay, A., Maulik, U., Bandyopadhyay, S.: A survey of multiobjective evolutionary clustering. ACM Computing Surveys (CSUR) 47(4), 61 (2015)
Ng, A.Y., Jordan, M.I., Weiss, Y., et al.: On spectral clustering: Analysis and an algorithm. Advances in neural information processing systems 2, 849–856 (2002)
Oelke, D., Kokkinakis, D., Keim, D.A.: Fingerprint matrices: Uncovering the dynamics of social networks in prose literature 32(3pt4), 371–380 (2013)
Oelke, D., Spretke, D., Stoffel, A., Keim, D.A.: Visual readability analysis: How to make your writings easier to read. Visualization and Computer Graphics, IEEE Transactions on 18(5), 662–674 (2012)
Pirolli, P., Schank, P., Hearst, M., Diehl, C.: Scatter/gather browsing communicates the topic structure of a very large text collection. In: Proceedings of the SIGCHI conference on Human factors in computing systems, pp. 213–220. ACM (1996)
Playfair, W.: Commercial and political atlas and statistical breviary (1786)
Pylyshyn, Z.W., Storm, R.W.: Tracking multiple independent targets: Evidence for a parallel tracking mechanism. Spatial vision 3(3), 179–197 (1988)
Ribler, R.L., Abrams, M.: Using visualization to detect plagiarism in computer science classes. In: Proceedings of the IEEE Symposium on Information Visualization, p. 173. IEEE Computer Society (2000)
Riehmann, P., Potthast, M., Stein, B., Froehlich, B.: Visual Assessment of Alleged Plagiarism Cases. Computer Graphics Forum 34(3), 61–70 (2015). doi:10.1111/cgf.12618. URL http://doi.wiley.com/10.1111/cgf.12618
Rivadeneira, A.W., Gruen, D.M., Muller, M.J., Millen, D.R.: Getting Our Head in the Clouds: Toward Evaluation Studies of Tagclouds. 25th SIGCHI Conference on Human Factors in Computing Systems, CHI 2007 pp. 995–998 (2007). doi:10.1145/1240624.1240775
Rivadeneira, A.W., Gruen, D.M., Muller, M.J., Millen, D.R.: Getting our head in the clouds: toward evaluation studies of tagclouds. In: Proceedings of the SIGCHI conference on Human factors in computing systems, pp. 995–998. ACM (2007)
Robertson, G., Fernandez, R., Fisher, D., Lee, B., Stasko, J.: Effectiveness of animation in trend visualization. Visualization and Computer Graphics, IEEE Transactions on 14(6), 1325–1332 (2008)
Rohrer, R.M., Ebert, D.S., Sibert, J.L.: The shape of shakespeare: visualizing text using implicit surfaces. In: Information Visualization, 1998. Proceedings. IEEE Symposium on, pp. 121–129. IEEE (1998)
Seifert, C., Ulbrich, E., Granitzer, M.: Word clouds for efficient document labeling. In: Discovery Science, pp. 292–306. Springer (2011)
Sgall, P.: Dependency-based formal description of language. The Encyclopedia of Language and Linguistics 2, 867–872 (1994)
Shivakumar, N., Garcia-Molina, H.: Finding near-replicas of documents on the web. In: The World Wide Web and Databases, pp. 204–212. Springer (1998)
Sinclair, J.: Corpus, concordance, collocation. Oxford University Press (1991)
Slingsby, A., Dykes, J., Wood, J., Clarke, K.: Interactive tag maps and tag clouds for the multiscale exploration of large spatio-temporal datasets. In: Information Visualization, 2007. IV’07. 11th International Conference, pp. 497–504. IEEE (2007)
Smith, A.E., Humphreys, M.S.: Evaluation of unsupervised semantic mapping of natural language with leximancer concept mapping. Behavior Research Methods 38(2), 262–279 (2006)
Steyvers, M., Griffiths, T.: Probabilistic topic models. Handbook of latent semantic analysis 427(7), 424–440 (2007)
Subašić, I., Berendt, B.: Web Mining for Understanding Stories through Graph Visualisation. 2008 Eighth IEEE International Conference on Data Mining pp. 570–579 (2008). doi:10.1109/ICDM.2008.138. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=4781152
Sun, G., Wu, Y., Liu, S., Peng, T.Q., Zhu, J.J.H., Liang, R.: EvoRiver: Visual Analysis of Topic Coopetition on Social Media. Visualization and Computer Graphics, IEEE Transactions on PP(99), 1 (2014). doi:10.1109/TVCG.2014.2346919
Svartvik, J.: The Evans statements. University of Göteborg (1968)
Teh, Y.W., Jordan, M.I., Beal, M.J., Blei, D.M.: Hierarchical dirichlet processes. Journal of the american statistical association (2012)
Tufte, E.R.: Envisioning information. Optometry & Vision Science 68(4), 322–324 (1991)
Tufte, E.R.: Beautiful evidence. New York (2006)
Van Ham, F., Wattenberg, M., Viégas, F.B.: Mapping text with phrase nets. IEEE Transactions on Visualization and Computer Graphics 15(6), 1169–1176 (2009)
Viégas, F.B., Wattenberg, M., Feinberg, J.: Participatory visualization with wordle. IEEE Transactions on Visualization and Computer Graphics 15(6), 1137–1144 (2009). doi:10.1109/TVCG.2009.171
Vuillemot, R., Clement, T., Plaisant, C., Kumar, A.: What’s being said near martha? exploring name entities in literary text collections. In: Visual Analytics Science and Technology, 2009. VAST 2009. IEEE Symposium on, pp. 107–114. IEEE (2009)
Wang, C., Blei, D., Heckerman, D.: Continuous time dynamic topic models. arXiv preprint arXiv:1206.3298 (2012)
Wang, X., McCallum, A.: Topics over time: a non-markov continuous-time model of topical trends. In: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 424–433. ACM (2006)
Wattenberg, M.: Arc diagrams: visualizing structure in strings. Information Visualization Proceedings 2002(2002), 110–116 (2002). doi:10.1109/INFVIS.2002.1173155
Wattenberg, M., Viégas, F.B.: The word tree, an interactive visual concordance. IEEE Transactions on Visualization and Computer Graphics 14(6), 1221–1228 (2008). doi:10.1109/TVCG.2008.172
Wei, F., Liu, S., Song, Y., Pan, S., Zhou, M.X., Qian, W., Shi, L., Tan, L., Zhang, Q.: Tiara: a visual exploratory text analytic system. In: Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 153–162. ACM (2010)
Werlich, E.: A text grammar of English. Quelle & Meyer (1976)
Wu, Y., Provan, T., Wei, F., Liu, S., Ma, K.L.: Semantic-Preserving Word Clouds by Seam Carving. Computer Graphics Forum 30(3), 741–750 (2011). doi:10.1111/j.1467-8659.2011.01923.x. URL http://doi.wiley.com/10.1111/j.1467-8659.2011.01923.x
Xu, P., Wu, Y., Wei, E., Peng, T.Q., Liu, S., Zhu, J.J.H., Qu, H.: Visual analysis of topic competition on social media. IEEE Transactions on Visualization and Computer Graphics 19(12), 2012–2021 (2013). doi:10.1109/TVCG.2013.221
Xu, T., Zhang, Z., Yu, P.S., Long, B.: Evolutionary clustering by hierarchical dirichlet process with hidden markov state. In: Data Mining, 2008. ICDM’08. Eighth IEEE International Conference on, pp. 658–667. IEEE (2008)
Xu, W., Gong, Y.: Document clustering by concept factorization. In: Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval, pp. 202–209. ACM (2004)
Xu, W., Liu, X., Gong, Y.: Document clustering based on non-negative matrix factorization. In: Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval, pp. 267–273. ACM (2003)
Zacks, J.M., Tversky, B.: Event structure in perception and conception. Psychological bulletin 127(1), 3 (2001)
Zhang, J., Ghahramani, Z., Yang, Y.: A probabilistic model for online document clustering with application to novelty detection. In: Advances in Neural Information Processing Systems, pp. 1617–1624 (2004)
Zhao, Q., Mitra, P.: Event Detection and Visualization for Social Text Streams. Event London pp. 26–28 (2007). URL http://www.icwsm.org/papers/3--Zhao-Mitra.pdf
Zheng, R., Li, J., Chen, H., Huang, Z.: A framework for authorship identification of online messages: Writing-style features and classification techniques. Journal of the American Society for Information Science and Technology 57(3), 378–393 (2006)

Author information

Authors and Affiliations

IBM T. J. Watson Research Center, Yorktown Heights, New York, USA
Nan Cao
Microsoft Research Asia, Beijing, China
Weiwei Cui

Corresponding author

Correspondence to Nan Cao.

About this chapter

Cite this chapter

Cao, N., Cui, W. (2016). Visualizing Document Content. In: Introduction to Text Visualization. Atlantis Briefs in Artificial Intelligence, vol 1. Atlantis Press, Paris. https://doi.org/10.2991/978-94-6239-186-4_5

DOI: https://doi.org/10.2991/978-94-6239-186-4_5
Published: 23 October 2016
Publisher Name: Atlantis Press, Paris
Print ISBN: 978-94-6239-185-7
Online ISBN: 978-94-6239-186-4
eBook Packages: Computer Science, Computer Science (R0)