Journal of Intelligent Information Systems, Volume 43, Issue 3, pp 411–435

The human is the loop: new directions for visual analytics

  • Alex Endert
  • M. Shahriar Hossain
  • Naren Ramakrishnan
  • Chris North
  • Patrick Fiaux
  • Christopher Andrews

Abstract

Visual analytics is the science of marrying interactive visualizations and analytic algorithms to support exploratory knowledge discovery in large datasets. We argue for a shift from a ‘human in the loop’ philosophy for visual analytics to a ‘human is the loop’ viewpoint, where the focus is on recognizing analysts’ work processes, and seamlessly fitting analytics into that existing interactive process. We survey a range of projects that provide visual analytic support contextually in the sensemaking loop, and outline a research agenda along with future challenges.

Keywords

Visual analytics; Clustering; Spatialization; Semantic interaction; Storytelling

Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  • Alex Endert (1)
  • M. Shahriar Hossain (2)
  • Naren Ramakrishnan (3)
  • Chris North (3)
  • Patrick Fiaux (3)
  • Christopher Andrews (4)

  1. Pacific Northwest National Laboratory, Richland, USA
  2. Department of Computer Science, University of Texas at El Paso, El Paso, USA
  3. Department of Computer Science, Virginia Tech, Blacksburg, USA
  4. Department of Computer Science, Mount Holyoke College, South Hadley, USA
