Interactive Tweaking of Text Analytics Dashboards

  • Arnab Nandi
  • Ziqi Huang
  • Man Cao
  • Micha Elsner
  • Lilong Jiang
  • Srinivasan Parthasarathy
  • Ramiya Venkatachalam
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8999)

Abstract

With the increasing importance of text analytics in all disciplines, e.g., science, business, and social media analytics, it has become important to extract actionable insights from text in a timely manner. Insights from text analytics are conventionally presented to the analyst as visualizations and dashboards. While these dashboards are intended to be set up once and observed passively, most real-world use cases require constant tweaking of the dashboards to adapt to new data analysis settings. Current systems supporting such analysis have grown from simplistic chains of aggregations to complex pipelines with a range of implicit (or latent) and explicit parametric knobs. Re-executing such pipelines can be computationally expensive, and the increased query-response time at each step may significantly delay the analysis task. Enabling the analyst to interactively tweak and explore the parameter space gives the analyst a better grasp of the data and insights. We propose a novel interactive framework that allows social media analysts to tweak text mining dashboards not just during their development stage, but also during the analytics process itself. Our framework leverages opportunities unique to text pipelines to ensure fast response times, allowing for a smooth, rich, and usable exploration of an entire analytics space.

Keywords

text analytics, interactivity, database systems, social media analysis

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Arnab Nandi (1)
  • Ziqi Huang (1)
  • Man Cao (1)
  • Micha Elsner (1)
  • Lilong Jiang (1)
  • Srinivasan Parthasarathy (1)
  • Ramiya Venkatachalam (1)

  1. The Ohio State University, Columbus, USA
