Temporal Dynamics of On-Line Information Streams

Data Stream Management

Part of the book series: Data-Centric Systems and Applications (DCSA)

Abstract

A number of recent computing applications involve information arriving continuously over time in the form of a data stream, and this has led to new ways of thinking about traditional problems in a variety of areas. In some cases, the rate and overall volume of data in the stream may be so great that it cannot all be stored for processing, and this leads to new requirements for efficiency and scalability. In other cases, the quantities of information may still be manageable, but the data stream perspective takes what has generally been a static view of a problem and adds a strong temporal dimension to it. Our focus here is on some of the challenges that this latter issue raises in the settings of text mining, on-line information, and information retrieval.

This survey was written in 2004 and circulated on-line as a preprint prior to its appearance in this volume.


Corresponding author

Correspondence to Jon Kleinberg.

Copyright information

© 2016 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Kleinberg, J. (2016). Temporal Dynamics of On-Line Information Streams. In: Garofalakis, M., Gehrke, J., Rastogi, R. (eds) Data Stream Management. Data-Centric Systems and Applications. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-28608-0_11

  • DOI: https://doi.org/10.1007/978-3-540-28608-0_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-28607-3

  • Online ISBN: 978-3-540-28608-0

  • eBook Packages: Computer Science (R0)
