Data Mining and Knowledge Discovery, Volume 17, Issue 1, pp 111–128

Two heads better than one: pattern discovery in time-evolving multi-aspect data

  • Jimeng Sun
  • Charalampos E. Tsourakakis
  • Evan Hoke
  • Christos Faloutsos
  • Tina Eliassi-Rad
Article

Abstract

Data stream values are often associated with multiple aspects. For example, each value observed at a given time-stamp from environmental sensors may have an associated type (e.g., temperature or humidity) as well as a location. Time-stamp, type and location are the three aspects, which can be modeled using a tensor (high-order array). However, the time aspect is special: it has a natural ordering, and successive time-ticks usually have correlated values. Standard multiway analysis ignores this structure. To capture it, we propose 2 Heads Tensor Analysis (2-heads), which provides a qualitatively different treatment of time. Unlike most existing approaches, which use a PCA-like summarization scheme for all aspects, 2-heads treats the time aspect separately: it combines classic multilinear analysis with wavelets, leading to a powerful mining tool. Furthermore, 2-heads has several other advantages: (a) it can be computed incrementally in a streaming fashion, (b) it has a provable error guarantee, and (c) it achieves a significant compression ratio relative to competitors. Finally, we show experiments on real datasets, and we illustrate how 2-heads reveals interesting trends in the data. This is an extended abstract of an article published in the Data Mining and Knowledge Discovery journal.
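
The idea of treating time with wavelets and the remaining aspects with a PCA-like summary can be illustrated with a minimal sketch. This is not the authors' algorithm: it is a batch (non-streaming) toy in Python with NumPy, and it assumes a dense tensor laid out as time × type × location whose time length is a power of two. The function names (haar_transform, top_singular_vectors, summarize) and the rank parameters are hypothetical, chosen only for this illustration.

```python
import numpy as np


def haar_transform(x):
    """Orthonormal Haar decomposition along axis 0.

    Assumption for this sketch: x.shape[0] is a power of two.
    Returns coefficients ordered coarsest first.
    """
    coeffs = []
    approx = np.asarray(x, dtype=float)
    while approx.shape[0] > 1:
        even, odd = approx[0::2], approx[1::2]
        coeffs.append((even - odd) / np.sqrt(2.0))  # detail coefficients
        approx = (even + odd) / np.sqrt(2.0)        # smoothed signal
    coeffs.append(approx)
    return np.concatenate(coeffs[::-1], axis=0)


def top_singular_vectors(tensor, mode, rank):
    """Leading left singular vectors of the unfolding along `mode`."""
    unfolded = np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)
    u, _, _ = np.linalg.svd(unfolded, full_matrices=False)
    return u[:, :rank]


def summarize(tensor, type_rank, loc_rank):
    """PCA-like projection on type and location, Haar wavelets on time."""
    u_type = top_singular_vectors(tensor, 1, type_rank)
    u_loc = top_singular_vectors(tensor, 2, loc_rank)
    # Project the two non-time aspects onto their leading subspaces.
    core = np.einsum("tij,ia,jb->tab", tensor, u_type, u_loc)
    # Transform the time aspect with wavelets instead of a PCA-like projection.
    return haar_transform(core), u_type, u_loc


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.standard_normal((8, 3, 4))  # 8 time-ticks, 3 types, 4 locations
    coeffs, u_type, u_loc = summarize(data, type_rank=2, loc_rank=2)
    print(coeffs.shape)  # (8, 2, 2)
```

Both transforms in this sketch are orthonormal, so small wavelet coefficients could be dropped afterwards to compress the time aspect; how 2-heads selects coefficients and updates the projections incrementally is described in the full article, not here.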

Keywords

Tensor, Multilinear analysis, Stream mining, Wavelet



Copyright information

© Springer Science+Business Media, LLC 2008

Authors and Affiliations

  • Jimeng Sun (1)
  • Charalampos E. Tsourakakis (2)
  • Evan Hoke (3)
  • Christos Faloutsos (2)
  • Tina Eliassi-Rad (4)

  1. IBM TJ Watson Research Center, Hawthorne, USA
  2. Carnegie Mellon University, Pittsburgh, USA
  3. Apple Computer, Inc., Cupertino, USA
  4. Lawrence Livermore National Laboratory, Livermore, USA
