Scenario Discovery Using Nonnegative Tensor Factorization

  • Brett W. Bader
  • Andrey A. Puretskiy
  • Michael W. Berry
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5197)

Abstract

In the relatively new field of visual analytics, there is a great need for automated approaches that both verify and discover the intentions and schemes of primary actors through time. Data mining and knowledge discovery are critical for extracting meaningful information from large and complex text-based (digital) collections. In this study, we develop a mathematical strategy based on nonnegative tensor factorization (NTF) to extract and sequence important activities and specific events from sources such as news articles. Our approach greatly facilitates automatically reconstructing a plot or confirming involvement in a questionable activity. We apply our NTF algorithm, a variant of the PARAFAC multidimensional data model, to the terrorism-based scenarios of the VAST 2007 Contest data set to demonstrate how term-by-entity associations can be used for scenario/plot discovery and evaluation.
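To make the idea of a nonnegative PARAFAC-style factorization concrete, the following is a minimal sketch and not the authors' algorithm, which this abstract does not specify. It assumes a dense three-way array (for example term × entity × time, where the choice of modes is an assumption made here for illustration) and uses multiplicative updates that generalize the nonnegative matrix factorization rule of Lee and Seung. The names `ntf_multiplicative`, `reconstruct`, and `relative_error` are hypothetical.

```python
import numpy as np


def reconstruct(A, B, C):
    """Rebuild the 3-way tensor as a sum of rank-1 outer products."""
    return np.einsum("ir,jr,kr->ijk", A, B, C)


def relative_error(X, A, B, C):
    """Frobenius-norm error of the model relative to the norm of X."""
    return np.linalg.norm(X - reconstruct(A, B, C)) / np.linalg.norm(X)


def ntf_multiplicative(X, rank, n_iter=200, eps=1e-9, seed=0):
    """Illustrative nonnegative PARAFAC fit via multiplicative updates.

    X is assumed to be a dense nonnegative array of shape (I, J, K).
    Returns nonnegative factor matrices A (I x rank), B (J x rank),
    and C (K x rank). eps guards against division by zero.
    """
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    # Random nonnegative starting point; updates preserve nonnegativity.
    A = rng.random((I, rank))
    B = rng.random((J, rank))
    C = rng.random((K, rank))
    for _ in range(n_iter):
        # Each numerator is the tensor contracted with the other two factors;
        # each denominator is the current factor times the Hadamard product
        # of the other two Gram matrices.
        A *= np.einsum("ijk,jr,kr->ir", X, B, C) / (
            A @ ((B.T @ B) * (C.T @ C)) + eps
        )
        B *= np.einsum("ijk,ir,kr->jr", X, A, C) / (
            B @ ((A.T @ A) * (C.T @ C)) + eps
        )
        C *= np.einsum("ijk,ir,jr->kr", X, A, B) / (
            C @ ((A.T @ A) * (B.T @ B)) + eps
        )
    return A, B, C


if __name__ == "__main__":
    # Toy data: a small synthetic tensor built from known nonnegative factors.
    rng = np.random.default_rng(1)
    X = reconstruct(rng.random((6, 2)), rng.random((5, 2)), rng.random((4, 2)))
    A, B, C = ntf_multiplicative(X, rank=2)
    print("relative error:", relative_error(X, A, B, C))
```

In a sketch like this, each column triple of `A`, `B`, and `C` would be read as one candidate group: the largest entries in a column of `A` suggest its characteristic terms, those in `B` its entities, and those in `C` when it is active. A sparse tensor library would be the more realistic choice for real text collections.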

Keywords

nonnegative tensor factorization, PARAFAC, scenario discovery, VAST 2007, visual analytics

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Brett W. Bader (1)
  • Andrey A. Puretskiy (2)
  • Michael W. Berry (2)
  1. Sandia National Laboratories, Albuquerque, USA
  2. Department of Electrical Engineering and Computer Science, University of Tennessee, Knoxville, USA
