
Anomaly Detection Using Nonnegative Matrix Factorization

  • Edward G. Allan
  • Michael R. Horvath
  • Christopher V. Kopek
  • Brian T. Lamb
  • Thomas S. Whaples
  • Michael W. Berry

For the Text Mining 2007 Workshop contest (see Appendix), we use the nonnegative matrix factorization (NMF) to generate feature vectors that can be used to cluster Aviation Safety Reporting System (ASRS) documents. By preserving non-negativity, the NMF facilitates a sum-of-parts representation of the underlying term usage patterns in the ASRS document collection. Both the training and test sets of ASRS documents are parsed and then factored by the NMF to produce a reduced-rank representation of the entire document space. The resulting feature and coefficient matrix factors are used to cluster ASRS documents so that the (known) anomalies of training documents are directly mapped to the feature vectors. Dominant features of test documents are then used to generate anomaly relevance scores for those documents. The General Text Parser (GTP) software environment is used to generate term-by-document matrices for the NMF model.
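The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the multiplicative update rule used here is a standard NMF iteration chosen for brevity, the toy term-by-document matrix and anomaly labels are invented, and the way labels are mapped to features and turned into scores is one plausible reading of the abstract rather than the chapter's exact scheme. The names `nmf`, `feature_anomaly`, and `scores` are hypothetical.

```python
import numpy as np

def nmf(A, k, iters=500, eps=1e-9, seed=0):
    """Approximate A (terms x docs) as W @ H with W, H >= 0.

    Uses multiplicative updates for the Frobenius-norm objective;
    this is an assumed stand-in, not necessarily the authors' algorithm.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    W = rng.random((m, k))  # feature matrix: terms x features
    H = rng.random((k, n))  # coefficient matrix: features x docs
    for _ in range(iters):
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        W *= (A @ H.T) / (W @ (H @ H.T) + eps)
    return W, H

# Toy term-by-document matrix (invented counts): rows are terms, columns are documents.
A = np.array([
    [3, 2, 0, 0, 2, 0],
    [2, 3, 0, 0, 3, 0],
    [0, 0, 4, 3, 0, 3],
    [0, 0, 2, 4, 0, 2],
    [1, 0, 0, 1, 0, 0],
], dtype=float)
train_cols = [0, 1, 2, 3]  # documents with known anomaly labels
test_cols = [4, 5]         # documents to be scored

# Invented labels: rows are training documents, columns are anomaly types.
labels = np.array([
    [1, 0],
    [1, 0],
    [0, 1],
    [0, 1],
], dtype=float)

# Training and test documents are factored together, as in the abstract.
W, H = nmf(A, k=2)

# Map known anomalies onto features: each feature accumulates the labels of
# the training documents that load on it, weighted by their coefficients.
feature_anomaly = H[:, train_cols] @ labels
feature_anomaly /= feature_anomaly.sum(axis=1, keepdims=True) + 1e-12

# Score test documents by the features that dominate them.
H_test = H[:, test_cols]
H_test = H_test / (H_test.sum(axis=0, keepdims=True) + 1e-12)
scores = H_test.T @ feature_anomaly  # test docs x anomaly types

print(np.round(scores, 3))
```

Each row of `scores` gives one test document's relevance to each anomaly type. In this toy setting the first test document shares terms with the first two training documents, so it should lean toward the first anomaly, and the second test document toward the second.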



Copyright information

© Springer-Verlag London Limited 2008

Authors and Affiliations

  • Edward G. Allan (1)
  • Michael R. Horvath (1)
  • Christopher V. Kopek (1)
  • Brian T. Lamb (1)
  • Thomas S. Whaples (1)
  • Michael W. Berry (2)

  1. Department of Computer Science, Wake Forest University, Winston-Salem
  2. Department of Electrical Engineering and Computer Science, University of Tennessee, Knoxville
