Anomaly Detection Using Nonnegative Matrix Factorization
For the Text Mining 2007 Workshop contest (see Appendix), we use nonnegative matrix factorization (NMF) to generate feature vectors that can be used to cluster Aviation Safety Reporting System (ASRS) documents. By preserving nonnegativity, the NMF facilitates a sum-of-parts representation of the underlying term usage patterns in the ASRS document collection. Both the training and test sets of ASRS documents are parsed and then factored by the NMF to produce a reduced-rank representation of the entire document space. The resulting feature and coefficient matrix factors are used to cluster the ASRS documents, so that the (known) anomalies of the training documents are mapped directly to the feature vectors. The dominant features of each test document are then used to generate anomaly relevance scores for that document. The General Text Parser (GTP) software environment is used to generate the term-by-document matrices for the NMF model.
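The pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which NMF algorithm or scoring formula was used, so the sketch assumes the standard Lee and Seung multiplicative update rule and a simple, hypothetical scoring rule (feature-to-anomaly weights accumulated from the training documents, then combined with each test document's normalized coefficients). The toy term-by-document matrix, the labels, and the names `nmf` and `anomaly_scores` are made up for illustration; in the chapter the matrix is produced by GTP.

```python
import numpy as np

def nmf(A, k, iters=500, eps=1e-9, seed=0):
    """Approximate A (terms x docs) as W @ H with W, H >= 0.

    Uses multiplicative updates for the Frobenius-norm objective
    (an assumed choice; the source does not name the algorithm).
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    W = rng.random((m, k)) + eps   # feature matrix (terms x features)
    H = rng.random((k, n)) + eps   # coefficient matrix (features x docs)
    for _ in range(iters):
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        W *= (A @ H.T) / (W @ (H @ H.T) + eps)
    return W, H

def anomaly_scores(H_train, labels_train, H_test):
    """Hypothetical scoring rule: map features to anomalies, then score test docs.

    labels_train is an (n_train x n_anomalies) 0/1 matrix of known anomalies.
    """
    # Weight of each anomaly on each feature, accumulated over training docs.
    F = H_train @ labels_train
    F = F / (F.sum(axis=1, keepdims=True) + 1e-12)
    # Each test document's coefficients, normalized to sum to one.
    Ht = H_test / (H_test.sum(axis=0, keepdims=True) + 1e-12)
    return Ht.T @ F                # (n_test x n_anomalies) relevance scores

# Toy data: 6 terms, 4 training docs (columns 0-3), 2 test docs (columns 4-5).
A = np.array([
    [2, 1, 0, 0, 2, 0],
    [1, 2, 0, 0, 2, 0],
    [1, 1, 0, 0, 1, 0],
    [0, 0, 2, 1, 0, 1],
    [0, 0, 1, 2, 0, 1],
    [0, 0, 1, 1, 0, 1],
], dtype=float)
n_train = 4
# Training docs 0 and 1 carry anomaly 0; docs 2 and 3 carry anomaly 1.
labels_train = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)

# Training and test documents are factored together, as in the abstract.
W, H = nmf(A, k=2)
scores = anomaly_scores(H[:, :n_train], labels_train, H[:, n_train:])
print(np.round(scores, 3))
```

Each row of `scores` corresponds to one test document and each column to one anomaly. Because the features are tied to anomalies only through the training labels, the arbitrary ordering of the NMF features does not affect the scores.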