Factor Analysis of Incidence Data via Novel Decomposition of Matrices

  • Conference paper
Formal Concept Analysis (ICFCA 2009)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5548)

Abstract

Matrix decomposition methods represent an object-variable data matrix as a product of two different matrices, one describing the relationship between objects and hidden variables or factors, and the other describing the relationship between the factors and the original variables. We present a novel approach to decomposition and factor analysis of matrices with incidence data. The matrix entries are grades to which objects represented by rows satisfy attributes represented by columns, e.g., grades to which an image is red or a person performs well in a test. We assume that the grades belong to a scale bounded by 0 and 1 which is equipped with certain aggregation operators and forms a complete residuated lattice. We present an approximation algorithm for the problem of decomposing such matrices with grades into products of two matrices with grades, with the number of factors as small as possible. Decomposition of binary matrices into Boolean products of binary matrices is a special case of this problem in which 0 and 1 are the only grades. Our algorithm is based on a geometric insight provided by a theorem identifying particular rectangular-shaped submatrices as optimal factors for the decompositions. These factors correspond to formal concepts of the input data and allow for an easy interpretation of the decomposition. We present the problem formulation, the basic geometric insight, the algorithm, an illustrative example, and an experimental evaluation.
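
As a rough illustration of what a product of two matrices with grades can look like, the sketch below composes an object-factor matrix A and a factor-attribute matrix B by taking, for each entry, the supremum over factors of a t-norm applied to the two grades. The abstract does not spell out the product or the aggregation operators, so the sup-t-norm composition, the choice of the Łukasiewicz t-norm, and the names `lukasiewicz` and `sup_tnorm_product` are assumptions made for this example only, not the authors' implementation.

```python
from typing import Callable, List

Matrix = List[List[float]]


def lukasiewicz(a: float, b: float) -> float:
    """Lukasiewicz t-norm on [0, 1] (assumed here as the aggregation operator)."""
    return max(0.0, a + b - 1.0)


def sup_tnorm_product(
    A: Matrix, B: Matrix, tnorm: Callable[[float, float], float] = lukasiewicz
) -> Matrix:
    """Compose an n x k matrix A with a k x m matrix B.

    Entry (i, j) is the maximum over factors l of tnorm(A[i][l], B[l][j]).
    For finitely many factors the supremum is simply the maximum.
    """
    k = len(B)
    if any(len(row) != k for row in A):
        raise ValueError("inner dimensions of A and B do not match")
    m = len(B[0])
    return [
        [max(tnorm(A[i][l], B[l][j]) for l in range(k)) for j in range(m)]
        for i in range(len(A))
    ]


# Hypothetical data: 2 objects, 2 factors, 3 attributes.
A = [[1.0, 0.5],
     [0.0, 1.0]]
B = [[1.0, 0.5, 0.0],
     [0.0, 1.0, 0.5]]
graded = sup_tnorm_product(A, B)

# With grades restricted to 0 and 1 the same code computes a Boolean product.
boolean = sup_tnorm_product([[1, 0], [1, 1]], [[1, 0], [0, 1]])
```

When all grades are 0 or 1, the t-norm behaves as logical AND and the maximum as logical OR, which matches the special case of Boolean products of binary matrices mentioned in the abstract.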

Supported by research plan MSM 6198959214.

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Belohlavek, R., Vychodil, V. (2009). Factor Analysis of Incidence Data via Novel Decomposition of Matrices. In: Ferré, S., Rudolph, S. (eds) Formal Concept Analysis. ICFCA 2009. Lecture Notes in Computer Science, vol 5548. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01815-2_8

  • DOI: https://doi.org/10.1007/978-3-642-01815-2_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-01814-5

  • Online ISBN: 978-3-642-01815-2

  • eBook Packages: Computer Science, Computer Science (R0)
