Matrix Factorization With Aggregated Observations

  • Yoshifumi Aimoto
  • Hisashi Kashima
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7819)


Missing value estimation is a fundamental task in machine learning and data mining. It is used not only as a preprocessing step in data analysis but also for important purposes such as recommendation. Matrix factorization under a low-rank assumption is a basic tool for missing value estimation. However, existing matrix factorization methods cannot be applied directly when some parts of the data are observed as aggregated values of several features in high-level categories. In this paper, we pose the new problem of restoring the original micro observations from aggregated observations, and we give formulations and efficient solutions to the problem by extending the ordinary matrix factorization model. Experiments using synthetic and real data sets show that the proposed method outperforms several baseline methods.
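To make the setting concrete, the following is a minimal sketch of one way a low-rank model can be fitted when some rows are seen only as group totals. It is not the formulation from the paper: the split into fully observed rows and aggregated rows, the 0/1 aggregation matrix A (features by groups), the alternating least squares updates, and the names fit_aggregated_mf and objective are all assumptions made for illustration.

```python
import numpy as np

def objective(X_micro, Y_agg, A, U_m, U_a, V):
    """Squared error on micro rows plus squared error on aggregated rows."""
    return (np.sum((X_micro - U_m @ V.T) ** 2)
            + np.sum((Y_agg - U_a @ V.T @ A) ** 2))

def fit_aggregated_mf(X_micro, Y_agg, A, rank, n_iters=30, seed=0):
    """Alternating least squares sketch (hypothetical, not the paper's method).

    X_micro: rows observed feature by feature, shape (n_m, d)
    Y_agg:   rows observed only as group sums, shape (n_a, g)
    A:       0/1 matrix mapping d features to g groups, shape (d, g)
    """
    rng = np.random.default_rng(seed)
    n_m, d = X_micro.shape
    U_m = rng.normal(size=(n_m, rank))
    U_a = rng.normal(size=(Y_agg.shape[0], rank))
    V = rng.normal(size=(d, rank))
    eye_d = np.eye(d)
    # Targets stacked column by column, matching the Kronecker designs below.
    target = np.concatenate([X_micro.flatten(order="F"),
                             Y_agg.flatten(order="F")])
    for _ in range(n_iters):
        # Row factors for micro rows: X_micro ~ U_m V^T.
        U_m = np.linalg.lstsq(V, X_micro.T, rcond=None)[0].T
        # Row factors for aggregated rows: Y_agg ~ U_a (V^T A).
        U_a = np.linalg.lstsq(A.T @ V, Y_agg.T, rcond=None)[0].T
        # Column factors: vec(U V^T B) = kron(B^T, U) vec(V^T), with
        # B = identity for micro rows and B = A for aggregated rows.
        design = np.vstack([np.kron(eye_d, U_m), np.kron(A.T, U_a)])
        vt = np.linalg.lstsq(design, target, rcond=None)[0]
        V = vt.reshape((rank, d), order="F").T
    return U_m, U_a, V

# Synthetic rank-2 data: 6 features grouped pairwise into 3 categories.
rng = np.random.default_rng(0)
A = np.kron(np.eye(3), np.ones((2, 1)))           # shape (6, 3)
X_true = rng.normal(size=(30, 2)) @ rng.normal(size=(2, 6))
X_micro, X_hidden = X_true[:20], X_true[20:]      # last 10 rows are hidden
Y_agg = X_hidden @ A                              # only group sums are seen

U_m, U_a, V = fit_aggregated_mf(X_micro, Y_agg, A, rank=2)
X_restored = U_a @ V.T                            # estimated micro values
print("objective:", objective(X_micro, Y_agg, A, U_m, U_a, V))
print("restoration RMSE:", np.sqrt(np.mean((X_restored - X_hidden) ** 2)))
```

Each update solves its least squares subproblem exactly, so the objective cannot increase from one iteration to the next, up to rounding error. The Kronecker-product design matrix is convenient for a small example but grows quickly with the number of features, so a practical implementation would likely need a different solver for the column factors.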


Keywords: Singular Value Decomposition; Matrix Factorization; Baseline Method; Purchase Data; Correspondence Matrix





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Yoshifumi Aimoto (1)
  • Hisashi Kashima (1)
  1. Department of Mathematical Informatics, The University of Tokyo, Bunkyo-ku, Japan
