Analysis of nutrition data by means of a matrix factorization method

Abstract

We present a factorization framework for analyzing data from a regression learning task with two peculiarities. First, each input can be split into two parts that represent semantically meaningful entities. Second, the performance of regressors is very low. The basic idea is to learn the ordering relations of the target variable instead of its exact value. Each part of the input is mapped into a common Euclidean space in such a way that distance in that space represents the interaction between the two parts. The factorization approach yields reliable models from which it is possible to compute a ranking of the features according to their responsibility for the variation of the target variable. Additionally, the Euclidean representation of the data provides a visualization in which metric properties have clear semantics. We illustrate the approach with a case study: the analysis of a dataset on the variations of Body Mass Index for Age of children after a Food Aid Program deployed in poor rural communities in Southern México. In this case, the two parts of each input are the vector representations of children and of their diets. In addition to discovering latent information, the mapping of inputs allows us to visualize children and diets in a common metric space.
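
As a rough illustration of the idea described in the abstract, the following Python sketch maps the two parts of each input into a shared Euclidean space and learns the maps from ordering relations rather than exact target values. It is not the authors' implementation. The linear maps `W` and `V`, the score (negative squared distance in the common space), the logistic pairwise loss, the plain stochastic gradient descent loop, and all function names and hyperparameters are assumptions made for illustration only.

```python
import numpy as np

# Illustrative sketch only: every name and modelling choice here is an assumption.

def score(W, V, x, y):
    """Assumed score: negative squared distance between the two embedded parts."""
    d = W @ x - V @ y
    return -float(d @ d)

def pair_loss(W, V, xi, yi, xj, yj):
    """Logistic loss for a pair in which example i should rank above example j."""
    margin = score(W, V, xi, yi) - score(W, V, xj, yj)
    return float(np.logaddexp(0.0, -margin))  # stable log(1 + exp(-margin))

def pair_grads(W, V, xi, yi, xj, yj):
    """Analytic gradients of pair_loss with respect to W and V."""
    di = W @ xi - V @ yi
    dj = W @ xj - V @ yj
    margin = -(di @ di) + (dj @ dj)
    # Derivative of the loss w.r.t. the margin; clipping avoids overflow in exp.
    g = -1.0 / (1.0 + np.exp(np.clip(margin, -50.0, 50.0)))
    gW = g * (-2.0 * np.outer(di, xi) + 2.0 * np.outer(dj, xj))
    gV = g * (2.0 * np.outer(di, yi) - 2.0 * np.outer(dj, yj))
    return gW, gV

def ordered_pairs(t):
    """All index pairs (i, j) whose targets satisfy t[i] > t[j]."""
    n = len(t)
    return [(i, j) for i in range(n) for j in range(n) if t[i] > t[j]]

def fit(X, Y, t, k=2, lr=0.01, epochs=20, seed=0):
    """Learn W (k x dim of X) and V (k x dim of Y) by SGD over ordered pairs."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=0.1, size=(k, X.shape[1]))
    V = rng.normal(scale=0.1, size=(k, Y.shape[1]))
    pairs = ordered_pairs(t)
    for _ in range(epochs):
        for p in rng.permutation(len(pairs)):
            i, j = pairs[p]
            gW, gV = pair_grads(W, V, X[i], Y[i], X[j], Y[j])
            W -= lr * gW
            V -= lr * gV
    return W, V

def pairwise_accuracy(W, V, X, Y, t):
    """Fraction of ordered pairs that the learned score ranks correctly."""
    s = np.array([score(W, V, x, y) for x, y in zip(X, Y)])
    pairs = ordered_pairs(t)
    if not pairs:
        return 0.0
    return float(np.mean([s[i] > s[j] for i, j in pairs]))

if __name__ == "__main__":
    # Synthetic data standing in for the two input parts; not the paper's dataset.
    rng = np.random.default_rng(42)
    X = rng.normal(size=(30, 5))   # first part of each input
    Y = rng.normal(size=(30, 4))   # second part of each input
    A, B = rng.normal(size=(2, 5)), rng.normal(size=(2, 4))
    t = -np.sum((X @ A.T - Y @ B.T) ** 2, axis=1)  # synthetic target
    W, V = fit(X, Y, t)
    print("training pairwise accuracy:", pairwise_accuracy(W, V, X, Y, t))
```

With `k=2`, the rows of `X @ W.T` and `Y @ V.T` are points in a plane, so both parts of the input can be drawn in the same scatter plot. In this sketch, a smaller distance between the two embedded parts corresponds to a higher predicted rank; the opposite sign convention would be an equally valid assumption.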

Acknowledgments

The research reported here was supported in part by grant TIN2011-23558 from the MICINN (Ministerio de Ciencia e Innovación, Spain). Edna Gamboa was supported by a Ph.D. grant from CONACYT (Consejo Nacional de Ciencia y Tecnología, México). The paper was written while Antonio Bahamonde was visiting Cornell University with grants from Movilidad Campus de Excelencia Internacional (Universidad de Oviedo) and from the Programa Nacional de Movilidad de Recursos Humanos del Plan Nacional de Investigación (Ministerio de Educación, Cultura y Deporte, Spain). The dataset was gathered in a project supported by the Ministerio de Desarrollo Social de México.

Author information

Corresponding author

Correspondence to Jorge Díez.

About this article

Cite this article

Díez, J., Gamboa, E., González de Cossío, T. et al. Analysis of nutrition data by means of a matrix factorization method. Prog Artif Intell 3, 119–127 (2015). https://doi.org/10.1007/s13748-015-0062-0

  • DOI: https://doi.org/10.1007/s13748-015-0062-0

Keywords

  • Matrix factorization
  • Learning to rank
  • Feature selection
  • Data analysis
  • Nutrition data
  • Body mass index (BMI)