Food Sales Prediction with Meteorological Data — A Case Study of a Japanese Chain Supermarket

  • Xin Liu
  • Ryutaro Ichise
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10387)


The weather has a strong influence on food retailers' sales, as it affects customers' emotional state, drives their purchase decisions, and dictates how much they are willing to spend. In this paper, we introduce a deep learning based method that uses meteorological data to predict the sales of a Japanese chain supermarket. Specifically, our method combines a long short-term memory (LSTM) network and a stacked denoising autoencoder network, both of which are used to learn how sales change with the weather from a large amount of historical data. We show that our method achieves initial success in predicting the sales of some weather-sensitive products such as drinks. In particular, our method outperforms traditional machine learning methods by 19.3%.
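As a rough illustration of the LSTM component mentioned above, the following is a minimal sketch of a single LSTM cell run over a short sequence of daily weather features, followed by a linear readout that produces one sales estimate. It is not the authors' implementation: the class `LSTMCell`, the function `predict_sales`, the three weather features, the hidden size, and the untrained random weights are all assumptions made for illustration, and the stacked denoising autoencoder is omitted.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def init_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class LSTMCell:
    """Hypothetical minimal LSTM cell (standard gates, no peepholes)."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.n_hidden = n_hidden
        # One weight matrix and bias per gate:
        # input (i), forget (f), output (o), candidate (g).
        self.W = {k: init_matrix(n_hidden, n_in + n_hidden, rng) for k in "ifog"}
        self.b = {k: [0.0] * n_hidden for k in "ifog"}

    def step(self, x, h, c):
        z = list(x) + list(h)  # concatenate current input and previous hidden state
        pre = {
            k: [a + b for a, b in zip(matvec(self.W[k], z), self.b[k])]
            for k in "ifog"
        }
        i = [sigmoid(v) for v in pre["i"]]
        f = [sigmoid(v) for v in pre["f"]]
        o = [sigmoid(v) for v in pre["o"]]
        g = [math.tanh(v) for v in pre["g"]]
        # The cell state keeps part of the old memory and adds gated new content.
        c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
        return h_new, c_new


def predict_sales(cell, w_out, b_out, weather_seq):
    """Run the cell over the sequence and apply a linear readout to the last hidden state."""
    h = [0.0] * cell.n_hidden
    c = [0.0] * cell.n_hidden
    for x in weather_seq:
        h, c = cell.step(x, h, c)
    return sum(w * hj for w, hj in zip(w_out, h)) + b_out


# Synthetic stand-in data: 7 days of (temperature, humidity, precipitation),
# each already scaled to [0, 1]. These are not values from the paper.
rng = random.Random(42)
weather_seq = [[rng.random() for _ in range(3)] for _ in range(7)]

cell = LSTMCell(n_in=3, n_hidden=4, seed=0)
w_out = [0.5, -0.2, 0.1, 0.3]  # assumed readout weights, untrained
prediction = predict_sales(cell, w_out, 0.0, weather_seq)
print(prediction)
```

Because the weights are random and untrained, the printed number carries no meaning. The sketch only shows how a weather sequence flows through the gates to a single scalar output; in practice the weights would be fitted to historical sales data.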


Sales prediction · LSTM · Autoencoder · Meteorological data



We gratefully thank the Japan Weather Association, especially Mr. Tomohiro Yoshikai, for supporting our work.



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. National Institute of Informatics, Tokyo, Japan
