Learning Temporal Transformations from Time-Lapse Videos

  • Yipin Zhou
  • Tamara L. Berg
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9912)


Based on life-long observations of physical, chemical, and biological phenomena in the natural world, humans can often easily picture in their minds what an object will look like in the future. But what about computers? In this paper, we learn computational models of object transformations from time-lapse videos. In particular, we explore the use of generative models to create depictions of objects at future times. These models address several prediction tasks: generating a future state given a single depiction of an object, generating a future state given two depictions of an object at different times, and generating future states recursively in a recurrent framework. We provide both qualitative and quantitative evaluations of the generated results, and also conduct a human evaluation to compare variations of our models.


Generation, Temporal prediction, Time-lapse video




Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  1. University of North Carolina at Chapel Hill, Chapel Hill, USA
