Space–time image layout

Cameras are now ubiquitous in our lives. A given activity is often captured by multiple people from different viewpoints, resulting in a sizable collection of photographs. We present a method that effectively organizes this spatiotemporal content. Given an unorganized collection of photographs, taken by a number of photographers and capturing a dynamic event at a number of time steps, we would like to organize the collection into a space–time table. The organization is an embedding of the photographs into clusters that preserve the viewpoint and time order. Our method relies on a self-organizing map (SOM), a neural network that embeds the training data (the set of images) into a discrete domain. We introduce BiSOM, a variation of SOM that considers two features (space and time) rather than a single one, to lay out the given photograph collection in a table. We demonstrate our method on several challenging datasets, using different space and time descriptors.
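To make the SOM idea concrete, the following is a minimal sketch of a standard one-dimensional Kohonen SOM [16], not of BiSOM itself, whose details are not given in this abstract. The function names (`train_som`, `best_matching_unit`), the linear decay schedules, and the toy one-dimensional descriptors are all assumptions chosen for illustration.

```python
import math
import random


def best_matching_unit(nodes, x):
    """Index of the node whose weight vector is closest to sample x."""
    return min(
        range(len(nodes)),
        key=lambda j: sum((a - b) ** 2 for a, b in zip(nodes[j], x)),
    )


def train_som(data, n_nodes, epochs=200, lr0=0.5, seed=0):
    """Train a 1-D SOM (hypothetical helper); returns the node weight vectors."""
    rng = random.Random(seed)
    dim = len(data[0])
    sigma0 = max(n_nodes / 2.0, 1.0)  # assumed initial neighbourhood radius
    # Initialise each node from a randomly chosen training sample.
    nodes = [list(rng.choice(data)) for _ in range(n_nodes)]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)               # learning rate decays linearly
        sigma = sigma0 * (1.0 - frac) + 1e-3  # neighbourhood radius shrinks
        for x in rng.sample(data, len(data)):  # visit samples in random order
            bmu = best_matching_unit(nodes, x)
            for j, w in enumerate(nodes):
                # Gaussian neighbourhood on the discrete node index.
                h = math.exp(-((j - bmu) ** 2) / (2.0 * sigma ** 2))
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
    return nodes


# Toy stand-in for image descriptors: six samples around three values.
descriptors = [[0.0], [0.1], [5.0], [5.1], [10.0], [10.1]]
som_nodes = train_som(descriptors, n_nodes=3)
clusters = [best_matching_unit(som_nodes, d) for d in descriptors]
```

Each sample is assigned to the cell of its best matching unit, so the discrete node indices play the role of cluster labels. A two-feature variant such as the one the abstract describes would need a second descriptor and a table-shaped grid, which this sketch does not attempt.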

  1. The silhouette index \(-1 \le s_i(j) \le 1\) indicates how well the element i lies within the cluster j. A value of \(s_i(j)\) close to positive one means that the datum i is appropriately clustered in the cluster j; conversely, a value close to negative one means that the datum i is unlikely to belong to cluster j.

  2. The Rand index \(0 \le R_I \le 1\) is a measure of the similarity between two clustering results, with 0 indicating that the two clusterings do not agree on any pair of points and 1 indicating that the clusterings are exactly the same.

  3. The boats and slides datasets are courtesy of Dekel et al. [9].

  4. The swapping distance is defined as the minimum number of swaps, or transpositions, of two adjacent clusters that transforms one permutation into another.
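The three measures defined in the notes above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the evaluation code used in the article: the function names are hypothetical, the silhouette follows the standard definition of Rousseeuw [27] for an element's own cluster, and the swapping distance is computed as an inversion count, which equals the minimum number of adjacent transpositions between two permutations.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of element pairs on which two clusterings agree.

    Requires at least two elements.
    """
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def swapping_distance(order_a, order_b):
    """Minimum number of adjacent swaps that turn order_a into order_b."""
    position = {item: idx for idx, item in enumerate(order_b)}
    seq = [position[item] for item in order_a]
    # Each inversion needs exactly one adjacent swap to be resolved.
    return sum(1 for i, j in combinations(range(len(seq)), 2) if seq[i] > seq[j])


def silhouette(i, labels, dist):
    """Standard silhouette of element i; dist(a, b) takes element indices."""
    def mean_dist(cluster):
        members = [k for k, lab in enumerate(labels) if lab == cluster and k != i]
        if not members:
            return None
        return sum(dist(i, k) for k in members) / len(members)

    a = mean_dist(labels[i])  # cohesion: mean distance within own cluster
    others = [mean_dist(c) for c in set(labels) if c != labels[i]]
    others = [d for d in others if d is not None]
    if a is None or not others:
        return 0.0  # assumed convention for singleton or single-cluster cases
    b = min(others)  # separation: nearest other cluster
    return 0.0 if max(a, b) == 0 else (b - a) / max(a, b)


points = [0.0, 1.0, 10.0, 11.0]
labels = [0, 0, 1, 1]
s0 = silhouette(0, labels, lambda p, q: abs(points[p] - points[q]))
ri = rand_index([0, 0, 1, 1], [1, 1, 0, 0])
sd = swapping_distance(["a", "b", "c"], ["c", "b", "a"])
```

In the toy data, the two clusterings passed to `rand_index` differ only in label names, so they agree on every pair, and reversing three clusters takes three adjacent swaps.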


  1. Aoki, T., Aoyagi, T.: Self-organizing maps with asymmetric neighborhood function. Neural Comput. 19(9), 2515–2535 (2007)
  2. Averbuch-Elor, H., Cohen-Or, D.: RingIt: Ring-ordering casual photos of a temporal event. ACM Trans. Graph. 35(1), 33 (2015)
  3. Bashyal, S., Venayagamoorthy, G.K.: Recognition of facial expressions using Gabor wavelets and learning vector quantization. Eng. Appl. Artif. Intell. 21(7), 1056–1064 (2008)
  4. Brahmachari, A.S., Sarkar, S.: View clustering of wide-baseline n-views for photo tourism. In: Graphics, Patterns and Images (Sibgrapi), 2011 24th SIBGRAPI Conference on, pp. 157–164. IEEE (2011)
  5. Caspi, Y., Irani, M.: Spatio-temporal alignment of sequences. IEEE Trans. Pattern Anal. Mach. Intell. 24(11), 1409–1424 (2002)
  6. Chen, L.P., Liu, Y.G., Huang, Z.X., Shi, Y.T.: An improved SOM algorithm and its application to color feature extraction. Neural Comput. Appl. 24(7–8), 1759–1770 (2014)
  7. Cormode, G., Muthukrishnan, S.: The string edit distance matching problem with moves. ACM Trans. Algorithms (TALG) 3(1), 2 (2007)
  8. Moses, Y., Avidan, S., et al.: Space-time tradeoffs in photo sequencing. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 977–984 (2013)
  9. Dekel, T., Moses, Y., Avidan, S.: Photo sequencing. Int. J. Comput. Vis. 110(3), 275–289 (2014)
  10. Dexter, E., Pérez, P., Laptev, I.: Multi-view synchronization of human actions and dynamic scenes. In: BMVC, pp. 1–11. Citeseer (2009)
  11. Endo, M., Ueno, M., Tanabe, T.: A clustering method using hierarchical self-organizing maps. J. VLSI Signal Process. Syst. Signal Image Video Technol. 32(1–2), 105–118 (2002)
  12. Fried, O., DiVerdi, S., Halber, M., Sizikova, E., Finkelstein, A.: IsoMatch: Creating informative grid layouts. In: Computer Graphics Forum, vol. 34, no. 2, pp. 155–166. Wiley (2015)
  13. Furukawa, Y., Curless, B., Seitz, S.M., Szeliski, R.: Towards internet-scale multi-view stereo. In: Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on, pp. 1434–1441. IEEE (2010)
  14. Hubert, L., Arabie, P.: Comparing partitions. J. Classif. 2(1), 193–218 (1985)
  15. Kiang, M.Y.: Extending the Kohonen self-organizing map networks for clustering analysis. Comput. Stat. Data Anal. 38(2), 161–180 (2001)
  16. Kohonen, T.: The self-organizing map. Proc. IEEE 78(9), 1464–1480 (1990)
  17. Lee, J.A., Verleysen, M.: Self-organizing maps with recursive neighborhood adaptation. Neural Netw. 15(8), 993–1003 (2002)
  18. Lefebvre, G., Laurent, C., Ros, J., Garcia, C.: Supervised image classification by SOM activity map comparison. In: Pattern Recognition, 2006. ICPR 2006. 18th International Conference on, vol. 2, pp. 728–731. IEEE (2006)
  19. Ling, H., Jacobs, D.W.: Shape classification using the inner-distance. IEEE Trans. Pattern Anal. Mach. Intell. 29(2), 286–299 (2007)
  20. Mauro, M., Riemenschneider, H., Van Gool, L., Leonardi, R., Brescia, I.: Overlapping camera clustering through dominant sets for scalable 3D reconstruction. In: BMVC, vol. 1, no. 2, p. 3 (2013)
  21. Moehrmann, J., Bernstein, S., Schlegel, T., Werner, G., Heidemann, G.: Improving the usability of hierarchical representations for interactively labeling large image data sets. In: Human-Computer Interaction. Design and Development Approaches, pp. 618–627. Springer, Berlin (2011)
  22. Oliva, A., Torralba, A.: Modeling the shape of the scene: a holistic representation of the spatial envelope. Int. J. Comput. Vis. 42(3), 145–175 (2001)
  23. Ong, S.H., Yeo, N., Lee, K., Venkatesh, Y., Cao, D.: Segmentation of color images using a two-stage self-organizing network. Image Vis. Comput. 20(4), 279–289 (2002)
  24. Quadrianto, N., Song, L., Smola, A.J.: Kernelized sorting. In: Advances in Neural Information Processing Systems, pp. 1289–1296 (2009)
  25. Reinert, B., Ritschel, T., Seidel, H.P.: Interactive by-example design of artistic packing layouts. ACM Trans. Graph. (TOG) 32(6), 218 (2013)
  26. Rother, C., Kolmogorov, V., Blake, A.: GrabCut: interactive foreground extraction using iterated graph cuts. ACM Trans. Graph. (TOG) 23(3), 309–314 (2004)
  27. Rousseeuw, P.J.: Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. J. Comput. Appl. Math. 20, 53–65 (1987)
  28. Shi, J., Malik, J.: Normalized cuts and image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 22(8), 888–905 (2000)
  29. Snavely, N., Seitz, S.M., Szeliski, R.: Photo tourism: exploring photo collections in 3D. In: ACM Transactions on Graphics (TOG), vol. 25, no. 3, pp. 835–846. ACM (2006)
  30. Strong, G., Gong, M.: Self-sorting map: An efficient algorithm for presenting multimedia data in structured layouts. IEEE Trans. Multimedia 16(4), 1045–1058 (2014)
  31. Tomasi, C., Kanade, T.: Shape and motion from image streams under orthography: a factorization method. Int. J. Comput. Vis. 9(2), 137–154 (1992)
  32. Zhou, H., Yuan, Y., Shi, C.: Object tracking using SIFT features and mean shift. Comput. Vis. Image Underst. 113(3), 345–352 (2009)

Author information

Correspondence to Shahar Ben-Ezra.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (mp4 220892 KB)

About this article

Cite this article

Ben-Ezra, S., Cohen-Or, D. Space–time image layout. Vis Comput 34, 417–430 (2018) doi:10.1007/s00371-016-1347-4


  • Image organization
  • Spatial ordering
  • Temporal ordering
  • Self-organizing maps