Abstract
Motion segmentation, i.e., the problem of clustering data in multiple images based on different 3D motions, is an important task for reconstructing and understanding dynamic scenes. In this paper we address motion segmentation in multiple images by combining partial results coming from triplets of images, which are obtained by fitting a number of trifocal tensors to correspondences. We exploit the fact that the trifocal tensor is a stronger model than the fundamental matrix, as it provides fewer but more reliable matches over three images than fundamental matrices provide over two. We also consider an alternative solution that merges partial results coming from both triplets and pairs of images, showing the strength of three-frame segmentation in combination with two-frame segmentation. Our experiments on real data, from standard as well as new datasets, demonstrate the superior accuracy of the proposed approaches when compared to previous techniques.
Notes
- 1.
Observe that these permutations are represented as square matrices since we are assuming that the number of motions is known and constant over all the frames.
- 2.
- 3.
- 4.
- 5.
- 6.
This value was optimally determined on a small subset of sequences (Penguin, Flowers, Pencils and Bag [3]). As for the remaining parameters of RPA (e.g. the number of sampled hypotheses), we used default values provided in the code by the authors.
- 7.
This choice is motivated by the fact that, in the presence of high corruption among the correspondences, one may not expect to classify all the points, as explained in [3]. Observe also that this error metric reports the fraction of wrongly labelled data, which is the quantity one wants to minimize in practice.
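As an illustration of the error metric mentioned in the last note, the following is a minimal sketch rather than the authors' evaluation code. It assumes that labels are integers from 1 to the number of motions, that 0 marks a point left unclassified, that unclassified points are not counted as wrong but remain in the denominator, and that predicted labels are matched to the ground truth by the best relabelling. The function name `misclassification_error` and all of these conventions are hypothetical.

```python
from itertools import permutations


def misclassification_error(predicted, truth, num_motions):
    """Fraction of points given a wrong label, after the best relabelling.

    Assumed conventions (not taken from the paper): labels are integers in
    1..num_motions, and 0 in `predicted` marks an unclassified point.
    Unclassified points are not counted as wrong, but they stay in the
    denominator.
    """
    if len(predicted) != len(truth):
        raise ValueError("label lists must have the same length")
    if len(truth) == 0:
        raise ValueError("at least one point is required")

    labels = range(1, num_motions + 1)
    best_wrong = len(truth)
    # Cluster labels are arbitrary, so try every relabelling of the prediction.
    # This brute-force search is only practical for a small number of motions.
    for perm in permutations(labels):
        relabel = dict(zip(labels, perm))
        wrong = sum(
            1
            for p, t in zip(predicted, truth)
            if p != 0 and relabel[p] != t
        )
        best_wrong = min(best_wrong, wrong)
    return best_wrong / len(truth)


# Six points, two motions; the fifth point is left unclassified (label 0).
example_error = misclassification_error([2, 2, 1, 1, 0, 1], [1, 1, 2, 2, 2, 1], 2)
```

With many motions, the exhaustive search over relabellings could be replaced by an assignment solver such as the Hungarian method.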
References
Arrigoni, F., Fusiello, A.: Synchronization problems in computer vision with closed-form solutions. Int. J. Comput. Vis. 128, 26–52 (2020)
Arrigoni, F., Pajdla, T.: Motion segmentation via synchronization. In: IEEE International Conference on Computer Vision Workshops (ICCVW) (2019)
Arrigoni, F., Pajdla, T.: Robust motion segmentation from pairwise matches. In: Proceedings of the International Conference on Computer Vision (2019)
Barath, D., Matas, J.: Multi-class model fitting by energy minimization and mode-seeking. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11220, pp. 229–245. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01270-0_14
Chin, T.J., Suter, D., Wang, H.: Multi-structure model selection via kernel optimisation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3586–3593 (2010)
Delong, A., Osokin, A., Isack, H.N., Boykov, Y.: Fast approximate energy minimization with label costs. Int. J. Comput. Vis. 96(1), 1–27 (2012)
Elhamifar, E., Vidal, R.: Sparse subspace clustering: algorithm, theory, and applications. IEEE Trans. Pattern Anal. Mach. Intell. 35(11), 2765–2781 (2013)
Fischler, M., Bolles, R.: Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Morgan Kaufmann Readings Ser. 24, 726–740 (1987)
Geiger, A., Lenz, P., Urtasun, R.: Are we ready for autonomous driving? The KITTI vision benchmark suite. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2012)
Hartley, R., Vidal, R.: The multibody trifocal tensor: motion segmentation from 3 perspective views. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, vol. 1, pp. I-769-I-775, June 2004. https://doi.org/10.1109/CVPR.2004.1315109
Hartley, R.I., Zisserman, A.: Multiple View Geometry in Computer Vision, 2nd edn. Cambridge University Press, Cambridge (2004)
Hartley, R.: Lines and points in three views and the trifocal tensor. Int. J. Comput. Vis. 22(2), 125–140 (1997)
Holland, P.W., Welsch, R.E.: Robust regression using iteratively reweighted least-squares. Commun. Stat. Theory Methods 6(9), 813–827 (1977)
Isack, H., Boykov, Y.: Energy-based geometric multi-model fitting. Int. J. Comput. Vis. 97(2), 123–147 (2012)
Ji, P., Li, H., Salzmann, M., Dai, Y.: Robust motion segmentation with unknown correspondences. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8694, pp. 204–219. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10599-4_14
Ji, P., Li, H., Salzmann, M., Zhong, Y.: Robust multi-body feature tracker: a segmentation-free approach. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2016)
Ji, P., Salzmann, M., Li, H.: Shape interaction matrix revisited and robustified: efficient subspace clustering with corrupted and incomplete data. In: Proceedings of the International Conference on Computer Vision, pp. 4687–4695 (2015)
Julià, L.F., Monasse, P.: A critical review of the trifocal tensor estimation. In: Paul, M., Hitoshi, C., Huang, Q. (eds.) PSIVT 2017. LNCS, vol. 10749, pp. 337–349. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-75786-5_28
Kim, J.B., Kim, H.J.: Efficient region-based motion segmentation for a video monitoring system. Pattern Recogn. Lett. 24(1), 113–128 (2003)
Kuang, D., Yun, S., Park, H.: SymNMF: nonnegative low-rank approximation of a similarity matrix for graph clustering. J. Global Optim. 62(3), 545–574 (2014). https://doi.org/10.1007/s10898-014-0247-2
Kuhn, H.W.: The Hungarian method for the assignment problem. Naval Res. Logistics Q. 2(2), 83–97 (1955)
Lai, T., Wang, H., Yan, Y., Chin, T.J., Zhao, W.L.: Motion segmentation via a sparsity constraint. IEEE Trans. Intell. Transp. Syst. 18(4), 973–983 (2017)
Li, Z., Guo, J., Cheong, L.F., Zhou, S.Z.: Perspective motion segmentation via collaborative clustering. In: Proceedings of the International Conference on Computer Vision, pp. 1369–1376 (2013)
Lin, Z., Chen, M., Ma, Y.: The augmented Lagrange multiplier method for exact recovery of corrupted low-rank matrices. eprint arXiv:1009.5055 (2010)
Liu, G., Lin, Z., Yan, S., Sun, J., Yu, Y., Ma, Y.: Robust recovery of subspace structures by low-rank representation. IEEE Trans. Pattern Anal. Mach. Intell. 26(5), 171–184 (2013)
Lowe, D.G.: Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 60(2), 91–110 (2004). https://doi.org/10.1023/B:VISI.0000029664.99615.94
Magri, L., Fusiello, A.: Robust multiple model fitting with preference analysis and low-rank approximation. In: Proceedings of the British Machine Vision Conference, pp. 20.1-20.12. BMVA Press, September 2015
Magri, L., Fusiello, A.: Multiple models fitting as a set coverage problem. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3318–3326, June 2016
Olsson, C., Enqvist, O.: Stable structure from motion for unordered image collections. In: Heyden, A., Kahl, F. (eds.) SCIA 2011. LNCS, vol. 6688, pp. 524–535. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-21227-7_49
Ozden, K.E., Schindler, K., Van Gool, L.: Multibody structure-from-motion in practice. IEEE Trans. Pattern Anal. Mach. Intell. 32(6), 1134–1141 (2010)
Pachauri, D., Kondor, R., Singh, V.: Solving the multi-way matching problem by permutation synchronization. In: Advances in Neural Information Processing Systems 26, pp. 1860–1868. Curran Associates, Inc. (2013)
Pavan, A., Tangwongsan, K., Tirthapura, S., Wu, K.L.: Counting and sampling triangles from a graph stream. Proc. VLDB Endowment 6(14), 1870–1881 (2013)
Rao, S., Tron, R., Vidal, R., Ma, Y.: Motion segmentation in the presence of outlying, incomplete, or corrupted trajectories. IEEE Trans. Pattern Anal. Mach. Intell. 32(10), 1832–1845 (2010)
Rubino, C., Del Bue, A., Chin, T.J.: Practical motion segmentation for urban street view scenes. In: Proceedings of the IEEE International Conference on Robotics and Automation (2018)
Sabzevari, R., Scaramuzza, D.: Monocular simultaneous multi-body motion segmentation and reconstruction from perspective views. In: Proceedings of the IEEE International Conference on Robotics and Automation, pp. 23–30 (2014)
Saputra, M.R.U., Markham, A., Trigoni, N.: Visual SLAM and structure from motion in dynamic environments: a survey. ACM Comput. Surveys 51(2), 37:1–37:36 (2018)
Schindler, K., Suter, D., Wang, H.: A model-selection framework for multibody structure-and-motion of image sequences. Int. J. Comput. Vis. 79(2), 159–177 (2008)
Shen, Y., Huang, Q., Srebro, N., Sanghavi, S.: Normalized spectral map synchronization. In: Advances in Neural Information Processing Systems 29, pp. 4925–4933. Curran Associates, Inc. (2016)
Toldo, R., Fusiello, A.: Robust multiple structures estimation with J-Linkage. In: Proceedings of the European Conference on Computer Vision, pp. 537–547 (2008)
Torr, P.H.S., Zisserman, A.: Concerning Bayesian motion segmentation, model averaging, matching and the trifocal tensor. In: Burkhardt, H., Neumann, B. (eds.) ECCV 1998. LNCS, vol. 1406, pp. 511–527. Springer, Heidelberg (1998). https://doi.org/10.1007/BFb0055687
Torr, P.H.S., Zisserman, A., Murray, D.W.: Motion clustering using the trilinear constraint over three views. In: Europe-China Workshop on Geometric Modelling and Invariants for Computer Vision, pp. 118–125. Springer (1995)
Tron, R., Vidal, R.: A benchmark for the comparison of 3-D motion segmentation algorithms. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–8. IEEE (2007)
Tron, R., Zhou, X., Esteves, C., Daniilidis, K.: Fast multi-image matching via density-based clustering. In: Proceedings of the International Conference on Computer Vision, pp. 4077–4086 (2017)
Vidal, R., Ma, Y., Sastry, S.: Generalized principal component analysis (GPCA). IEEE Trans. Pattern Anal. Mach. Intell. 27(12), 1945–1959 (2005)
Von Luxburg, U.: A tutorial on spectral clustering. Stat. Comput. 17(4), 395–416 (2007)
Wang, Y., Liu, Y., Blasch, E., Ling, H.: Simultaneous trajectory association and clustering for motion segmentation. IEEE Signal Process. Lett. 25(1), 145–149 (2018)
Xu, X., Cheong, L.F., Li, Z.: Motion segmentation by exploiting complementary geometric models. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2859–2867 (2018)
Yan, J., Pollefeys, M.: A general framework for motion segmentation: independent, articulated, rigid, non-rigid, degenerate and nondegenerate. In: Proceedings of the European Conference on Computer Vision, pp. 94–106 (2006)
Acknowledgements
This research was supported by the European Regional Development Fund under IMPACT No. CZ.02.1.01/0.0/0.0/15_003/0000468, R4I 4.0 No. CZ.02.1.01/0.0/0.0/15_003/0000470, EU H2020 ARtwin No. 856994, and EU H2020 SPRING No. 871245 Projects.
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Arrigoni, F., Magri, L., Pajdla, T. (2020). On the Usage of the Trifocal Tensor in Motion Segmentation. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science, vol 12365. Springer, Cham. https://doi.org/10.1007/978-3-030-58565-5_31
DOI: https://doi.org/10.1007/978-3-030-58565-5_31
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58564-8
Online ISBN: 978-3-030-58565-5
eBook Packages: Computer Science, Computer Science (R0)