Shape Similarity for 3D Video Sequences of People

Huang, Peng; Hilton, Adrian; Starck, Jonathan

doi:10.1007/s11263-010-0319-9

Published: 05 February 2010

Volume 89, pages 362–381, (2010)

International Journal of Computer Vision

Abstract

This paper presents a performance evaluation of shape similarity metrics for 3D video sequences of people with unknown temporal correspondence. The performance of the similarity measures is compared by evaluating Receiver Operator Characteristics for classification against ground truth on a comprehensive database of synthetic 3D video sequences comprising animations of fourteen people performing twenty-eight motions. The static shape similarity metrics (shape distribution, spin image, shape histogram and spherical harmonics) are evaluated using optimal parameter settings for each approach. Shape histograms with volume sampling are found to consistently give the best performance for different people and motions. Static shape similarity is then extended over time to eliminate temporal ambiguity. Time-filtering of the static shape similarity and two novel shape-flow descriptors are evaluated against temporal ground truth. This evaluation demonstrates that shape-flow with a multi-frame alignment of motion sequences achieves the best performance, is stable for different people and motions, and overcomes the ambiguity in static shape similarity. Time-filtering of the static shape histogram similarity measure with a fixed window size achieves marginally lower performance for linear motions, at the same computational cost as static shape descriptors. The performance of the temporal shape descriptors is validated on real 3D video sequences of nine actors performing a variety of movements. Time-filtered shape histograms are shown to reliably identify frames from 3D video sequences with similar shape and motion for people with loose clothing and complex motion.
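The idea of time-filtering a static similarity measure with a fixed window can be illustrated with a minimal sketch. The abstract does not give the formula, so the version below rests on assumptions: the filter is taken to be a plain average of the frame-to-frame dissimilarity matrix along its diagonals, and border entries average over fewer terms. The names `time_filter` and `half_window` and the toy 1D "poses" are hypothetical and chosen only for illustration; they are not the authors' implementation.

```python
def time_filter(sim, half_window):
    """Average a frame-to-frame dissimilarity matrix along its diagonals.

    sim[i][j] is the static dissimilarity between frame i of one sequence
    and frame j of another. Offsets that fall outside either sequence are
    skipped, so border entries average over fewer terms (an assumption).
    """
    rows, cols = len(sim), len(sim[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            total, count = 0.0, 0
            for k in range(-half_window, half_window + 1):
                a, b = i + k, j + k
                if 0 <= a < rows and 0 <= b < cols:
                    total += sim[a][b]
                    count += 1
            out[i][j] = total / count  # count >= 1 because k = 0 is always valid
    return out


# Toy data: a 1D "pose" swings out and back, so frames 1 and 3 share the
# same pose but move in opposite directions.
poses = [0.0, 1.0, 2.0, 1.0, 0.0]
static = [[abs(p - q) for q in poses] for p in poses]
filtered = time_filter(static, half_window=1)

print(static[1][3])    # static measure cannot tell frames 1 and 3 apart
print(filtered[1][3])  # filtered measure separates them
print(filtered[1][1])  # a frame still matches itself
```

In this toy case the static measure reports frames 1 and 3 as identical, while the filtered measure is non-zero for that pair because the neighbouring frames differ. The cost per entry grows only with the window size, which fits the abstract's remark that time-filtering needs no descriptor beyond the static one.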

Author information

Authors and Affiliations

Centre for Vision, Speech and Signal Processing (CVSSP), University of Surrey, Guildford, GU2 7XH, UK
Peng Huang, Adrian Hilton & Jonathan Starck

Corresponding author

Correspondence to Peng Huang.

About this article

Cite this article

Huang, P., Hilton, A. & Starck, J. Shape Similarity for 3D Video Sequences of People. Int J Comput Vis 89, 362–381 (2010). https://doi.org/10.1007/s11263-010-0319-9

Issue Date: September 2010
DOI: https://doi.org/10.1007/s11263-010-0319-9
