On Affine Invariant Clustering and Automatic Cast Listing in Movies

Fitzgibbon, Andrew; Zisserman, Andrew

doi:10.1007/3-540-47977-5_20

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2352)

Included in the following conference series:

European Conference on Computer Vision

Abstract

We develop a distance metric for clustering and classification algorithms which is invariant to affine transformations and includes priors on the transformation parameters. Such clustering requirements are generic to a number of problems in computer vision.

We extend existing techniques for affine-invariant clustering, and show that the new distance metric outperforms existing approximations to affine-invariant distance computation, particularly under large transformations. In addition, we incorporate prior probabilities on the transformation parameters. This further regularizes the solution, mitigating a rare but serious tendency of the existing solutions to diverge. For the special case of corresponding point sets, we demonstrate that the affine-invariant measure we introduce may be obtained in closed form.

As an application of these ideas we demonstrate that the faces of the principal cast of a feature film can be generated automatically using clustering with appropriate invariance. This is a very demanding test, as it involves detecting and clustering over tens of thousands of images, with variations including changes in viewpoint, lighting, scale and expression.
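
The abstract notes that, for corresponding point sets, an affine-invariant measure with a prior on the transformation parameters can be obtained in closed form. The following is a minimal sketch of one way such a quantity could be computed, not the authors' formulation: it assumes a squared-error data term, a zero-mean Gaussian prior on the deviation of the linear part from the identity (weight `lam`), and no prior on translation. The function name `affine_prior_distance` and all parameters are hypothetical, and the resulting measure is one-sided (it maps `X` onto `Y`), so it is not symmetric.

```python
import numpy as np


def affine_prior_distance(X, Y, lam=0.0):
    """One-sided affine-aligned squared distance between corresponding points.

    Minimises  sum_i ||A x_i + t - y_i||^2 + lam * ||A - I||_F^2  over (A, t).
    The objective is quadratic in (A, t), so the minimiser is a linear
    least-squares solution (ridge regression shrinking A towards identity).

    X, Y : arrays of shape (n, d); row i of X corresponds to row i of Y.
    lam  : assumed prior weight; lam = 0 gives a purely affine-invariant fit.
    Returns (distance, A, t).
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    n, d = X.shape

    # Homogeneous coordinates: Xh @ P approximates Y, with P of shape (d + 1, d).
    Xh = np.hstack([X, np.ones((n, 1))])

    # Parameters of the identity transform (A = I, t = 0) in the same layout.
    P0 = np.vstack([np.eye(d), np.zeros((1, d))])

    # Square root of the prior precision: penalise the linear part only.
    sqrt_L = np.diag([np.sqrt(lam)] * d + [0.0])

    # Append the prior as extra rows of an ordinary least-squares problem.
    A_aug = np.vstack([Xh, sqrt_L])
    B_aug = np.vstack([Y, sqrt_L @ P0])
    P, *_ = np.linalg.lstsq(A_aug, B_aug, rcond=None)

    residual = Xh @ P - Y
    prior = lam * np.sum((P[:d] - np.eye(d)) ** 2)
    distance = float(np.sum(residual ** 2) + prior)

    # P[:d] stores A transposed (points are row vectors); P[d] is t.
    return distance, P[:d].T, P[d]
```

With `lam=0` the value is zero whenever `Y` is an exact affine image of `X`. Increasing `lam` pulls the estimated transform towards the identity, and the value never exceeds the plain squared Euclidean distance between the two sets, since the identity transform is always a candidate solution.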

Author information

Authors and Affiliations

Visual Geometry Group, Department of Engineering Science, The University of Oxford, UK
Andrew Fitzgibbon & Andrew Zisserman

Editor information

Editors and Affiliations

Centre for Mathematical Sciences, Lund University, Box 118, 22100, Lund, Sweden
Anders Heyden & Gunnar Sparr
The IT University of Copenhagen, Glentevej 67-69, 2400, Copenhagen, NW, Denmark
Mads Nielsen
University of Copenhagen, Universitetsparken 1, 2100, Copenhagen, Denmark
Peter Johansen

About this paper

Cite this paper

Fitzgibbon, A., Zisserman, A. (2002). On Affine Invariant Clustering and Automatic Cast Listing in Movies. In: Heyden, A., Sparr, G., Nielsen, M., Johansen, P. (eds) Computer Vision — ECCV 2002. ECCV 2002. Lecture Notes in Computer Science, vol 2352. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-47977-5_20

DOI: https://doi.org/10.1007/3-540-47977-5_20
Published: 29 April 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43746-8
Online ISBN: 978-3-540-47977-2
eBook Packages: Springer Book Archive
