Discovering Object Classes from Activities

Srikantha, Abhilash; Gall, Juergen

doi:10.1007/978-3-319-10599-4_27

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 8694)

Included in the following conference series:

European Conference on Computer Vision

Abstract

To avoid an expensive manual labelling process or to learn object classes autonomously without human intervention, object discovery techniques have been proposed that extract visually similar objects from weakly labelled videos. However, the problem of discovering small- or medium-sized objects is largely unexplored. We observe that videos with activities involving human-object interactions can serve as weakly labelled data for such cases. Since neither object appearance nor motion is distinct enough to discover objects in such videos, we propose a framework that samples from a space of algorithms and their parameters to extract sequences of object proposals. Furthermore, we model similarity of objects based on appearance and functionality, which is derived from human and object motion. We show that functionality is an important cue for discovering objects from activities and demonstrate the generality of the model on three challenging RGB-D and RGB datasets.

Author information

Authors and Affiliations

University of Bonn, Germany
Abhilash Srikantha & Juergen Gall
MPI for Intelligent Systems, Tuebingen, Germany
Abhilash Srikantha

Editor information

Editors and Affiliations

Department of Computer Science, University of Toronto, 6 King’s College Road, M5H 3S5, Toronto, ON, Canada
David Fleet
Faculty of Electrical Engineering, Department of Cybernetics, Czech Technical University in Prague, Technicka 2, 166 27, Prague 6, Czech Republic
Tomas Pajdla
Max-Planck-Institut für Informatik, Campus E1 4, 66123, Saarbrücken, Germany
Bernt Schiele
ESAT - PSI, iMinds, KU Leuven, Kasteelpark Arenberg 10, Bus 2441, 3001, Leuven, Belgium
Tinne Tuytelaars

About this paper

Cite this paper

Srikantha, A., Gall, J. (2014). Discovering Object Classes from Activities. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds) Computer Vision – ECCV 2014. ECCV 2014. Lecture Notes in Computer Science, vol 8694. Springer, Cham. https://doi.org/10.1007/978-3-319-10599-4_27

DOI: https://doi.org/10.1007/978-3-319-10599-4_27
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10598-7
Online ISBN: 978-3-319-10599-4
eBook Packages: Computer Science, Computer Science (R0)
