A Semi-automated Method for Object Segmentation in Infant’s Egocentric Videos to Study Object Perception

  • Qazaleh Mirsharif
  • Sidharth Sadani
  • Shishir Shah
  • Hanako Yoshida
  • Joseph Burling
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 460)


Object segmentation in infants’ egocentric videos is a fundamental step in studying how children perceive objects in the early stages of development. From a computer vision perspective, object segmentation in such videos poses several challenges: the child’s view is unfocused and often involves large head movements, which cause sudden changes in the child’s point of view and, in turn, frequent changes in object properties such as size, shape, and illumination. In this paper, we develop a semi-automated, domain-specific method that addresses these concerns and facilitates the object annotation process for cognitive scientists, allowing them to select and monitor the object under segmentation. The method starts with an annotation of the desired object by the user and employs graph cut segmentation and optical flow computation to predict the object mask for subsequent video frames automatically. To maintain accurate segmentation, we use domain-specific heuristic rules to re-initialize the program with new user input whenever object properties change dramatically. The evaluations demonstrate the high speed and accuracy of the presented method for object segmentation in voluminous egocentric videos. We apply the proposed method to investigate potential patterns in the distribution of objects in the child’s view at progressive ages.
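The propagate-then-check loop described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the dense flow field is assumed to be given by some optical flow estimator, the graph cut refinement step is left out, and the function names and the 0.5 area-change threshold are hypothetical. The abstract does not state the actual heuristic rules, so the area test below is only a stand-in for "object properties change dramatically".

```python
def propagate_mask(mask, flow):
    """Shift each foreground pixel of a binary mask along its flow vector.

    mask: list of rows holding 0/1 values for the current frame.
    flow: same shape; flow[y][x] is the (dx, dy) displacement to the next frame.
    Pixels that move outside the frame are dropped.
    """
    height, width = len(mask), len(mask[0])
    predicted = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if mask[y][x]:
                dx, dy = flow[y][x]
                nx, ny = x + int(round(dx)), y + int(round(dy))
                if 0 <= nx < width and 0 <= ny < height:
                    predicted[ny][nx] = 1
    return predicted


def mask_area(mask):
    """Number of foreground pixels in a binary mask."""
    return sum(sum(row) for row in mask)


def needs_reinitialization(prev_mask, new_mask, max_area_change=0.5):
    """Hypothetical rule: request a new user annotation when the object
    area changes by more than the given fraction between frames."""
    prev_area = mask_area(prev_mask)
    if prev_area == 0:
        return True  # nothing to track, so the user must annotate again
    change = abs(mask_area(new_mask) - prev_area) / prev_area
    return change > max_area_change


# Toy example: a 2x2 object in a 4x4 frame, moved one pixel to the right.
mask = [[0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0]]
flow = [[(1, 0)] * 4 for _ in range(4)]
predicted = propagate_mask(mask, flow)
ask_user = needs_reinitialization(mask, predicted)
```

In a full pipeline of this kind, the predicted mask would typically seed the segmentation of the next frame, and a positive check would pause the loop so the user can re-annotate the object.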


Keywords: Child’s egocentric video; Cognitive development; Domain-specific heuristic rules; Head camera; Object perception; Object segmentation; Optical flow



Copyright information

© Springer Science+Business Media Singapore 2017

Authors and Affiliations

  • Qazaleh Mirsharif (1, corresponding author)
  • Sidharth Sadani (2)
  • Shishir Shah (1)
  • Hanako Yoshida (3)
  • Joseph Burling (3)
  1. Department of Computer Science, University of Houston, Houston, USA
  2. Department of Electronics & Communication, Indian Institute of Technology, Roorkee, India
  3. Department of Psychology, University of Houston, Houston, USA
