Human Pose Estimation Using Learnt Probabilistic Region Similarities and Partial Configurations

  • Timothy J. Roberts
  • Stephen J. McKenna
  • Ian W. Ricketts
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3024)

Abstract

A model of human appearance is presented for efficient pose estimation from real-world images. In common with related approaches, a high-level model defines a space of configurations which can be associated with image measurements and thus scored. A search is performed to identify good configuration(s). Such an approach is challenging because the configuration space is high dimensional, the search is global, and the appearance of humans in images is complex due to background clutter, shape uncertainty and texture.

The system presented here is novel in several respects. The formulation allows differing numbers of parts to be parameterised and allows poses of differing dimensionality to be compared in a principled manner based upon learnt likelihood ratios. In contrast with current approaches, this allows a part-based search in the presence of self-occlusion. Furthermore, it provides a principled, automatic approach to occlusion by other objects. View-based probabilistic models of body part shapes are learnt that represent intra- and inter-person variability (in contrast to rigid geometric primitives). The probabilistic region for each part is transformed into the image using the configuration hypothesis and used to collect two appearance distributions, one for the part’s foreground and one for the adjacent background. Likelihood ratios for single parts are learnt from the dissimilarity of these two distributions. This technique is distinct from restrictive modelling of a specific foreground or background. It is demonstrated that this likelihood allows better discrimination of body parts in real-world images than contour-to-edge matching techniques. Furthermore, the likelihood is less sparse and noisy, making coarse sampling and local search more effective. A likelihood ratio for body part pairs with similar appearances is also learnt. Together with a model of inter-part distances, this better describes correct higher-dimensional configurations. Results from applying an optimization scheme to the likelihood model for challenging real-world images are presented.
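The single-part likelihood described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: greyscale intensity histograms as the appearance distributions, a chi-squared measure as the dissimilarity (the abstract does not name one), and a binned histogram ratio as the learnt likelihood ratio. All function names are hypothetical.

```python
import numpy as np

def weighted_histogram(values, weights, bins=8):
    """Normalised histogram of intensities in [0, 1], weighted per pixel."""
    hist, _ = np.histogram(values, bins=bins, range=(0.0, 1.0), weights=weights)
    total = hist.sum()
    # Fall back to a uniform distribution if the region has no support.
    return hist / total if total > 0 else np.full(bins, 1.0 / bins)

def chi_squared(p, q, eps=1e-12):
    """Chi-squared dissimilarity in [0, 1] (assumed measure, not from the paper)."""
    return 0.5 * np.sum((p - q) ** 2 / (p + q + eps))

def part_dissimilarity(image, fg_prob, bg_prob, bins=8):
    """Dissimilarity between foreground and adjacent-background appearance.

    fg_prob and bg_prob are per-pixel membership probabilities, standing in
    for the probabilistic part region after it has been transformed into the
    image by a configuration hypothesis.
    """
    fg = weighted_histogram(image.ravel(), fg_prob.ravel(), bins)
    bg = weighted_histogram(image.ravel(), bg_prob.ravel(), bins)
    return chi_squared(fg, bg)

def learn_likelihood_ratio(on_part, off_part, bins=10, eps=1e-3):
    """Learn p(d | on part) / p(d | off part) from training dissimilarities."""
    edges = np.linspace(0.0, 1.0, bins + 1)
    p_on = np.histogram(on_part, bins=edges)[0] + eps   # eps avoids zero bins
    p_off = np.histogram(off_part, bins=edges)[0] + eps
    ratio = (p_on / p_on.sum()) / (p_off / p_off.sum())

    def likelihood_ratio(d):
        idx = np.clip(np.searchsorted(edges, d, side="right") - 1, 0, bins - 1)
        return ratio[idx]

    return likelihood_ratio

# Synthetic example: a bright limb-like bar on a dark background.
image = np.zeros((20, 20))
image[5:15, 8:12] = 0.9
fg_mask = np.zeros_like(image)
fg_mask[5:15, 8:12] = 1.0
bg_mask = np.zeros_like(image)
bg_mask[3:17, 6:14] = 1.0
bg_mask[5:15, 8:12] = 0.0          # background is the ring around the part

# Made-up training dissimilarities, for illustration only.
lr = learn_likelihood_ratio([0.85, 0.92, 0.97, 0.75], [0.02, 0.05, 0.12, 0.03])
score = lr(part_dissimilarity(image, fg_mask, bg_mask))
```

Because each part contributes a ratio rather than a raw likelihood, a part that is left out of a hypothesis contributes a neutral factor of one. Under the assumptions of this sketch, that is one way to compare configurations with different numbers of parts.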

Keywords

Body Part; Probabilistic Region; Part Likelihood; Single Part; Object Occlusion
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Timothy J. Roberts (1)
  • Stephen J. McKenna (1)
  • Ian W. Ricketts (1)
  1. Division of Applied Computing, University of Dundee, Dundee, Scotland
