Single-Image Insect Pose Estimation by Graph Based Geometric Models and Random Forests

  • Minmin Shen
  • Le Duan
  • Oliver Deussen
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9913)


We propose a new method for detailed insect pose estimation, which aims to detect landmarks such as the tips of an insect’s antennae and mouthparts from a single image. We formulate this problem as inferring a mapping from the appearance of an insect to its corresponding pose, and present a unified framework that jointly learns this mapping from the local appearance (image patch) and the global anatomical structure (silhouette) of the insect. Our main contribution is a data-driven approach to learning the geometric prior that models varied insect appearance. Combined with the discriminative power of the Random Forests (RF) model, our method achieves high landmark localization precision. The approach is evaluated on three challenging insect datasets, which we make publicly available. Experiments show that it improves over the traditional RF regression method and achieves precision comparable to that of human annotators.
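To make the idea of regressing a pose from local appearance more concrete, the following is a minimal sketch of patch-based regression voting, one common way regression forests are used for landmark localization. It is not the implementation from the paper: the abstract gives no algorithmic details, and the learned geometric prior (silhouette) is not modelled here. Everything in the block is an assumption made for illustration, including the synthetic `toy_feature` (a stand-in for real patch appearance), the small hand-written forest, and the choice of a per-coordinate median to aggregate votes.

```python
import random
import statistics

def mean_vec(Y):
    """Mean of a list of 2-D vectors."""
    n = len(Y)
    return (sum(y[0] for y in Y) / n, sum(y[1] for y in Y) / n)

def sse(Y):
    """Sum of squared deviations from the mean, over both coordinates."""
    m = mean_vec(Y)
    return sum((y[0] - m[0]) ** 2 + (y[1] - m[1]) ** 2 for y in Y)

def build_tree(X, Y, depth, rng):
    """Randomized regression tree mapping features X to 2-D offsets Y."""
    if depth == 0 or len(X) < 4:
        return ("leaf", mean_vec(Y))
    best = None
    for _ in range(8):  # try a few random (feature, threshold) candidates
        f = rng.randrange(len(X[0]))
        t = rng.choice(X)[f]
        left = [i for i, x in enumerate(X) if x[f] < t]
        right = [i for i, x in enumerate(X) if x[f] >= t]
        if not left:  # degenerate split, skip it
            continue
        cost = sse([Y[i] for i in left]) + sse([Y[i] for i in right])
        if best is None or cost < best[0]:
            best = (cost, f, t, left, right)
    if best is None:
        return ("leaf", mean_vec(Y))
    _, f, t, left, right = best
    return ("node", f, t,
            build_tree([X[i] for i in left], [Y[i] for i in left], depth - 1, rng),
            build_tree([X[i] for i in right], [Y[i] for i in right], depth - 1, rng))

def predict_tree(tree, x):
    """Walk from the root to a leaf and return the stored mean offset."""
    while tree[0] == "node":
        _, f, t, lo, hi = tree
        tree = lo if x[f] < t else hi
    return tree[1]

def train_forest(X, Y, n_trees, depth, rng):
    """Train each tree on a bootstrap resample of the training set."""
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        forest.append(build_tree([X[i] for i in idx], [Y[i] for i in idx], depth, rng))
    return forest

def predict_forest(forest, x):
    """Average the offsets predicted by the individual trees."""
    return mean_vec([predict_tree(t, x) for t in forest])

def toy_feature(offset, rng):
    """Hypothetical patch descriptor: a noisy encoding of the patch-to-landmark offset."""
    return (offset[0] + rng.gauss(0, 1), offset[1] + rng.gauss(0, 1))

rng = random.Random(0)

# Training set: patch features paired with the offset from patch to landmark.
offsets = [(rng.uniform(-20, 20), rng.uniform(-20, 20)) for _ in range(500)]
features = [toy_feature(d, rng) for d in offsets]
forest = train_forest(features, offsets, n_trees=10, depth=7, rng=rng)

# Test image: every sampled patch casts a vote for the landmark position.
landmark = (50.0, 40.0)  # ground truth, used only to synthesize the toy features
votes = []
for _ in range(60):
    patch = (landmark[0] + rng.uniform(-20, 20), landmark[1] + rng.uniform(-20, 20))
    true_offset = (landmark[0] - patch[0], landmark[1] - patch[1])
    dx, dy = predict_forest(forest, toy_feature(true_offset, rng))
    votes.append((patch[0] + dx, patch[1] + dy))

# Aggregate the votes robustly with a per-coordinate median.
estimate = (statistics.median(v[0] for v in votes),
            statistics.median(v[1] for v in votes))
print(estimate)
```

In a real pipeline the features would be computed from image patches, and the votes would typically be combined with some form of shape or geometric constraint rather than a plain median.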


Insect pose estimation · Landmark detection · Random forest



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. INCIDE Center, University of Konstanz, Konstanz, Germany
