Optimization Problems Associated with Manifold-Valued Curves with Applications in Computer Vision

  • Rushil Anirudh
  • Pavan Turaga
  • Anuj Srivastava


Many computer vision applications need to represent, compare, and manipulate manifold-valued curves while remaining flexible enough to operate in resource-constrained environments. We address these concerns in this chapter by proposing a dictionary learning scheme that takes geometry and time into account while performing better than the original data in applications such as activity recognition. The scheme relies on the transport square-root velocity function, which provides an elastic representation for trajectories on Riemannian manifolds. Since these operations can be computationally very expensive, we also present a geometry-based symbolic approximation framework, which makes low-bandwidth transmission and accurate real-time analysis for recognition or searching through sequential data fairly straightforward. We discuss the different optimization problems encountered in this context: learning a sparse representation for actions using extrinsic and intrinsic features, solving the registration problem between two Riemannian trajectories, and learning an optimal clustering scheme for symbolic approximation.
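As a rough illustration of the square-root velocity idea mentioned above, the following sketch computes the representation for a curve in flat Euclidean space, where each velocity vector v is rescaled to v / sqrt(|v|). This is an assumption-laden simplification, not the chapter's implementation: the transport version on a Riemannian manifold would additionally parallel-transport each rescaled velocity to the tangent space at a fixed reference point, a step that is trivial in the flat case and is omitted here. The names `srvf` and `l2_distance`, the forward-difference velocity, and the uniform sampling step `dt` are all hypothetical choices made for this example.

```python
import math


def srvf(curve, dt):
    """Discrete square-root velocity function of a Euclidean curve.

    curve: list of points (each a list of floats), sampled every dt.
    Returns one vector per segment, v / sqrt(|v|), where v is the
    forward-difference velocity. A zero velocity maps to the zero vector.
    """
    q = []
    for p0, p1 in zip(curve, curve[1:]):
        v = [(b - a) / dt for a, b in zip(p0, p1)]
        speed = math.sqrt(sum(c * c for c in v))
        # Dividing by 1.0 leaves a zero velocity as the zero vector.
        scale = math.sqrt(speed) if speed > 0.0 else 1.0
        q.append([c / scale for c in v])
    return q


def l2_distance(q1, q2, dt):
    """L2 distance between two SRVFs sampled on the same time grid."""
    total = sum((a - b) ** 2 for u, w in zip(q1, q2) for a, b in zip(u, w))
    return math.sqrt(total * dt)
```

Under this representation the squared norm of each output vector equals the speed of the curve on that segment, and curves are compared with the plain L2 distance between their representations. The sketch does not perform the registration (time-warping) step the abstract refers to.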



The research reported in this chapter was supported by National Science Foundation Grants 1320267 and 1319658.



Copyright information

© Springer International Publishing AG 2018

Authors and Affiliations

  1. Lawrence Livermore National Laboratory, University Park, USA
  2. Arizona State University, Tempe, USA
  3. Florida State University, Tallahassee, USA
