Journal of Real-Time Image Processing

Volume 11, Issue 4, pp 731–749

Real time motion estimation using a neural architecture implemented on GPUs

  • Jose Garcia-Rodriguez
  • Sergio Orts-Escolano
  • Anastassia Angelopoulou
  • Alexandra Psarrou
  • Jorge Azorin-Lopez
  • Juan Manuel Garcia-Chamizo
Special Issue Paper


Abstract

This work describes a neural-network-based architecture that represents and estimates object motion in videos. The architecture addresses multiple computer vision tasks, such as image segmentation, object representation and characterization, motion analysis, and tracking. Using a neural network allows the simultaneous estimation of global and local motion and the representation of deformable objects, and it avoids the problem of finding corresponding features while tracking moving objects. Owing to the parallel nature of neural networks, the architecture has been implemented on GPUs, which allows the system to meet a set of requirements: time-constraint management, robustness, high processing speed, and re-configurability. Experiments demonstrate the validity of our architecture for mobile agent tracking and motion analysis.
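The topology-preserving network underlying architectures of this kind typically adapts a graph of reference vectors towards each input sample. As a rough illustration only (the function name, parameter values `eps_b`/`eps_n`, and data layout below are our own assumptions, not the paper's implementation), a single growing-neural-gas-style adaptation step can be sketched as:

```python
import numpy as np

def adapt(nodes, edges, x, eps_b=0.2, eps_n=0.006):
    """One GNG-style adaptation step: move the best-matching node
    and its topological neighbours towards the input sample x.

    nodes: (N, d) array of reference vectors, modified in place.
    edges: list of (i, j) index pairs describing the graph topology.
    Returns the index of the winning node.
    """
    # Find the best-matching unit (node closest to the input).
    d = np.linalg.norm(nodes - x, axis=1)
    winner = int(np.argmin(d))
    # Move the winner a large step towards the input.
    nodes[winner] += eps_b * (x - nodes[winner])
    # Move the winner's graph neighbours a smaller step.
    for a, b in edges:
        if a == winner:
            nodes[b] += eps_n * (x - nodes[b])
        elif b == winner:
            nodes[a] += eps_n * (x - nodes[a])
    return winner
```

In a GPU implementation, the distance computation and winner search are the naturally parallel parts: each thread evaluates the distance from one node to the input, followed by a parallel reduction to find the minimum.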


Keywords: Motion estimation · Neural architectures · Topology preservation · Real time · GPGPU



This work was partially funded by Spanish Government grant DPI2013-40534-R and Valencian Government grant GV/2013/005. Experiments were made possible by a generous hardware donation from NVIDIA.



Copyright information

© Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  • Jose Garcia-Rodriguez (1)
  • Sergio Orts-Escolano (1)
  • Anastassia Angelopoulou (2)
  • Alexandra Psarrou (2)
  • Jorge Azorin-Lopez (1)
  • Juan Manuel Garcia-Chamizo (1)
  1. Department of Computing Technology, University of Alicante, Alicante, Spain
  2. Faculty of Science and Technology, University of Westminster, Cavendish, UK
