Video Summarization Using a Self-Growing and Self-Organized Neural Gas Network

  • Dim P. Papadopoulos
  • Savvas A. Chatzichristofis
  • Nikos Papamarkos
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6930)


In this paper, a novel method for generating video summaries is proposed, intended mainly for on-line videos. The novelty of the approach lies in treating video summarization as a single-query image retrieval problem. Each frame is treated as a separate image and is described by the recently proposed Compact Composite Descriptors (CCDs) and a visual word histogram. To classify the frames into clusters, the method uses a Self-Growing and Self-Organized Neural Gas (SGONG) network, whose main advantage is that it adjusts the number of created neurons and their topology automatically. Thus, after training, the SGONG network gives the appropriate number of output classes and their centers. Extracting a representative key frame from every cluster yields the video abstract. A significant characteristic of the proposed method is its ability to calculate the appropriate number of clusters dynamically. Experimental results are presented to indicate the effectiveness of the proposed approach.
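The pipeline in the abstract (describe each frame, cluster the descriptors, take one representative frame per cluster) can be illustrated with a minimal sketch. It does not implement SGONG or CCDs: the descriptors are assumed to be plain numeric vectors, and a simple threshold-based incremental clustering stands in for the neural gas network only because it also determines the number of clusters on its own. The names `cluster_frames`, `key_frames`, and `threshold` are hypothetical and do not come from the paper.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cluster_frames(descriptors, threshold):
    """Threshold-based incremental clustering (a stand-in, NOT SGONG).

    A frame joins the nearest existing cluster if it lies within
    `threshold` of that cluster's center; otherwise it starts a new
    cluster. The number of clusters is therefore not fixed in advance.
    """
    centers = []  # running mean descriptor of each cluster
    members = []  # frame indices assigned to each cluster
    for idx, d in enumerate(descriptors):
        if centers:
            best = min(range(len(centers)),
                       key=lambda k: euclidean(d, centers[k]))
            if euclidean(d, centers[best]) <= threshold:
                members[best].append(idx)
                n = len(members[best])
                # Incremental update of the cluster mean.
                centers[best] = [c + (x - c) / n
                                 for c, x in zip(centers[best], d)]
                continue
        centers.append(list(d))
        members.append([idx])
    return centers, members


def key_frames(descriptors, centers, members):
    """Pick, for each cluster, the frame closest to the cluster center."""
    keys = [min(m, key=lambda i: euclidean(descriptors[i], c))
            for c, m in zip(centers, members)]
    return sorted(keys)
```

Calling `cluster_frames` on a list of per-frame vectors and passing the result to `key_frames` returns the indices of the frames that would form the summary. The `threshold` parameter is an assumption of this sketch; in the proposed method, the SGONG network adjusts the number of neurons and their topology automatically instead.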


Keywords (added by machine, not by the authors): Image Retrieval; Visual Word; Hand Gesture Recognition; Video Summary; Video Abstraction





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Dim P. Papadopoulos (1)
  • Savvas A. Chatzichristofis (1)
  • Nikos Papamarkos (1)
  1. Department of Electrical and Computer Engineering, Democritus University of Thrace, Xanthi, Greece
