Organizing Video Streams for Clustering and Estimation of Popular Scenes

  • Sebastiano Battiato
  • Giovanni M. Farinella
  • Filippo L. M. Milotta
  • Alessandro Ortis
  • Filippo Stanco
  • Valeria D’Amico
  • Luca Addesso
  • Giovanni Torrisi
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10484)

Abstract

The huge diffusion of mobile devices with embedded cameras has opened new challenges in the automatic understanding of video streams acquired by multiple users during events such as sport matches, expos, and concerts. Among these challenges is the identification of the visual contents that are most relevant and popular (i.e., where users look). The popularity of a visual content is an important cue exploitable in several fields, including the estimation of the mood of the crowd attending an event and the estimation of the interest in parts of a cultural heritage site. During live social events, people capture and share videos related to the event. The popularity of a visual content can be obtained through the “visual consensus” among the multiple video streams acquired by the different users’ devices. In this paper we address the problem of detecting and summarizing the “popular scenes” captured by users with mobile cameras during events. For this purpose, we have developed a framework called RECfusion, in which the key popular scenes of multiple streams are identified over time. The proposed system generates a video that captures the interests of the crowd, starting from a set of input videos and considering scene content popularity. The frames composing the final popular video are automatically selected from the different video streams by considering the scene recorded by the highest number of users’ devices (i.e., the most popular scene).
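
To make the “visual consensus” idea concrete, the sketch below shows one plausible reading of the popularity mechanism: at each sampled instant, one frame per active stream is described by a colour histogram, the frames are greedily clustered by histogram intersection, and a frame from the largest cluster is taken as the most popular scene. This is a minimal sketch in Python, not the authors’ RECfusion implementation; the function names, bin count, and threshold are illustrative assumptions.

    import numpy as np

    def colour_histogram(frame, bins=8):
        # 3D RGB histogram, L1-normalised; `frame` is an HxWx3 uint8 array.
        hist, _ = np.histogramdd(frame.reshape(-1, 3),
                                 bins=(bins, bins, bins),
                                 range=((0, 256), (0, 256), (0, 256)))
        return hist.ravel() / hist.sum()

    def histogram_intersection(h1, h2):
        # Similarity in [0, 1]; 1.0 means identical normalised histograms.
        return float(np.minimum(h1, h2).sum())

    def most_popular_frame(frames, threshold=0.7):
        # Greedy clustering of one frame per stream at a given instant:
        # a frame joins the first cluster whose representative it matches
        # (histogram intersection >= threshold), otherwise it starts a new
        # cluster. Returns the index of a frame from the largest cluster
        # and the cluster size (the "popularity" of that scene).
        hists = [colour_histogram(f) for f in frames]
        clusters = []  # each cluster is a list of indices into `frames`
        for i, h in enumerate(hists):
            for cluster in clusters:
                if histogram_intersection(h, hists[cluster[0]]) >= threshold:
                    cluster.append(i)
                    break
            else:
                clusters.append([i])
        largest = max(clusters, key=len)
        return largest[0], len(largest)

Running this per sampled instant and concatenating the selected frames approximates the popular video described above; as the abstract notes, the actual framework identifies the key popular scenes over time rather than deciding each instant independently.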

Keywords

Video analysis · Clustering · Social cameras · Scene understanding

Notes

Acknowledgments

This work was performed in collaboration with Telecom Italia JOL WAVE within the project FIR2014-UNICT-DFA17D.

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Sebastiano Battiato (1)
  • Giovanni M. Farinella (1)
  • Filippo L. M. Milotta (1, 2), corresponding author
  • Alessandro Ortis (1, 2)
  • Filippo Stanco (1)
  • Valeria D’Amico (2)
  • Luca Addesso (2)
  • Giovanni Torrisi (2)

  1. Department of Mathematics and Computer Science, University of Catania, Catania, Italy
  2. JOL WAVE, Telecom Italia, Catania, Italy
