Videolization: knowledge graph based automated video generation from web content

Multimedia Tools and Applications

Abstract

Web content can now also be accessed through a new generation of Internet-connected TVs. However, these products have failed to change users’ behavior when consuming online content: users still prefer personal computers for accessing Web content, and most online content is still designed for personal computers or mobile devices. To overcome this usability problem of Web content consumption on TVs, this paper presents a knowledge graph based video generation system that automatically converts textual Web content into videos using Semantic Web and computer graphics technologies. As a use case, Wikipedia articles are automatically converted into videos. The effectiveness of the proposed system is validated empirically via opinion surveys. Fifty percent of surveyed users indicated that they found the generated videos enjoyable, and 42% indicated that they would like to use our system to consume Web content on their TVs.

Author information

Corresponding author

Correspondence to Murat Kalender.

About this article

Cite this article

Kalender, M., Eren, M.T., Wu, Z. et al. Videolization: knowledge graph based automated video generation from web content. Multimed Tools Appl 77, 567–595 (2018). https://doi.org/10.1007/s11042-016-4275-4

  • DOI: https://doi.org/10.1007/s11042-016-4275-4
