Abstract
Web content can now also be accessed through a new generation of Internet-connected TVs. However, these products have failed to change users’ behavior when consuming online content: users still prefer personal computers for accessing Web content. Indeed, most online content is still designed to be accessed from personal computers or mobile devices. To overcome this usability problem of Web content consumption on TVs, this paper presents a knowledge-graph-based video generation system that automatically converts textual Web content into videos using Semantic Web and computer graphics technologies. As a use case, Wikipedia articles are automatically converted into videos. The effectiveness of the proposed system is validated empirically via opinion surveys: fifty percent of the surveyed users indicated that they found the generated videos enjoyable, and 42% indicated that they would like to use our system to consume Web content on their TVs.
Cite this article
Kalender, M., Eren, M.T., Wu, Z. et al. Videolization: knowledge graph based automated video generation from web content. Multimed Tools Appl 77, 567–595 (2018). https://doi.org/10.1007/s11042-016-4275-4
DOI: https://doi.org/10.1007/s11042-016-4275-4