Abstract
This paper presents SAPTE, a novel multimedia information system that supports discourse analysis and information retrieval of television programs from their video recordings. Unlike most comparable systems, SAPTE uses both content-independent and content-dependent metadata, obtained by applying discourse analysis techniques together with image and audio analysis methods. The system was developed in partnership with the free-to-air Brazilian TV channel Rede Minas to provide TV researchers with computational tools for their studies of this media universe. It is built on the Matterhorn framework for managing video libraries and combines: (1) discourse analysis techniques for describing and indexing the videos, which consider aspects such as the definition of the subject of analysis, the nature of the speaker, and the corpus of data resulting from the discourse; (2) Julius, a state-of-the-art decoder for large vocabulary continuous speech recognition; (3) image and frequency domain techniques that compute visual signatures for the video recordings, containing color, shape, and texture information; and (4) hashing and k-d tree methods for data indexing. Experimental results validate these capabilities and indicate that SAPTE is a promising computational tool for TV researchers.
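As an illustration of the k-d tree indexing mentioned in item (4), the following minimal sketch shows how visual signature vectors could be indexed and queried for the nearest match. It is not SAPTE's implementation: the three-component toy signatures, the video labels, and the function names (`build_kdtree`, `nearest`, `sq_dist`) are all assumptions made for this example, and squared Euclidean distance stands in for whatever similarity measure the system actually uses.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

Vector = Sequence[float]


@dataclass
class Node:
    point: Vector            # signature vector stored at this node
    label: str               # hypothetical video identifier
    axis: int                # dimension used to split at this node
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build_kdtree(items: List[Tuple[str, Vector]], depth: int = 0) -> Optional[Node]:
    """Build a k-d tree from (label, vector) pairs by median splitting."""
    if not items:
        return None
    axis = depth % len(items[0][1])          # cycle through the dimensions
    items = sorted(items, key=lambda it: it[1][axis])
    mid = len(items) // 2
    label, point = items[mid]
    return Node(point, label, axis,
                build_kdtree(items[:mid], depth + 1),
                build_kdtree(items[mid + 1:], depth + 1))


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(node: Optional[Node], query: Vector, best=None):
    """Return (squared_distance, label) of the stored vector closest to query."""
    if node is None:
        return best
    d = sq_dist(node.point, query)
    if best is None or d < best[0]:
        best = (d, node.label)
    diff = query[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = nearest(near, query, best)
    # Visit the far side only if the splitting plane is closer than the best match.
    if diff ** 2 < best[0]:
        best = nearest(far, query, best)
    return best


# Toy signatures (assumed layout: one color, one shape, one texture component).
signatures = [
    ("news_01", (0.90, 0.10, 0.30)),
    ("talkshow_02", (0.20, 0.80, 0.50)),
    ("sports_03", (0.40, 0.40, 0.90)),
    ("news_04", (0.85, 0.20, 0.35)),
]
tree = build_kdtree(signatures)
print(nearest(tree, (0.88, 0.12, 0.30)))
```

The pruning test is what makes the tree worthwhile: a subtree is skipped whenever the splitting plane lies farther from the query than the best match found so far, so a typical query inspects only a fraction of the stored signatures.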
Notes
Available at: http://lear.inrialpes.fr/people/jegou/data.php
Available at: ftp://db.stanford.edu/pub/wangz/image.vary.jpg.tar
Acknowledgments
The authors gratefully acknowledge the financial support of FAPEMIG-Brazil under Procs. APQ-01180-10 and APQ-02269-11; CEFET-MG under Procs. PROPESQ-088/12 and PROPESQ-076/09; CAPES-Brazil and CNPq-Brazil.
Cite this article
Pereira, M.H.R., de Souza, C.L., Pádua, F.L.C. et al. SAPTE: A multimedia information system to support the discourse analysis and information retrieval of television programs. Multimed Tools Appl 74, 10923–10963 (2015). https://doi.org/10.1007/s11042-014-2311-9
DOI: https://doi.org/10.1007/s11042-014-2311-9