SAPTE: A multimedia information system to support the discourse analysis and information retrieval of television programs

Multimedia Tools and Applications

Abstract

This paper presents SAPTE, a novel multimedia information system that supports discourse analysis and information retrieval of television programs from their video recordings. Unlike most existing systems, SAPTE uses both content-independent and content-dependent metadata, obtained by applying discourse analysis techniques together with image and audio analysis methods. The system was developed in partnership with the free-to-air Brazilian TV channel Rede Minas to provide TV researchers with computational tools for their studies of this media universe. It is built on the Matterhorn framework for managing video libraries and combines: (1) discourse analysis techniques for describing and indexing the videos, which consider aspects such as the definition of the subject of analysis, the nature of the speaker, and the corpus of data resulting from the discourse; (2) Julius, a state-of-the-art decoder for large-vocabulary continuous speech recognition; (3) image and frequency-domain techniques for computing visual signatures of the video recordings, which contain color, shape, and texture information; and (4) hashing and k-d tree methods for data indexing. Experimental results validate the capabilities of SAPTE and indicate that it is a promising computational tool for TV researchers.
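
As an illustration of items (3) and (4) above, the following minimal Python sketch computes a coarse color signature per frame and indexes the signatures in a k-d tree for nearest-neighbor lookup. It is not SAPTE's implementation: the abstract does not specify the signature layout or the tree parameters, so the function names (`color_signature`, `build`, `nearest`), the 2-bins-per-channel histogram, the squared Euclidean distance, and the toy frames are all assumptions made for this example. Shape and texture features are omitted.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

Vector = Sequence[float]


def color_signature(pixels: Sequence[Tuple[int, int, int]], bins: int = 2) -> List[float]:
    """Normalized RGB histogram with `bins` levels per channel (assumed layout)."""
    hist = [0.0] * (bins ** 3)
    for r, g, b in pixels:
        idx = (r * bins // 256) * bins * bins + (g * bins // 256) * bins + (b * bins // 256)
        hist[idx] += 1.0
    total = len(pixels) or 1
    return [h / total for h in hist]


@dataclass
class Node:
    point: Vector
    label: str
    axis: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(items: List[Tuple[Vector, str]], depth: int = 0) -> Optional[Node]:
    """Build a k-d tree by splitting on the median along a cycling axis."""
    if not items:
        return None
    axis = depth % len(items[0][0])
    items = sorted(items, key=lambda it: it[0][axis])
    mid = len(items) // 2
    point, label = items[mid]
    return Node(point, label, axis,
                build(items[:mid], depth + 1),
                build(items[mid + 1:], depth + 1))


def sq_dist(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(node: Optional[Node], query: Vector,
            best: Optional[Tuple[float, str]] = None) -> Optional[Tuple[float, str]]:
    """Return (squared distance, label) of the stored point closest to `query`."""
    if node is None:
        return best
    d = sq_dist(node.point, query)
    if best is None or d < best[0]:
        best = (d, node.label)
    diff = query[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = nearest(near, query, best)
    # Visit the far side only if the splitting plane is closer than the best match.
    if diff * diff < best[0]:
        best = nearest(far, query, best)
    return best


# Toy "frames" (hypothetical data): each is a short list of RGB pixels.
frames = {
    "red_frame": [(250, 10, 10)] * 4,
    "green_frame": [(10, 250, 10)] * 4,
    "blue_frame": [(10, 10, 250)] * 4,
}
index = build([(color_signature(px), name) for name, px in frames.items()])

# A mostly red query frame should match the red frame.
query = color_signature([(240, 20, 20)] * 3 + [(10, 250, 10)])
match = nearest(index, query)
print(match)
```

In a real library the signatures would have far more dimensions, where plain k-d trees lose much of their pruning advantage; that may be one reason to pair them with hashing, as the abstract does, though the paper's actual rationale is not stated here.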

Notes

  1. Available at: http://lear.inrialpes.fr/people/jegou/data.php

  2. Available at: ftp://db.stanford.edu/pub/wangz/image.vary.jpg.tar

Acknowledgments

The authors gratefully acknowledge the financial support of FAPEMIG-Brazil under Procs. APQ-01180-10 and APQ-02269-11; CEFET-MG under Procs. PROPESQ-088/12 and PROPESQ-076/09; CAPES-Brazil and CNPq-Brazil.

Corresponding author

Correspondence to Moisés H. R. Pereira.

Cite this article

Pereira, M.H.R., de Souza, C.L., Pádua, F.L.C. et al. SAPTE: A multimedia information system to support the discourse analysis and information retrieval of television programs. Multimed Tools Appl 74, 10923–10963 (2015). https://doi.org/10.1007/s11042-014-2311-9

  • DOI: https://doi.org/10.1007/s11042-014-2311-9
