Practice-Oriented Evaluation of Unsupervised Labeling of Audiovisual Content in an Archive Production Environment

  • Victor de Boer
  • Roeland J. F. Ordelman
  • Josefien Schuurman
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9316)


In this paper we report on an evaluation of unsupervised labeling of audiovisual content using collateral text data sources, investigating how such an approach can provide acceptable results given requirements with respect to archival quality, authority and service levels to external users. We conclude that, with parameter settings optimized through a rigorous evaluation of precision and accuracy, the quality of automatic term suggestion is sufficiently high. Having implemented the procedure in our production workflow allows us to gradually develop the system further and to assess the effect of the transition from manual to automatic labeling from an end-user perspective. Future work will focus on deploying additional information sources, including annotations based on multimodal video analysis such as speaker recognition and computer vision.


Audiovisual access · Information extraction · Thesaurus · Audiovisual archives · Practice-oriented evaluation



This research was funded by the MediaManagement Programme at the Netherlands Institute for Sound and Vision and the Dutch National Research Programme COMMIT/, and supported by the NWO CATCH program and the Dutch Ministry of Culture.



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Victor de Boer (1, 2)
  • Roeland J. F. Ordelman (1, 3)
  • Josefien Schuurman (1)
  1. Netherlands Institute for Sound and Vision, Hilversum, The Netherlands
  2. The Network Institute, VU University Amsterdam, Amsterdam, The Netherlands
  3. University of Twente, Enschede, The Netherlands
