
Content-based video retrieval in historical collections of the German Broadcasting Archive


Abstract

The German Broadcasting Archive maintains the cultural heritage of radio and television broadcasts of the former German Democratic Republic (GDR). The uniqueness and importance of this video material attract considerable scientific interest in its content. In this paper, we present a system for automatic video content analysis and retrieval that facilitates search in historical collections of GDR television recordings. It relies on a distributed, service-oriented architecture and includes video analysis algorithms for shot boundary detection, concept classification, person recognition, text recognition, and similarity search. The combination of these search modalities allows users to obtain answers to a wide range of queries and yields satisfactory results in a short time. The performance of the system is evaluated on 2500 h of GDR television recordings.
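To illustrate the similarity-search modality mentioned above, the following is a minimal, hypothetical sketch of query-by-example retrieval over per-keyframe feature vectors. The function names (build_index, query), the 128-dimensional placeholder features, and the brute-force cosine search in NumPy are assumptions made for illustration only; they do not reflect the authors' actual implementation or architecture.

```python
# Minimal sketch (not the authors' implementation): query-by-example
# similarity search over keyframe feature vectors. Feature extraction
# (e.g. from a CNN) is assumed to happen elsewhere; random vectors
# stand in as placeholders here.
import numpy as np


def build_index(features: np.ndarray) -> np.ndarray:
    """L2-normalize keyframe feature vectors so a dot product equals cosine similarity."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    return features / np.clip(norms, 1e-12, None)


def query(index: np.ndarray, query_vec: np.ndarray, top_k: int = 10):
    """Return indices and scores of the top_k keyframes most similar to query_vec."""
    q = query_vec / max(np.linalg.norm(query_vec), 1e-12)
    scores = index @ q
    order = np.argsort(-scores)[:top_k]
    return order, scores[order]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    keyframe_features = rng.normal(size=(1000, 128))    # placeholder features
    index = build_index(keyframe_features)
    hits, scores = query(index, keyframe_features[42])  # query by example
    print(hits[:5], scores[:5])
```

In a production setting, the exhaustive dot product would typically be replaced by an approximate nearest-neighbor index or binary hash codes to keep response times short on archive-scale collections.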




Acknowledgements

This work is financially supported by the German Research Foundation (DFG; Funding Programme: “Förderung herausragender Forschungsbibliotheken”; Project: “Bild- und Szenenrecherche in historischen Beständen des DDR-Fernsehens im Deutschen Rundfunkarchiv durch automatische inhaltsbasierte Videoanalyse”; CR 456/1-1, EW 134/1-1, FR 791/12-1).

Author information

Correspondence to Markus Mühling.


About this article

Cite this article

Mühling, M., Meister, M., Korfhage, N. et al. Content-based video retrieval in historical collections of the German Broadcasting Archive. Int J Digit Libr 20, 167–183 (2019). https://doi.org/10.1007/s00799-018-0236-z

