Visual Summarization of Scholarly Videos Using Word Embeddings and Keyphrase Extraction

Part of the Lecture Notes in Computer Science book series (LNISA, volume 11799)

Abstract

Effective learning with audiovisual content depends on many factors. Besides the quality of the learning resource’s content, it is essential to discover the most relevant and suitable video in order to support the learning process effectively. Video summarization techniques facilitate this goal by providing a quick overview of the content; this is especially useful for longer recordings such as conference presentations or lectures. In this paper, we present a domain-specific approach that generates a visual summary of video content using solely textual information. For this purpose, we exploit video annotations that are automatically generated by speech recognition and video OCR (optical character recognition). The textual information is represented by semantic word embeddings and extracted keyphrases. We demonstrate the feasibility of the proposed approach through its incorporation into the TIB AV-Portal (http://av.tib.eu/), a platform for scientific videos. The accuracy and usefulness of the generated video content visualizations are evaluated in a user study.
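
The abstract only outlines the pipeline at a high level. As a rough, hypothetical illustration (not the authors' implementation), the following Python sketch shows how such a text-only summary could be assembled from a merged speech-recognition/OCR transcript: candidate keyphrases are ranked, mapped to averaged word embeddings, and near-duplicate phrases are merged before visualization. The frequency-based ranking, the file names, and the similarity threshold are assumptions made purely for illustration.

```python
"""Minimal, illustrative sketch of a text-only video summarization pipeline
(not the authors' implementation)."""

import re
from collections import Counter

import numpy as np


def keyphrase_candidates(transcript: str, top_n: int = 15) -> list[str]:
    """Rank candidate terms by simple frequency; a real system would use a
    dedicated keyphrase extractor instead of this toy scoring."""
    stop = {"the", "a", "an", "of", "and", "to", "in", "is", "are", "we", "that", "it"}
    tokens = [t for t in re.findall(r"[a-z][a-z-]+", transcript.lower()) if t not in stop]
    counts = Counter(tokens)
    # also count adjacent word pairs as bigram candidates
    counts.update(" ".join(pair) for pair in zip(tokens, tokens[1:]))
    return [phrase for phrase, _ in counts.most_common(top_n)]


def load_vectors(path: str) -> dict[str, np.ndarray]:
    """Read word embeddings stored in the plain-text GloVe/word2vec format."""
    vectors = {}
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            word, *values = line.rstrip().split(" ")
            vectors[word] = np.asarray(values, dtype=np.float32)
    return vectors


def embed(phrase: str, vectors: dict[str, np.ndarray]):
    """Represent a phrase by the mean of its word vectors (None if unknown)."""
    vecs = [vectors[w] for w in phrase.split() if w in vectors]
    return np.mean(vecs, axis=0) if vecs else None


def merge_similar(phrases: list[str], vectors: dict[str, np.ndarray],
                  threshold: float = 0.8) -> list[str]:
    """Greedily drop phrases whose embedding is very close to an already kept
    phrase, so that each concept appears only once in the visual summary."""
    kept = []  # list of (phrase, unit-length vector) pairs
    for phrase in phrases:
        vec = embed(phrase, vectors)
        if vec is None:
            continue
        unit = vec / np.linalg.norm(vec)
        if all(float(unit @ other) < threshold for _, other in kept):
            kept.append((phrase, unit))
    return [phrase for phrase, _ in kept]


if __name__ == "__main__":
    # Hypothetical inputs: a merged ASR/OCR transcript and pre-trained vectors.
    transcript = open("talk_asr_ocr.txt", encoding="utf-8").read()
    vectors = load_vectors("glove.6B.100d.txt")
    print(merge_similar(keyphrase_candidates(transcript), vectors))
```

A full system would substitute a proper keyphrase extractor and pre-trained semantic embeddings for these toy components; the sketch only mirrors the overall flow described in the abstract.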

Keywords

  • Video summarization
  • Word embeddings
  • Scientific videos

Notes

  1. https://nlp.stanford.edu/software/tagger.shtml.

  2. https://plot.ly/api/ (see the sketch below for an illustrative use).
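
Footnote 2 points to the Plotly API, which could be used to render such a text-based overview. Purely as a hypothetical example (the keyphrases, segment boundaries, and counts below are invented sample data, not output of the described system), a stacked-area chart of keyphrase occurrences per video segment could be produced like this:

```python
# Illustrative only: render per-segment keyphrase counts as a stacked area
# chart with Plotly. All data values below are invented for demonstration.
import plotly.graph_objects as go

segment_starts = [0, 120, 240, 360, 480]  # segment start times in seconds
keyphrase_counts = {                      # hypothetical occurrences per segment
    "word embeddings": [2, 5, 3, 1, 0],
    "speech recognition": [4, 1, 0, 2, 1],
    "keyphrase extraction": [0, 2, 4, 3, 2],
}

fig = go.Figure()
for label, counts in keyphrase_counts.items():
    fig.add_trace(go.Scatter(x=segment_starts, y=counts, name=label,
                             mode="lines", stackgroup="one"))
fig.update_layout(title="Keyphrase frequency per video segment",
                  xaxis_title="Segment start (s)", yaxis_title="Occurrences")
fig.write_html("video_summary.html")  # hypothetical output file
```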

Acknowledgments

Part of this work is financially supported by the Leibniz Association, Germany (Leibniz Competition 2018, funding line “Collaborative Excellence”, project SALIENT [K68/2017]).

Author information

Corresponding author

Correspondence to Christian Otto.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Zhou, H., Otto, C., Ewerth, R. (2019). Visual Summarization of Scholarly Videos Using Word Embeddings and Keyphrase Extraction. In: Doucet, A., Isaac, A., Golub, K., Aalberg, T., Jatowt, A. (eds) Digital Libraries for Open Knowledge. TPDL 2019. Lecture Notes in Computer Science, vol. 11799. Springer, Cham. https://doi.org/10.1007/978-3-030-30760-8_28

  • DOI: https://doi.org/10.1007/978-3-030-30760-8_28

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-30759-2

  • Online ISBN: 978-3-030-30760-8

  • eBook Packages: Computer Science, Computer Science (R0)