
Convolutional-Block-Attention Dual Path Networks for Slide Transition Detection in Lecture Videos

  • Conference paper
Digital TV and Wireless Multimedia Communication (IFTC 2019)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1181)


Abstract

Slide transition detection identifies the frames at which the slide content changes; these frames form a summary of a lecture video and save viewers the time of watching the whole recording. 3D Convolutional Networks (3D ConvNets) have been regarded as an efficient approach to learning spatio-temporal features from video. However, a 3D ConvNet assigns the same weight to all features in an image and cannot focus on the key feature information. We address this problem with an attention mechanism, which highlights informative features while suppressing less useful ones. Furthermore, a 3D ConvNet usually requires long training time and large amounts of memory. The Dual Path Network (DPN) combines the ResNeXt and DenseNet structures and inherits the advantages of both: ResNeXt adds the input directly to the convolved output, which reuses features extracted by earlier layers, while DenseNet concatenates the output of each layer onto the input of the following layers, which encourages the extraction of new features. Building on these two networks, DPN not only saves training time and memory, but also extracts more effective features and improves training results. We therefore present a novel ConvNet architecture based on Convolutional Block Attention and DPN for slide transition detection in lecture videos. Experimental results show that the proposed architecture achieves better results than other slide transition detection approaches.
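To make the two architectural ideas in the abstract concrete, the following is a minimal PyTorch-style sketch, not the authors' implementation: a CBAM-style block that reweights features along the channel and spatial dimensions, and a toy dual-path block that keeps a ResNeXt-style residual (addition) path alongside a DenseNet-style dense (concatenation) path. The paper operates on spatio-temporal features of video clips with 3D convolutions; for brevity this sketch uses 2D convolutions, and all class names, channel counts, and the reduction ratio are illustrative placeholders.

```python
# Illustrative sketch only: CBAM-style attention + a dual-path block.
# Layer sizes and names are placeholders, not the paper's configuration.

import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    """Squeeze spatial dims with avg/max pooling, then score each channel."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x):                        # x: (N, C, H, W)
        avg = self.mlp(x.mean(dim=(2, 3)))       # (N, C)
        mx = self.mlp(x.amax(dim=(2, 3)))        # (N, C)
        scale = torch.sigmoid(avg + mx)[:, :, None, None]
        return x * scale


class SpatialAttention(nn.Module):
    """Score each spatial location from channel-wise avg and max maps."""
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)        # (N, 1, H, W)
        mx = x.amax(dim=1, keepdim=True)         # (N, 1, H, W)
        scale = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * scale


class DualPathBlock(nn.Module):
    """Toy dual-path block: part of the output is added to a residual path
    (ResNeXt-style), the rest is concatenated onto a growing dense path
    (DenseNet-style), with CBAM-style attention applied to the block output."""
    def __init__(self, res_channels, dense_in, growth):
        super().__init__()
        in_channels = res_channels + dense_in
        self.body = nn.Sequential(
            nn.Conv2d(in_channels, in_channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(in_channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_channels, res_channels + growth, 1, bias=False),
        )
        self.attn = nn.Sequential(ChannelAttention(res_channels + growth),
                                  SpatialAttention())
        self.res_channels = res_channels

    def forward(self, res, dense):
        out = self.attn(self.body(torch.cat([res, dense], dim=1)))
        res_out, dense_out = torch.split(
            out, [self.res_channels, out.size(1) - self.res_channels], dim=1)
        return res + res_out, torch.cat([dense, dense_out], dim=1)


if __name__ == "__main__":
    block = DualPathBlock(res_channels=64, dense_in=32, growth=16)
    res, dense = torch.randn(2, 64, 28, 28), torch.randn(2, 32, 28, 28)
    res, dense = block(res, dense)
    print(res.shape, dense.shape)   # (2, 64, 28, 28), (2, 48, 28, 28)
```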



Acknowledgment

This work was supported by the National Natural Science Foundation of China (No. 61601278) and by the "Chen Guang" Project supported by the Shanghai Municipal Education Commission and the Shanghai Education Development Foundation (No. 17CG41).

Author information


Corresponding author

Correspondence to Ran Ma.



Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Guan, M., Li, K., Ma, R., An, P. (2020). Convolutional-Block-Attention Dual Path Networks for Slide Transition Detection in Lecture Videos. In: Zhai, G., Zhou, J., Yang, H., An, P., Yang, X. (eds) Digital TV and Wireless Multimedia Communication. IFTC 2019. Communications in Computer and Information Science, vol 1181. Springer, Singapore. https://doi.org/10.1007/978-981-15-3341-9_9

  • DOI: https://doi.org/10.1007/978-981-15-3341-9_9

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-15-3340-2

  • Online ISBN: 978-981-15-3341-9

  • eBook Packages: Computer Science, Computer Science (R0)
