Education and Information Technologies, Volume 24, Issue 6, pp 3243–3268

A semi-automatic metadata extraction model and method for video-based e-learning contents

  • Saurabh Pal
  • Pijush Kanti Dutta Pramanik
  • Tripti Majumdar
  • Prasenjit Choudhury


Video-based learning offers learners a self-paced, lucid, memorable, and flexible way of learning. The abundance of educational video material on the web has certainly expanded an individual’s means of learning. However, the lack of necessary information about the videos makes it difficult for a learner to search for and select the right video in terms of the learner’s learning capability and the material’s relevance, difficulty level, etc. Educational video recommendation systems suffer from a similar problem. Extracting the required metadata from the learning videos, by different means, is a plausible solution. Despite credible research efforts on video metadata extraction, the problem of educational video metadata extraction has been overlooked. This paper proposes a comprehensive approach to extracting educational metadata from a learning video. A semi-automatic mechanism that combines manual and computational approaches is introduced for extracting the metadata and evaluating their values. Along with identifying a set of specific metadata attributes from IEEE LOM, a few additional attributes are suggested that are imperative for assessing the suitability of a video-based learning object with respect to a learner’s personalized preferences. The test results are validated by comparing them with metadata manually extracted by experts from the same videos. The outcome establishes the promising effectiveness of the approach.


Video metadata extraction; Video-based learning; Metadata; IEEE LOM; Speech-to-text conversion; Educational recommendation system




Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. Department of Computer Science & Engineering, Bengal Institute of Technology, Kolkata, India
  2. Department of Computer Science & Engineering, National Institute of Technology, Durgapur, India
