Abstract
The aim of this paper is to categorize movies into genres using their previews. Our study combines audio, visual, and text features to classify a collection of movie previews into four genres: action, biography, comedy, and horror. For each collected preview, the audio and visual features are extracted, and the text features are drawn from social tags on social websites. Probabilistic latent semantic analysis (PLSA) is used to combine the features from these three sources of information. Because standard PLSA processes only one type of information, we extend it to double-model and triple-model PLSAs, which combine two or three different types of information. We compare these PLSA variants with unimodal PLSAs, which use audio, visual, or text features alone. The experimental results show not only that one of the triple-model PLSAs achieves the highest accuracy, but also that social tags (text features) play an important role in classifying movie genres.
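For orientation, the standard single-modality PLSA that the abstract takes as its starting point can be sketched as below. This is a minimal illustration, not the authors' implementation: the function name `plsa`, its parameters, and the toy count matrix are assumptions, and the double-model and triple-model extensions are not shown because the abstract does not specify them. The sketch fits P(z|d) and P(w|z) with EM on a document-by-feature count matrix, where a "document" would be one preview and a "word" one quantized audio, visual, or tag feature.

```python
import random


def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit standard PLSA to a document-by-word count matrix with EM.

    counts[d][w] is the number of times word w occurs in document d.
    Returns (p_z_d, p_w_z): P(z|d) for each document, P(w|z) for each topic.
    All names here are illustrative assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])

    def normalize(row):
        total = sum(row)
        if total <= 0:
            return [1.0 / len(row)] * len(row)
        return [x / total for x in row]

    # Random positive initialization of both distributions.
    p_z_d = [normalize([rng.random() + 1e-3 for _ in range(n_topics)])
             for _ in range(n_docs)]
    p_w_z = [normalize([rng.random() + 1e-3 for _ in range(n_words)])
             for _ in range(n_topics)]

    for _ in range(n_iter):
        new_z_d = [[0.0] * n_topics for _ in range(n_docs)]
        new_w_z = [[0.0] * n_words for _ in range(n_topics)]
        for d in range(n_docs):
            for w in range(n_words):
                n = counts[d][w]
                if n == 0:
                    continue
                # E-step: P(z|d,w) is proportional to P(z|d) * P(w|z).
                post = normalize([p_z_d[d][z] * p_w_z[z][w]
                                  for z in range(n_topics)])
                # Accumulate expected counts for the M-step.
                for z in range(n_topics):
                    new_z_d[d][z] += n * post[z]
                    new_w_z[z][w] += n * post[z]
        # M-step: renormalize the expected counts into distributions.
        p_z_d = [normalize(row) for row in new_z_d]
        p_w_z = [normalize(row) for row in new_w_z]
    return p_z_d, p_w_z


if __name__ == "__main__":
    # Toy data: two documents use words 0-1, two use words 2-3.
    toy = [[4, 3, 0, 0], [5, 2, 0, 0], [0, 0, 3, 4], [0, 0, 2, 5]]
    doc_topics, topic_words = plsa(toy, n_topics=2)
    for row in doc_topics:
        print([round(p, 3) for p in row])
```

In a genre classifier, the per-preview topic mixture P(z|d) would typically serve as a low-dimensional feature vector for a downstream classifier; how the paper performs that step is not stated in the abstract.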
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Hong, HZ., Hwang, JI.G. (2015). Multimodal PLSA for Movie Genre Classification. In: Schwenker, F., Roli, F., Kittler, J. (eds) Multiple Classifier Systems. MCS 2015. Lecture Notes in Computer Science, vol 9132. Springer, Cham. https://doi.org/10.1007/978-3-319-20248-8_14
DOI: https://doi.org/10.1007/978-3-319-20248-8_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-20247-1
Online ISBN: 978-3-319-20248-8
eBook Packages: Computer Science, Computer Science (R0)