Multimodal PLSA for Movie Genre Classification

  • Conference paper
Multiple Classifier Systems (MCS 2015)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9132)

Abstract

The aim of this paper is to categorize movies into genres using their previews. We combine audio, visual, and text features to classify a collection of movie previews into four genres: action, biography, comedy, and horror. For each collected preview, audio and visual features are extracted, and text features are drawn from social tags on social websites. Probabilistic latent semantic analysis (PLSA) is used to incorporate the features from these three sources of information. Standard PLSA handles only one type of information, so we extend it to double-model and triple-model PLSAs that combine two or three types of information. We compare these multimodal variants with unimodal PLSAs, which use audio, visual, or text features alone. The experimental results show not only that one of the triple-model PLSAs achieves the highest accuracy, but also that social tags (text features) play an important role in classifying movie genres.
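
For readers unfamiliar with the baseline, the following is a minimal sketch of standard (unimodal) PLSA fitted by EM, which models P(w|d) as the sum over latent topics z of P(w|z) P(z|d). It is not the authors' double-model or triple-model extension, whose details the abstract does not give. The function names `plsa` and `normalize`, the toy count matrix, and the choice of two topics are all illustrative assumptions.

```python
import random

def normalize(row):
    """Scale a list of non-negative numbers so that it sums to 1."""
    total = sum(row)
    if total <= 0:
        return [1.0 / len(row)] * len(row)  # fall back to uniform
    return [x / total for x in row]

def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit standard PLSA by EM on a document-by-feature count matrix.

    counts[d][w] is how often feature w occurs in document d.
    Returns (p_z_d, p_w_z): P(z|d) for each document, P(w|z) for each topic.
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])
    # Random, strictly positive initialisation (hypothetical choice).
    p_z_d = [normalize([rng.random() for _ in range(n_topics)])
             for _ in range(n_docs)]
    p_w_z = [normalize([rng.random() for _ in range(n_words)])
             for _ in range(n_topics)]
    for _ in range(n_iter):
        new_z_d = [[0.0] * n_topics for _ in range(n_docs)]
        new_w_z = [[0.0] * n_words for _ in range(n_topics)]
        for d in range(n_docs):
            for w in range(n_words):
                if counts[d][w] == 0:
                    continue
                # E-step: posterior P(z|d,w) is proportional to P(w|z) P(z|d).
                post = normalize([p_w_z[z][w] * p_z_d[d][z]
                                  for z in range(n_topics)])
                # Accumulate expected counts for the M-step.
                for z in range(n_topics):
                    expected = counts[d][w] * post[z]
                    new_z_d[d][z] += expected
                    new_w_z[z][w] += expected
        # M-step: renormalise the expected counts into probabilities.
        p_z_d = [normalize(row) for row in new_z_d]
        p_w_z = [normalize(row) for row in new_w_z]
    return p_z_d, p_w_z

# Toy data (made up): rows are previews, columns are quantised features.
counts = [
    [5, 3, 0, 0],
    [4, 6, 0, 0],
    [0, 0, 7, 2],
    [0, 0, 3, 5],
]
p_z_d, p_w_z = plsa(counts, n_topics=2)
for d, row in enumerate(p_z_d):
    print(d, [round(p, 3) for p in row])
```

Each row of `p_z_d` is a topic mixture for one preview and could serve as a low-dimensional input to a genre classifier. How the paper fuses audio, visual, and text modalities inside PLSA is beyond what this sketch shows.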

Corresponding author

Correspondence to Jen-Ing G. Hwang.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Hong, HZ., Hwang, JI.G. (2015). Multimodal PLSA for Movie Genre Classification. In: Schwenker, F., Roli, F., Kittler, J. (eds) Multiple Classifier Systems. MCS 2015. Lecture Notes in Computer Science, vol 9132. Springer, Cham. https://doi.org/10.1007/978-3-319-20248-8_14

  • DOI: https://doi.org/10.1007/978-3-319-20248-8_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-20247-1

  • Online ISBN: 978-3-319-20248-8

  • eBook Packages: Computer Science, Computer Science (R0)
