Abstract
In MTV, video and music clips are matched in particular ways to produce an attractive effect. In this paper, we use a dual-wing harmonium model to learn and represent the underlying association patterns between music and video clips in professional MTV. We also use the discovered patterns to facilitate automatic MTV generation. Given a raw video and certain professional MTVs as templates, we generate a new MTV by efficiently inferring the most related video clip for every music clip based on the trained model. Our method shows encouraging results compared with other automatic MTV generation approaches.
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Liao, C., Wang, P.P., Zhang, Y. (2009). Mining Association Patterns between Music and Video Clips in Professional MTV. In: Huet, B., Smeaton, A., Mayer-Patel, K., Avrithis, Y. (eds) Advances in Multimedia Modeling. MMM 2009. Lecture Notes in Computer Science, vol 5371. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-92892-8_41
DOI: https://doi.org/10.1007/978-3-540-92892-8_41
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-92891-1
Online ISBN: 978-3-540-92892-8
eBook Packages: Computer Science, Computer Science (R0)