Abstract
Multimedia data learning (a.k.a. multimedia data mining) is an emerging, multidisciplinary, and interdisciplinary research area with a wide spectrum of real-world applications. It draws on a broad set of fields, notably machine learning, artificial intelligence, data mining, multimedia, computer vision, and natural language processing. This chapter introduces the important and fundamental concepts and theories of this area and provides further references.
Notes
1. Here we are only concerned with multimedia as a research area; the term may also refer to industries and even to social or societal activities.
References
Y. Altun, I. Tsochantaridis, and T. Hofmann. Hidden Markov support vector machines. In Proc. ICML, Washington DC, August 2003.
P. Auer. On learning from multi-instance examples: empirical evaluation of a theoretical approach. In Proc. ICML, 1997.
T. Back, D.B. Fogel, and Z. Michalewicz. Handbook of Evolutionary Computation. 1997.
Y. Bengio, A. Courville, and P. Vincent. Representation learning: A review and new perspectives. arXiv:1206.5538v3, April 2014.
Y. Bengio, I.J. Goodfellow, and A. Courville. Deep Learning. MIT Press, 2017.
Y. Bengio, Y. LeCun, and D. Henderson. Globally trained handwritten word recognizer using spatial representation, space displacement neural networks, and hidden Markov models. In Proceedings of Advances in Neural Information Processing Systems, 1994.
D. M. Blei, A. Y. Ng, and M. I. Jordan. Latent Dirichlet allocation. Journal of Machine Learning Research, 3:993–1022, 2003.
A. Blum and T. Mitchell. Combining labeled and unlabeled data with co-training. In Proc. Workshop on Computational Learning Theory. Morgan Kaufmann Publishers, 1998.
B. E. Boser, I. M. Guyon, and V. N. Vapnik. A training algorithm for optimal margin classifiers. In the 5th Annual ACM Workshop on COLT, pages 144–152, Pittsburgh, PA, 1992.
R. Brachman and H. Levesque. Readings in Knowledge Representation. Morgan Kaufmann, 1985.
S. Cebiric, F. Goasdoue, H. Kondylakis, D. Kotzinos, I. Manolescu, and G. Troullinou. Summarizing semantic graphs: A survey. The VLDB Journal, 28(3):295–327, 2019.
D.-R. Chen, Q. Wu, Y. Ying, and D.-X. Zhou. Support vector machine soft margin classifiers: Error analysis. Journal of Machine Learning Research, 5:1143–1175, 2004.
D.G. Childers, D.P. Skinner, and R.C. Kemerait. The cepstrum: A guide to processing. Proceedings of the IEEE, 65(10):1428–1443, 1977.
G. Cooper. Computational complexity of probabilistic inference using Bayesian belief networks (research note). Artificial Intelligence, 42:393–405, 1990.
C. Cortes and V. Vapnik. Support-vector networks. Machine Learning, 20(3):273–297, 1995.
P. Dagum and M. Luby. Approximating probabilistic reasoning in Bayesian belief networks is NP-hard. Artificial Intelligence, 60(1):141–153, 1993.
H. Daumé III and D. Marcu. Learning as search optimization: Approximate large margin methods for structured prediction. In Proc. ICML, Bonn, Germany, August 2005.
S. Deerwester, S. Dumais, G. Furnas, T. Landauer, and R. Harshman. Indexing by latent semantic analysis. Journal of the American Society for Information Science, 41:391–407, 1990.
T.G. Dietterich, R.H. Lathrop, and T. Lozano-Perez. Solving the multiple instance problem with axis-parallel rectangles. Artificial Intelligence, 89:31–71, 1997.
H. Drucker, C.J.C. Burges, L. Kaufman, A. Smola, and V. Vapnik. Support vector regression machines. In Advances in Neural Information Processing Systems 9, NIPS 1996, pages 156–161, 1997.
R.O. Duda, P.E. Hart, and D.G. Stork. Pattern Classification (2nd ed.). John Wiley and Sons, 2001.
C. Faloutsos. Searching Multimedia Databases by Content. Kluwer Academic Publishers, 1996.
C. Faloutsos, R. Barber, M. Flickner, J. Hafner, W. Niblack, D. Petkovic, and W. Equitz. Efficient and effective querying by image content. Journal of Intelligent Information Systems, 3(3/4):231–262, 1994.
Y. Freund. Boosting a weak learning algorithm by majority. In Proceedings of the Third Annual Workshop on Computational Learning Theory, 1990.
Y. Freund and R.E. Schapire. A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, 55, 1997.
K. Fukunaga. Introduction to Statistical Pattern Recognition (Second Edition). Academic Press, 1990.
B. Furht, editor. Multimedia Systems and Techniques. Kluwer Academic Publishers, 1996.
A. Gersho. Asymptotically optimum block quantization. IEEE Trans. on Information Theory, 25(4):373–380, 1979.
D. Goldberg. Genetic Algorithms in Search, Optimization and Machine Learning. Addison-Wesley Professional, 1989.
J. Goodman. A bit of progress in language modeling. arXiv:cs/0108005v1, 2001.
A. Grossmann and J. Morlet. Decomposition of Hardy functions into square integrable wavelets of constant shape. SIAM Journal on Mathematical Analysis, 15(4), 1984.
Z. Guo, Z. Zhang, E.P. Xing, and C. Faloutsos. Enhanced max margin learning on multimodal data mining in a multimedia database. In Proc. ACM International Conference on Knowledge Discovery and Data Mining, 2007.
J. Han and M. Kamber. Data Mining: Concepts and Techniques. Morgan Kaufmann, 2nd edition, 2006.
P. Hayes. The logic of frames, 1979.
K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In Proceedings of IEEE International Conference on Computer Vision and Pattern Recognition, 2016.
G.E. Hinton and R.R. Salakhutdinov.
T. Hofmann. Unsupervised learning by probabilistic latent semantic analysis. Machine Learning, 42(1):177–196, 2001.
B.K.P. Horn. Robot Vision. MIT Press and McGraw-Hill, 1986.
J. Huang, S.R. Kumar, et al. Image indexing using color correlograms. In IEEE Int’l Conf. Computer Vision and Pattern Recognition Proceedings, Puerto Rico, 1997.
R. Jain. Infoscopes: Multimedia information systems. In B. Furht, editor, Multimedia Systems and Techniques. Kluwer Academic Publishers, 1996.
R. Jain, R. Kasturi, and B.G. Schunck. Machine Vision. MIT Press and McGraw-Hill, 1995.
K. Kawaguchi, L.P. Kaelbling, and Y. Bengio. Generalization in deep learning. arXiv:1710.05468v5, 2019.
K.L. Ketner and H. Putnam. Reasoning and the Logic of Things. Harvard University Press, 1992.
A. Krizhevsky, I. Sutskever, and G.E. Hinton. ImageNet classification with deep convolutional neural networks. In Proceedings of Advances in Neural Information Processing Systems, 2012.
A. Kumar, P. Ondruska, M. Iyyer, J. Bradbury, I. Gulrajani, V. Zhong, R. Paulus, and R. Socher. Ask me anything: Dynamic memory networks for natural language processing. arXiv:1506.07285v5, 2016.
M. Längkvist, L. Karlsson, and A. Loutfi. A review of unsupervised feature learning and deep learning for time-series modeling. Pattern Recognition Letters, 42:11–24, 2014.
Y. LeCun. Generalization and network design strategies. Univ. Toronto Computer Science Department Technical Report, CRG-TR-89-4, 1989.
Y. LeCun, B. Boser, J.S. Denker, D. Henderson, R. Howard, W. Hubbard, and L. Jackel. Handwritten digit recognition with a backpropagation neural network. In Proceedings of Advances in Neural Information Processing Systems, 1990.
Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, 1998.
Y. Li, M. Yang, and Z. Zhang. A survey of multi-view representation learning. IEEE Transactions on Knowledge and Data Engineering, 2018.
D. Lowe. Object recognition from local scale-invariant features. In Proc. IEEE International Conference on Computer Vision, September 1999.
O. Maron and T. Lozano-Perez. A framework for multiple instance learning. In Proc. NIPS, 1998.
E. Mayoraz and E. Alpaydin. Support vector machines for multi-class classification. In IWANN (2), pages 833–842, 1999.
T. Mikolov, I. Sutskever, K. Chen, G. Corrado, and J. Dean. Distributed representations of words and phrases and their compositionality. In Proceedings of Advances in Neural Information Processing Systems, 2013.
M. Minsky. A framework for representing knowledge. In P.H. Winston, editor, The Psychology of Computer Vision. McGraw-Hill, 1975.
S. Papert. One AI or many? In S.R. Graubard, editor, The Artificial Intelligence Debate: False Starts, Real Foundations. MIT Press, 1988.
G. Pass and R. Zabih. Histogram refinement for content-based image retrieval. In IEEE Workshop on Applications of Computer Vision, Sarasota, FL, December 1996.
Z. Pawlak. Rough sets. International Journal of Parallel Programming, 11(5):341–356, 1982.
J. Pennington, R. Socher, and C.D. Manning. GloVe: Global vectors for word representation. In Empirical Methods in Natural Language Processing (EMNLP), pages 1532–1543, 2014.
M. Pradhan and P. Dagum. Optimal Monte Carlo estimation of belief network inference. In Proceedings of the Conference on Uncertainty in Artificial Intelligence, pages 446–453, 1996.
W.K. Pratt. Introduction to Digital Image Processing. CRC Press, 2013.
F. Rosenblatt. The perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review, 65(6):386–408, 1958.
S. Russell and P. Norvig. Artificial intelligence: A modern approach. Prentice Hall, Upper Saddle River, NJ, 1995.
G. Salton. Developments in automatic text retrieval. Science, 253:974–979, 1991.
R.C. Schank. Dynamic Memory: A Theory of Reminding and Learning in Computers and People. Cambridge University Press, 1990.
R. Schapire. The strength of weak learnability. Machine Learning, 5:197–227, 1990.
J. Shlens. A tutorial on principal component analysis. arXiv:1404.1100v1, 2014.
K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. In Proceedings of International Conference on Learning Representations, 2015.
J.F. Sowa. Conceptual Structures: Information Processing in Mind and Machine. Addison-Wesley, 1984.
J.F. Sowa. Knowledge Representation: Logical, Philosophical, and Computational Foundations. Thomson Learning Publishers, 2000.
J.F. Sowa. Principles of Semantic Networks: Explorations in the Representation of Knowledge. Morgan Kaufmann, 2014.
R. Steinmetz and K. Nahrstedt. Multimedia Fundamentals: Media Coding and Content Processing. Prentice-Hall PTR, 2002.
V.S. Subrahmanian. Principles of Multimedia Database Systems. Morgan Kaufmann, 1998.
C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich. Going deeper with convolutions. In Proceedings of IEEE International Conference on Computer Vision and Pattern Recognition, 2015.
S.L. Tanimoto. Elements of Artificial Intelligence Using Common LISP. Computer Science Press, 1990.
B. Taskar, C. Guestrin, and D. Koller. Max-margin Markov networks. In Neural Information Processing Systems Conference, 2003.
Y.W. Teh, M.I. Jordan, M.J. Beal, and D.M. Blei. Hierarchical Dirichlet processes. Journal of the American Statistical Association, 2006.
E.P.K. Tsang. Foundations of Constraint Satisfaction. Academic Press, 1993.
I. Tsochantaridis, T. Hofmann, T. Joachims, and Y. Altun. Support vector machine learning for interdependent and structured output spaces. In Proc. ICML, Banff, Canada, 2004.
E. Turunen, K. Raivio, and T. Mantere. Soft computing methods. In S. Pohjolainen, editor, Mathematical Modeling. Springer, 2016.
J. Weston, S. Chopra, and A. Bordes. Memory networks. In Proceedings of International Conference on Learning Representations, 2015.
L. A. Zadeh. Fuzzy sets. Information and Control, 8(3):338–353, 1965.
L.A. Zadeh. Fuzzy logic, neural networks, and soft computing. Communications of the ACM, 37(3):77–84, 1994.
S. Zhai, Y. Cheng, W. Lu, and Z. Zhang. Doubly convolutional neural networks. In Proceedings of Advances in Neural Information Processing Systems, 2016.
X. Zhang, J. Zhao, and Y. LeCun. Character-level convolutional networks for text classification. In Proceedings of Advances in Neural Information Processing Systems, 2015.
Z. Zhang, R. Jing, and W. Gu. A new Fourier descriptor based on areas (AFD) and its applications in object recognition. In Proc. of IEEE International Conference on Systems, Man, and Cybernetics. International Academic Publishers, 1988.
Z. Zhang, F. Masseglia, R. Jain, and A. Del Bimbo. Editorial: Introduction to the special issue on multimedia data mining, 2008.
Z. Zhang and R. Zhang. Multimedia Data Mining: A Systematic Introduction to Concepts and Theory. Taylor & Francis, 2008.
X. Zhu. Semi-supervised learning literature survey. Technical Report, 1530, 2005.
© 2023 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Zhang, Z., Zhang, R. (2023). Multimedia Data Learning. In: Rokach, L., Maimon, O., Shmueli, E. (eds) Machine Learning for Data Science Handbook. Springer, Cham. https://doi.org/10.1007/978-3-031-24628-9_19
DOI: https://doi.org/10.1007/978-3-031-24628-9_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-24627-2
Online ISBN: 978-3-031-24628-9
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)