Multimedia Data Learning

Zhang, Zhongfei (Mark); Zhang, Ruofei (Bruce)

doi:10.1007/978-3-031-24628-9_19



Abstract

Multimedia data learning (a.k.a. multimedia data mining) is an emerging, multidisciplinary research area with a wide spectrum of real-world applications. It draws on a broad range of fields, notably machine learning, artificial intelligence, data mining, multimedia, computer vision, and natural language processing. This chapter introduces the fundamental concepts and theories of the area and provides further references.


Notes

1. Here we are concerned only with multimedia as a research area; the term may also refer to industries and even to social or societal activities.


Author information

Authors and Affiliations

Binghamton University, State University of New York, Binghamton, NY, USA
Zhongfei (Mark) Zhang
Microsoft AI & Research, Sunnyvale, CA, USA
Zhongfei (Mark) Zhang
Google, Mountain View, CA, USA
Ruofei (Bruce) Zhang


Corresponding author

Correspondence to Zhongfei (Mark) Zhang.

Editor information

Editors and Affiliations

Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, Beer-Sheva, Israel
Lior Rokach
Department of Industrial Engineering, Tel Aviv University, Ramat Aviv, Israel
Oded Maimon
Department of Industrial Engineering, Tel Aviv University, Tel Aviv, Israel
Erez Shmueli


About this chapter

Cite this chapter

Zhang, Z., Zhang, R. (2023). Multimedia Data Learning. In: Rokach, L., Maimon, O., Shmueli, E. (eds) Machine Learning for Data Science Handbook. Springer, Cham. https://doi.org/10.1007/978-3-031-24628-9_19


DOI: https://doi.org/10.1007/978-3-031-24628-9_19
Published: 26 February 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-24627-2
Online ISBN: 978-3-031-24628-9
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)
