Multimedia Data Learning

Machine Learning for Data Science Handbook

Abstract

Multimedia data learning (a.k.a. multimedia data mining) is an emerging multidisciplinary and interdisciplinary research area with a wide spectrum of real-world applications. It draws on a broad set of fields, notably machine learning, artificial intelligence, data mining, multimedia, computer vision, and natural language processing. This chapter introduces the important, fundamental concepts and theories of the area and provides further references.

Notes

  1. Here we are concerned only with a research area; multimedia may also refer to industries and even to social or societal activities.

Author information

Correspondence to Zhongfei (Mark) Zhang.

Copyright information

© 2023 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Zhang, Z., Zhang, R. (2023). Multimedia Data Learning. In: Rokach, L., Maimon, O., Shmueli, E. (eds) Machine Learning for Data Science Handbook. Springer, Cham. https://doi.org/10.1007/978-3-031-24628-9_19
