Multimedia Indexing, Search, and Retrieval in Large Databases of Social Networks

Abstract

Social networks are changing the way multimedia content is shared on the Web by allowing users to upload photos, videos, and audio recorded with any digital device, such as smartphones, webcams, and digital cameras. This plethora of content has created the need to find the desired media in the social media universe. Moreover, the diversity of the available content has led users to formulate more complicated queries. In the social media era, multimedia content search has become a fundamental feature, supporting efficient search inside social multimedia streams, content classification, and context- and event-based indexing. This chapter presents a detailed overview of multimedia indexing and searching algorithms, following the data growth curve. The chapter is structured thematically in two parts: the first presents pure multimedia content retrieval issues, while the second discusses the social aspects of, and new views on, multimedia retrieval in large social media databases.

Notes

  1. For further details visit ImageCLEF 2011, the “Visual Concept Detection and Annotation” task.

Acknowledgements

This work was partially supported by the EC FP7-funded project CUBRIK, ICT-287704 (www.cubrikproject.eu).

Author information

Corresponding author

Correspondence to Theodoros Semertzidis.

Copyright information

© 2013 Springer-Verlag London

About this chapter

Cite this chapter

Semertzidis, T., Rafailidis, D., Tiakas, E., Strintzis, M.G., Daras, P. (2013). Multimedia Indexing, Search, and Retrieval in Large Databases of Social Networks. In: Ramzan, N., van Zwol, R., Lee, JS., Clüver, K., Hua, XS. (eds) Social Media Retrieval. Computer Communications and Networks. Springer, London. https://doi.org/10.1007/978-1-4471-4555-4_3

  • DOI: https://doi.org/10.1007/978-1-4471-4555-4_3

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-4471-4554-7

  • Online ISBN: 978-1-4471-4555-4

  • eBook Packages: Computer Science (R0)
