Extracting hierarchical structure of content groups from different social media platforms using multiple social metadata

Multimedia Tools and Applications

Abstract

This paper presents a novel scheme for retrieving users’ desired contents, i.e., contents on topics in which users are interested, from multiple social media platforms. In existing retrieval schemes, users first select a particular platform and then enter a query into its search engine. If users do not specify platforms suited to their information needs and do not enter queries that match the desired contents, retrieving those contents becomes difficult. The proposed scheme extracts the hierarchical structure of content groups (sets of contents with similar topics) from different social media platforms, which makes it feasible to retrieve desired contents even when users do not specify suitable platforms and do not enter suitable queries. This paper makes two contributions. (1) It presents a new feature extraction method, Locality Preserving Canonical Correlation Analysis with multiple social metadata (LPCCA-MSM), that can detect content groups across the boundaries of different social media platforms. Unlike conventional methods, which use only content-based information such as textual or visual features, LPCCA-MSM uses multiple social metadata as auxiliary information. (2) The proposed retrieval scheme realizes hierarchical content structuralization across different social media platforms. The extracted hierarchical structure shows content groups at various abstraction levels together with their hierarchical relationships, which can help users select topics related to the input query. To the best of our knowledge, no intensive study of such an application has been conducted, which underlines the novelty of this work. To verify the effectiveness of these contributions, extensive experiments were conducted on real-world datasets containing YouTube videos and Wikipedia articles.
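
To make the idea of a shared feature space more concrete, the following is a minimal sketch of plain canonical correlation analysis (CCA), the classical technique that LPCCA-MSM builds on. It is not the LPCCA-MSM method itself: the locality-preserving weights and the social metadata terms are omitted, and the function name cca, the regularization value, and the synthetic "video" and "article" features are all assumptions made for illustration only.

```python
import numpy as np


def cca(X, Y, dim, reg=1e-6):
    """Plain (regularized) CCA, not LPCCA-MSM.

    X: (n, p) features of one view, Y: (n, q) features of the paired view.
    Returns projections Wx (p, dim), Wy (q, dim) and canonical correlations.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    n = X.shape[0]
    # Covariance matrices; a small ridge term keeps them positive definite.
    Cxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / (n - 1)
    # Whiten each view with its Cholesky factor: Cxx = Lx Lx^T.
    Lx = np.linalg.cholesky(Cxx)
    Ly = np.linalg.cholesky(Cyy)
    M = np.linalg.solve(Lx, Cxy)        # Lx^{-1} Cxy
    M = np.linalg.solve(Ly, M.T).T      # Lx^{-1} Cxy Ly^{-T}
    # Singular values of the whitened cross-covariance are the correlations.
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    Wx = np.linalg.solve(Lx.T, U[:, :dim])
    Wy = np.linalg.solve(Ly.T, Vt.T[:, :dim])
    return Wx, Wy, s[:dim]


# Synthetic stand-ins (assumed data): two views driven by shared latent topics.
rng = np.random.default_rng(0)
topics = rng.normal(size=(200, 2))
video_feats = topics @ rng.normal(size=(2, 5)) + 0.1 * rng.normal(size=(200, 5))
article_feats = topics @ rng.normal(size=(2, 4)) + 0.1 * rng.normal(size=(200, 4))

Wx, Wy, corrs = cca(video_feats, article_feats, dim=2)
# Both views now live in the same 2-dimensional space and can be compared.
video_proj = (video_feats - video_feats.mean(axis=0)) @ Wx
article_proj = (article_feats - article_feats.mean(axis=0)) @ Wy
```

Once both views are projected into the common space, items from different platforms can be compared directly, which is the property a cross-platform grouping step would rely on.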

Notes

  1. https://twitter.com

  2. https://www.youtube.com

  3. https://en.wikipedia.org

  4. YouTube videos related to each other are linked as “related videos”.

  5. We define a leaf concept as a concept that has no lower levels of concepts in the hierarchy.

  6. https://developers.google.com/youtube/v3

  7. https://www.mediawiki.org/wiki/API

Acknowledgements

This work was partly supported by JSPS KAKENHI Grant Numbers JP25280036, JP24120002.

Author information

Corresponding author

Correspondence to Daichi Takehara.

Cite this article

Takehara, D., Harakawa, R., Ogawa, T. et al. Extracting hierarchical structure of content groups from different social media platforms using multiple social metadata. Multimed Tools Appl 76, 20249–20272 (2017). https://doi.org/10.1007/s11042-017-4717-7

  • DOI: https://doi.org/10.1007/s11042-017-4717-7
