Abstract
This paper presents a novel scheme for retrieving users’ desired contents, i.e., contents on topics in which users are interested, from multiple social media platforms. In existing retrieval schemes, users first select a particular platform and then input a query into its search engine. If users do not choose a platform suited to their information needs, or do not input a query that matches the desired contents, retrieval becomes difficult. The proposed scheme extracts the hierarchical structure of content groups (sets of contents with similar topics) from different social media platforms, which makes it feasible to retrieve desired contents even when users specify neither a suitable platform nor a suitable query. This paper makes two contributions. (1) It presents a new feature extraction method, Locality Preserving Canonical Correlation Analysis with multiple social metadata (LPCCA-MSM), that can detect content groups across the boundaries of different social media platforms. Unlike conventional methods, which use only content-based information such as textual or visual features, LPCCA-MSM uses multiple social metadata as auxiliary information. (2) The proposed retrieval scheme realizes hierarchical content structuralization across different social media platforms. The extracted hierarchical structure shows content groups at various abstraction levels together with their hierarchical relationships, which can help users select topics related to the input query. To the best of our knowledge, no intensive study of such an application has been conducted, which is the main novelty of this work. To verify the effectiveness of these contributions, extensive experiments were conducted on real-world datasets containing YouTube videos and Wikipedia articles.
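The idea of a hierarchy of content groups spanning platforms can be illustrated with a minimal sketch. It is not the method of this paper: it assumes the contents have already been mapped into a shared feature space (the role LPCCA-MSM plays in the proposed scheme) and substitutes plain single-linkage agglomerative clustering for the hierarchy extraction. The function name `build_hierarchy`, the item identifiers, and the two-dimensional feature values are all hypothetical.

```python
from itertools import combinations
from math import dist


def build_hierarchy(items):
    """Single-linkage agglomerative clustering (a stand-in, not LPCCA-MSM).

    items: dict mapping a content id to its feature vector in a shared space.
    Returns a list of levels, finest first; each level is a list of
    frozensets of content ids (the content groups at that level).
    """
    clusters = [frozenset([key]) for key in items]
    levels = [list(clusters)]
    while len(clusters) > 1:
        # Pick the two groups whose closest members are nearest to each other.
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: min(
                dist(items[i], items[j]) for i in pair[0] for j in pair[1]
            ),
        )
        # Replace the two groups with their union to form the next level.
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
        levels.append(list(clusters))
    return levels


# Hypothetical contents from two platforms, already in a common feature space.
items = {
    "youtube:v1": (0.0, 0.1),
    "wikipedia:a1": (0.1, 0.0),
    "youtube:v2": (5.0, 5.1),
    "wikipedia:a2": (5.2, 5.0),
}
levels = build_hierarchy(items)
for depth, groups in enumerate(levels):
    print(depth, [sorted(group) for group in groups])
```

In this toy data each intermediate group mixes a video and an article, which is the kind of platform-independent grouping the abstract describes; lower levels correspond to more specific groups and higher levels to more abstract ones.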
Notes
YouTube videos related to each other are linked as “related videos”.
We define a leaf concept as a concept that has no lower levels of concepts in the hierarchy.
Acknowledgements
This work was partly supported by JSPS KAKENHI Grant Numbers JP25280036, JP24120002.
Cite this article
Takehara, D., Harakawa, R., Ogawa, T. et al. Extracting hierarchical structure of content groups from different social media platforms using multiple social metadata. Multimed Tools Appl 76, 20249–20272 (2017). https://doi.org/10.1007/s11042-017-4717-7
DOI: https://doi.org/10.1007/s11042-017-4717-7