Heterogeneous Information Fusion based Topic Detection from Social Media Data

Rani, Seema; Kumar, Mukesh

doi:10.1007/s10796-022-10334-w

Published: 07 September 2022

Volume 25, pages 513–528 (2023)

Information Systems Frontiers

Abstract

Due to the pervasive nature of social networking platforms and the proliferation of user-generated content, the internet has become a repository of unstructured multimedia data. Using this large volume of data to enhance user experience remains an open problem. Topic detection is one possible solution, but it has not been explored in the literature for this application. Topic detection methods can group together videos that have similar content or relate to the same topic. In this paper, a framework for topic detection using the textual metadata of web videos is developed. The key contribution of this paper is to leverage multimedia metadata to find web video topics using a two-step process. First, we use a transformer-based model to perform topic modeling, identifying topics from the heterogeneous textual data of web videos. Second, topic-based video retrieval is accomplished using a classification approach. Experiments are carried out on a publicly available dataset to assess the performance of the proposed method. The proposed work is compared with the state-of-the-art methods Discriminative Probabilistic Models (DPM), Event Clustering Based Method (ECBM), Multi-Modality Based Method (MMBM), Side-Information Based Method (SIBM), and Similarity Cascades (SC); the comparison shows that the proposed system outperforms the others in terms of Precision, Recall, F-measure and Accuracy. The experimental results demonstrate the effectiveness of the proposed method for topic detection.
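The abstract names Precision, Recall, F-measure and Accuracy as evaluation measures but does not state how they are computed. The following is a minimal sketch using the standard textbook definitions for a single topic, treated as a binary decision (a video either belongs to the topic or does not). The function name `binary_metrics` and the sample labels are hypothetical, and the paper may aggregate these measures across topics in a different way.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Standard precision, recall, F-measure and accuracy for one topic.

    Labels are 1 (video belongs to the topic) or 0 (it does not).
    This is an illustrative helper, not the authors' evaluation code.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    # Guard against empty denominators by returning 0.0 in those cases.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_measure = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0

    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "accuracy": accuracy,
    }


# Hypothetical labels for six videos and one topic (illustrative data only).
truth = [1, 1, 1, 0, 0, 0]
predicted = [1, 1, 0, 1, 0, 0]
scores = binary_metrics(truth, predicted)
```

With these sample labels there are two true positives, one false positive, one false negative and two true negatives, so all four measures come out to 2/3.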

Acknowledgements

This work is supported by a fellowship from the Council of Scientific and Industrial Research (CSIR), New Delhi, India, under award letter no. 09/135(0745)/2016-EMR-I.

Author information

Authors and Affiliations

Computer Science & Engineering Department, University Institute of Engineering and Technology, Panjab University, Chandigarh, 160014, India
Seema Rani & Mukesh Kumar

Corresponding author

Correspondence to Seema Rani.

Ethics declarations

Conflict of Interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Rani, S., Kumar, M. Heterogeneous Information Fusion based Topic Detection from Social Media Data. Inf Syst Front 25, 513–528 (2023). https://doi.org/10.1007/s10796-022-10334-w

Accepted: 15 August 2022
Published: 07 September 2022
Issue Date: April 2023
DOI: https://doi.org/10.1007/s10796-022-10334-w
