Incorporating Neighborhood Information and Sentence Embedding Similarity into a Repost Prediction Model in Social Media Networks

Qiang, Zhecheng; Pasiliao, Eduardo L.; Semenov, Alexander; Zheng, Qipeng P.

doi:10.1007/978-3-031-26303-3_1

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 13831))

Included in the following conference series:

International Conference on Computational Data and Social Networks

Abstract

Predicting repost behavior in social media networks plays an important role in analyzing human activity and in influence-maximization decision making. Traditional methods for repost prediction fall into two categories: stochastic diffusion-based models, and machine learning models built on user-profile or content features. In this paper, we propose a new framework that combines user profiles, content similarity, and the neighborhood information around each target link as input features for prediction. Here, neighborhood information is interpreted as a combination of the neighbors’ user profiles, and two graph-based models for computing this combination are introduced. With the input features collected, we apply several established and state-of-the-art machine learning methods (Logistic Regression, K-Nearest Neighbors, Gaussian Naive Bayes, Deep Neural Network, Random Forest, XGBoost, and a Stacking Model) to predict the repost probability. We evaluate the framework on a real-world Weibo dataset and compare performance across different feature sets and machine learning methods.
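To make the feature construction concrete, the following is a minimal sketch of how the three feature groups named in the abstract could be assembled for a single candidate link. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the two graph-based combination models, so plain mean pooling over neighbors stands in for them, and content similarity is assumed to be the cosine similarity between a post embedding and an embedding summarizing the target user's past posts. All function and variable names are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def mean_neighbor_profile(graph, profiles, node):
    """Average the profile vectors of a node's neighbors.

    ASSUMPTION: mean pooling is a placeholder for the paper's graph-based
    combination models, which the abstract does not describe.
    """
    neighbors = graph.get(node, [])
    dim = len(profiles[node])
    if not neighbors:
        return [0.0] * dim
    return [
        sum(profiles[n][i] for n in neighbors) / len(neighbors)
        for i in range(dim)
    ]


def link_features(graph, profiles, history_embeddings, source, target, post_embedding):
    """Build one feature vector for the candidate link (source -> target).

    Layout (hypothetical): source profile, target profile, pooled neighborhood
    of each endpoint, then one content-similarity score.
    """
    similarity = cosine_similarity(post_embedding, history_embeddings[target])
    return (
        list(profiles[source])
        + list(profiles[target])
        + mean_neighbor_profile(graph, profiles, source)
        + mean_neighbor_profile(graph, profiles, target)
        + [similarity]
    )


if __name__ == "__main__":
    # Toy data: adjacency lists, 2-dimensional profiles, 2-dimensional embeddings.
    graph = {"a": ["b", "c"], "b": ["a"], "c": []}
    profiles = {"a": [1.0, 2.0], "b": [3.0, 4.0], "c": [5.0, 6.0]}
    history_embeddings = {"a": [1.0, 0.0], "b": [1.0, 0.0], "c": [0.0, 1.0]}
    print(link_features(graph, profiles, history_embeddings, "a", "b", [1.0, 0.0]))
```

The resulting vectors could then be passed to any of the classifiers listed in the abstract; in practice the embeddings would come from a sentence-embedding model rather than hand-written lists.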

Author information

Authors and Affiliations

Department of Industrial Engineering and Management Systems, University of Central Florida, Orlando, FL, USA
Zhecheng Qiang & Qipeng P. Zheng
Munitions Directorate, Air Force Research Laboratory, Shalimar, FL, USA
Eduardo L. Pasiliao
Department of Industrial and Systems Engineering, University of Florida, Gainesville, FL, USA
Alexander Semenov

Authors

Zhecheng Qiang
Eduardo L. Pasiliao
Alexander Semenov
Qipeng P. Zheng

Corresponding author

Correspondence to Zhecheng Qiang.

Editor information

Editors and Affiliations

Virginia Commonwealth University, Richmond, VA, USA
Thang N. Dinh
Yeung Kin Man Academic Building, City University Hong Kong, Kowloon Tong, Hong Kong
Minming Li

Appendices

Appendix

A Prediction Performance Measures

Measure	Definition	Formula
Accuracy	The ratio of correctly predicted observations to all observations	\(\frac{TP+TN}{TP+FP+FN+TN}\)
Precision	The ratio of correctly predicted positive observations to all predicted positive observations	\(\frac{TP}{TP+FP}\)
Recall	The ratio of correctly predicted positive observations to all observations in the actual positive class	\(\frac{TP}{TP+FN}\)
F1 Score	The harmonic mean of Precision and Recall	\(\frac{2 \cdot Precision \cdot Recall}{Precision+Recall}\)
ROC AUC	The area under the receiver operating characteristic (ROC) curve, which plots True Positive Rate against False Positive Rate, computed from prediction scores	–
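
As a worked illustration of the measures in the table, the sketch below computes them in plain Python from confusion-matrix counts (TP, FP, FN, TN) and from prediction scores. The function names are hypothetical, and returning 0.0 for a zero denominator is a convention chosen here, not something the paper specifies. ROC AUC is computed through its pairwise-ranking interpretation: the fraction of (positive, negative) pairs in which the positive example receives the higher score, with ties counted as one half.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, Precision, Recall and F1 Score from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(labels, scores):
    """ROC AUC as the probability that a positive outranks a negative.

    Quadratic in the number of examples; fine for illustration, but a
    sorting-based method is preferable for large datasets.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("ROC AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    print(classification_metrics(tp=40, fp=10, fn=20, tn=30))
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

With TP = 40, FP = 10, FN = 20 and TN = 30, accuracy is 70/100 = 0.7, precision is 40/50 = 0.8, recall is 40/60 ≈ 0.667, and F1 is 80/110 ≈ 0.727.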

About this paper

Cite this paper

Qiang, Z., Pasiliao, E.L., Semenov, A., Zheng, Q.P. (2023). Incorporating Neighborhood Information and Sentence Embedding Similarity into a Repost Prediction Model in Social Media Networks. In: Dinh, T.N., Li, M. (eds) Computational Data and Social Networks. CSoNet 2022. Lecture Notes in Computer Science, vol 13831. Springer, Cham. https://doi.org/10.1007/978-3-031-26303-3_1

DOI: https://doi.org/10.1007/978-3-031-26303-3_1
Published: 11 February 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-26302-6
Online ISBN: 978-3-031-26303-3
eBook Packages: Computer Science (R0)