Advertisement

Resampling-based predictive simulation framework of stochastic diffusion model for identifying top-K influential nodes

  • Kouzou OharaEmail author
  • Kazumi Saito
  • Masahiro Kimura
  • Hiroshi Motoda
Regular Paper
  • 3 Downloads

Abstract

We address a problem of efficiently estimating the influence of a node in information diffusion over a social network. Since the information diffusion is a stochastic process, the influence degree of a node is quantified by the expectation, which is usually obtained by very time-consuming many runs of simulation. Our contribution is that we proposed a framework for predictive simulation based on the leave-N-out cross-validation technique that well approximates the error from the unknown ground truth for two target problems: one to estimate the influence degree of each node, and the other to identify top-K influential nodes. The method we proposed for the first problem estimates the approximation error of the influence degree of each node, and the method for the second problem estimates the precision of the derived top-K nodes, both without knowing the true influence degree. We experimentally evaluate the proposed methods using the three real-world networks and show that they can serve as a good measure to solve the target problems with far fewer runs of simulation ensuring the accuracy when the leave-half-out cross-validation, i.e., N is the half of the number of runs, is used, which means that one can identify the influential nodes without knowing exactly their influence degree in good accuracy.

Keywords

Predictive simulation Influence degree Information diffusion Resampling 

Notes

References

  1. 1.
    Bakshy, E., Hofman, J.M., Mason, W.A., Watts, D.J.: Everyone’s an influencer: quantifying influence on twitter. In: Proceedings of the Fourth ACM International Conference on Web Search and Data Mining (WSDM’11), pp. 65–74 (2011)Google Scholar
  2. 2.
    Barabási, A.L., Albert, R.: Emergence of scaling in random networks. Science 286, 509–512 (1999)MathSciNetCrossRefzbMATHGoogle Scholar
  3. 3.
    Borgs, C., Brautbar, M., Chayes, J., Lucier, B.: Maximizing social influence in nearly optimal time. In: Proceedings of the 25th Annual ACM–SIAM Symposium on Discrete Algorithms (SODA’14), pp. 946–957 (2014)Google Scholar
  4. 4.
    Chen, W., Wang, C., Wang, Y.: Scalable influence maximization for prevalent viral marketing in large-scale social networks. In: Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2010), pp. 1029–1038 (2010)Google Scholar
  5. 5.
    Chen, W., Wang, Y., Yang, S.: Efficient influence maximization in social networks. In: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’09), pp. 199–208 (2009)Google Scholar
  6. 6.
    Chen, W., Yuan, Y., Zhang, L.: Scalable influence maximization in social networks under the linear threshold model. In: Proceedings of the 10th IEEE International Conference on Data Mining (ICDM 2010), pp. 88–97 (2010)Google Scholar
  7. 7.
    Chernoff, H.: A measure of asymptotic efficiency for tests of a hypothesis based on the sum of observations. Ann. Math. Stat. 23(4), 493–507 (1952)MathSciNetCrossRefzbMATHGoogle Scholar
  8. 8.
    Cohen, E.: Size-estimation framework with applications to transitive closure and reachability. J. Comput. Syst. Sci. 55, 441–453 (1997)MathSciNetCrossRefzbMATHGoogle Scholar
  9. 9.
    Cohen, E.: All-distances sketches, revisited: hip estimators for massive graphs analysis. In: Proceedings of the 33rd ACM SIGMOD–SIGACT–SIGART Symposium on Principles of Database Systems, pp. 88–99 (2015)Google Scholar
  10. 10.
    Cohen, E., Delling, D., Pajor, T., Werneck, R.F.: Sketch-based influence maximization and computation: Scaling up with guarantees. In: Proceedings of the 23rd ACM International Conference on Conference on Information and Knowledge Management, pp. 629–638 (2014)Google Scholar
  11. 11.
    Cui, P., Wang, F., Yang, S., Sun, L.: Item-level social influence prediction with probabilistic hybrid factor matrix factorization. In: Proceedings of the Twenty-Fifth AAAI Conference on Artificial Intelligence (AAAI 2011), pp. 331–336 (2011)Google Scholar
  12. 12.
    Goyal, A., Lu, W., Lakshmanan, L.: Celf++: optimizing the greedy algorithm for influence maximization in social networks. In: Proceedings of the 20th International World Wide Web Conference (WWW2011), pp. 47–48 (2011)Google Scholar
  13. 13.
    Guille, A., Hacid, H.: A predictive model for the temporal dynamics of information diffusion in online social networks. In: Proceedings of the 21st International Conference Companion on World Wide Web (WWW’12), pp. 1145–1152 (2012)Google Scholar
  14. 14.
    Hoeffding, W.: Probability inequalities for sums of bounded random variables. J. Am. Stat. Assoc. 58(301), 13–30 (1963)MathSciNetCrossRefzbMATHGoogle Scholar
  15. 15.
    Jin, R., Liu, L., Aggarwal, C., Shen, Y.: Reliable clustering on uncertain graphs. In: Proceedings of the 2012 IEEE 12th International Conference on Data Mining (ICDM 2012), pp. 459–468 (2012)Google Scholar
  16. 16.
    Kassiano, V., Gounaris, A., Papadopoulos, A.N., Tsichlas, K.: Mining uncertain graphs: an overview. In: Proceedings of the International Workshop on Algorithmic Aspects of Cloud Computing (ALGOCLOUD 2016), pp. 87–116 (2017)Google Scholar
  17. 17.
    Kempe, D., Kleinberg, J., Tardos, E.: Maximizing the spread of influence through a social network. In: Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2003), pp. 137–146 (2003)Google Scholar
  18. 18.
    Khan, A., Ye, Y., L, C.: On Uncertain Graphs. Morgan & Claypool, San Rafael (2018)CrossRefGoogle Scholar
  19. 19.
    Kimura, M., Saito, K.: Tractable models for information diffusion in social networks. In: Proceedings of the 10th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD 2006), LNAI 4213, pp. 259–271 (2006)Google Scholar
  20. 20.
    Kimura, M., Saito, K., Motoda, H.: Minimizing the spread of contamination by blocking links in a network. In: Proceedings of the 23rd AAAI Conference on Artificial Intelligence (AAAI-08), pp. 1175–1180 (2008)Google Scholar
  21. 21.
    Kimura, M., Saito, K., Motoda, H.: Blocking links to minimize contamination spread in a social network. ACM Trans. Knowl. Discov. Data 3, 9:1–9:23 (2009)CrossRefGoogle Scholar
  22. 22.
    Kimura, M., Saito, K., Motoda, H.: Efficient estimation of influence functions fot sis model on social networks. In: Proceedings of the 21st International Joint Conference on Artificial Intelligence (IJCAI-09) (2009)Google Scholar
  23. 23.
    Kimura, M., Saito, K., Nakano, R.: Extracting influential nodes for information diffusion on a social network. In: Proceedings of the 22nd AAAI Conference on Artificial Intelligence (AAAI-07), pp. 1371–1376 (2007)Google Scholar
  24. 24.
    Kimura, M., Saito, K., Nakano, R., Motoda, H.: Extracting influential nodes on a social network for information diffusion. Data Min. Knowl. Disc. 20, 70–97 (2010)MathSciNetCrossRefGoogle Scholar
  25. 25.
    Kimura, M., Saito, K., Ohara, K., Motoda, H.: Speeding-up node influence computation for huge social networks. Int. J. Data Sci. Anal. 1, 1–14 (2016)CrossRefGoogle Scholar
  26. 26.
    Kleinberg, J.: The convergence of social and technological networks. Commun. ACM 51(11), 66–72 (2008)CrossRefGoogle Scholar
  27. 27.
    Klimt, B., Yang, Y.: The enron corpus: a new dataset for email classification research. In: Proceedings of the 2004 European Conference on Machine Learning (ECML’04), pp. 217–226 (2004)Google Scholar
  28. 28.
    Lawrence, T., Hosein, P.: Stochastic dynamic programming heuristics for influence maximization-revenue optimization. Int. J. Data Sci. Anal. (first Online) (2018)Google Scholar
  29. 29.
    Leskovec, J., Krause, A., Guestrin, C., Faloutsos, C., VanBriesen, J., Glance, N.: Cost-effective outbreak detection in networks. In: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2007), pp. 420–429 (2007)Google Scholar
  30. 30.
    Loukides, G., Gwadera, R.: Preventing the diffusion of information to vulnerable users while preserving pagerank. Int. J. Data Sci. Anal. 5(1), 19–39 (2018)CrossRefGoogle Scholar
  31. 31.
    Milo, R., Shen-Orr, S., Itzkovitz, S., Kashtan, N., Chklovskii, D., Alon, U.: Network motifs: simple building blocks of complex networks. Science 298, 824–827 (2002)CrossRefGoogle Scholar
  32. 32.
    Newman, M.E.J.: The structure and function of complex networks. SIAM Rev. 45, 167–256 (2003)MathSciNetCrossRefzbMATHGoogle Scholar
  33. 33.
    Nguyen, H., Zheng, R.: Influence spread in large-scale social networks—a belief propagation approach. In: Proceedings of the 2012 European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2012), LNAI 7524, pp. 515–530 (2012)Google Scholar
  34. 34.
    Ohara, K., Saito, K., Kimura, M., Motoda, H.: Predictive simulation framework of stochastic diffusion model for identifying top-k influential nodes. In: Proceedings of the 5th Asian Conference on Machine Learning (ACML2013), PMLR, vol. 29, pp. 149–164 (2013)Google Scholar
  35. 35.
    Ohara, K., Saito, K., Kimura, M., Motoda, H.: Resampling-based framework for estimating node centrality of large social network. In: Proceedings of the 17th International Conference on Discovery Science (DS’14), LNAI 8777, pp. 228–239 (2014)Google Scholar
  36. 36.
    Ohara, K., Saito, K., Kimura, M., Motoda, H.: Resampling-based gap analysis for detecting nodes with high centrality on large social network. In: Proceedings of the Nineteenth Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD2015), pp. 135–147 (2015)Google Scholar
  37. 37.
    Ohara, K., Saito, K., Kimura, M., Motoda, H.: Critical node identification based on articulation point detection for network with uncertain connectivity. In: Proceedings of the Sixth International Symposium on Computing and Networking (CANDAR 2018), pp. 146–152 (2018)Google Scholar
  38. 38.
    Potamias, M., Bonchi, F., Gionis, A., Kollios, G.: K-nearest neighbors in uncertain graphs. Proc. VLDB Endow. 3, 997–1008 (2010)CrossRefGoogle Scholar
  39. 39.
    Rossetti, G., Milli, L., Salvator, R., Sirbu, A., Pedreschi, D., Giannotti, F.: Ndlib: a python library to model and analyze diffusion processes over complex networks. Int. J. Data Sci. Anal. 5(1), 61–79 (2018)CrossRefGoogle Scholar
  40. 40.
    Saito, K., Kimura, M., Motoda, H.: Discovering influential nodes for sis models in social networks. In: Proceedings of the Twelfth International Conference of Discovery Science (DS2009), LNAI 5808, pp. 302–316. Springer, Berlin (2009)Google Scholar
  41. 41.
    Saito, K., Kimura, M., Ohara, K., Motoda, H.: Learning continuous-time information diffusion model for social behavioral data analysis. In: Proceedings of the 1st Asian Conference on Machine Learning (ACML2009), LNAI 5828, pp. 322–337 (2009)Google Scholar
  42. 42.
    Saito, K., Kimura, M., Ohara, K., Motoda, H.: Selecting information diffusion models over social networks for behavioral analysis. In: Proceedings of the 2010 European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2010), LNAI 6323, pp. 180–195 (2010)Google Scholar
  43. 43.
    Saito, K., Kimura, M., ohara, K., Motoda, H.: Which targets to contact first to maximize influence over social network. In: Proceedings of the 6th International Conference on Social Computing, Behavioral-Cultural Modeling and Prediction (SBP2013), LNCS 7812, pp. 359–367 (2013)Google Scholar
  44. 44.
    Vázquez, A.: Growing network with local rules: preferential attachment, clustering hierarchy, and degree correlations. Phys. Rev. 67(5), 056,104 (2003)Google Scholar
  45. 45.
    Watts, D.J., Dodds, P.S.: Influence, networks, and public opinion formation. J. Consum. Res. 34, 441–458 (2007)CrossRefGoogle Scholar
  46. 46.
    Watts, D.J., Strogatz, S.H.: Collective dyanamics of “small-world” networks. Nature 393, 440–442 (1998)CrossRefzbMATHGoogle Scholar
  47. 47.
    Yang, J., Counts, S.: Predicting the speed, scale, and range of information diffusion in twitter. In: Proceedings of the Fourth International Conference on Weblogs and Social Media (ICWSM 2010) (2010)Google Scholar
  48. 48.
    Yang, J., Leskovec, J.: Modeling information diffusion in implicit networks. In: Proceedings of the 2010 IEEE International Conference on Data Mining (ICDM’10), pp. 599–608 (2010)Google Scholar
  49. 49.
    Yang, Y., Chen, E., Liu, Q., Xiang, B., Xu, T., Shad, S.: On approximation of real-world influence spread. In: Proceedings of the 2012 European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2012), LNAI 7524a, pp. 548–564 (2012)Google Scholar
  50. 50.
    Zhou, C., Zhang, P., Zang, W., Guo, L.: On the upper bounds of spread for greedy algorithms in social network influence maximization. IEEE Trans. Knowl. Data Eng. 27, 2770–2783 (2015)CrossRefGoogle Scholar

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. 1.Department of Integrated Information TechnologyAoyama Gakuin UniversityKanagawaJapan
  2. 2.Faculty of ScienceKanagawa UniversityKanagawaJapan
  3. 3.Center for Advanced Intelligence ProjectRIKENTokyoJapan
  4. 4.Department of Electronics and InformaticsRyukoku UniversityOtsuJapan
  5. 5.Institute of Scientific and Industrial ResearchOsaka UniversityOsakaJapan

Personalised recommendations