Top-K structural diversity search in large networks

  • Regular Paper
  • The VLDB Journal

Abstract

Social contagion is the process by which information (e.g., fads, opinions, news) diffuses through online social networks. A recent study reports that, in a social contagion process, the probability of contagion is tightly controlled by the number of connected components in an individual’s neighborhood. This number is termed the structural diversity of an individual, and it is shown to be a key predictor in the social contagion process. Based on this, a fundamental problem in a social network is to find the top-\(k\) users with the highest structural diversities. In this paper, we study, for the first time, the top-\(k\) structural diversity search problem in a large network. Specifically, we study two types of structural diversity measures: the component-based structural diversity measure and the core-based structural diversity measure. For component-based structural diversity, we develop an effective upper bound on structural diversity for pruning the search space; this bound can be incrementally refined during the search. Based on this upper bound, we propose an efficient framework for top-\(k\) structural diversity search. To further speed up structural diversity evaluation during the search, we propose several carefully devised search strategies. We also design efficient techniques to handle frequent updates in dynamic networks and maintain the top-\(k\) results. We further show how the techniques proposed for the component-based structural diversity measure can be extended to handle the core-based structural diversity measure. Extensive experimental studies are conducted on large real-world networks and synthetic graphs, and the results demonstrate the efficiency and effectiveness of the proposed methods.
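
To make the component-based measure concrete, the following is a minimal Python sketch, not the authors' implementation. It counts the connected components in the subgraph induced by a vertex's neighbors, then runs a simple top-\(k\) search with bound-based pruning. The function names (`build_graph`, `structural_diversity`, `top_k_structural_diversity`) are hypothetical, and the vertex degree is used as a stand-in upper bound (a neighborhood of d vertices has at most d components); the abstract does not specify the paper's own, incrementally refined bound, so none is reproduced here.

```python
import heapq
from collections import defaultdict


def build_graph(edges):
    """Build an undirected adjacency map {vertex: set(neighbors)}."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)
    return adj


def structural_diversity(adj, v):
    """Count connected components in the subgraph induced by v's neighbors."""
    neighbors = adj[v]
    seen = set()
    components = 0
    for start in neighbors:
        if start in seen:
            continue
        components += 1
        seen.add(start)
        stack = [start]
        while stack:
            u = stack.pop()
            # Only follow edges that stay inside the neighborhood of v.
            for w in adj[u] & neighbors:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
    return components


def top_k_structural_diversity(adj, k):
    """Return up to k (vertex, score) pairs, highest score first.

    Assumption: deg(v) serves as the upper bound on the score of v.
    """
    if k <= 0:
        return []
    # Visit vertices in decreasing order of their upper bound.
    order = sorted(adj, key=lambda v: len(adj[v]), reverse=True)
    heap = []  # min-heap of (score, tie-break, vertex)
    for i, v in enumerate(order):
        bound = len(adj[v])
        if len(heap) == k and bound <= heap[0][0]:
            break  # no remaining vertex can beat the current k-th score
        score = structural_diversity(adj, v)
        if len(heap) < k:
            heapq.heappush(heap, (score, -i, v))
        elif score > heap[0][0]:
            heapq.heapreplace(heap, (score, -i, v))
    return [(v, s) for s, _, v in sorted(heap, reverse=True)]


# Vertex "a" has neighbors b, c, d, e; only b and c are connected,
# so its neighborhood splits into {b, c}, {d}, {e}.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("a", "e"), ("b", "c")]
adj = build_graph(edges)
print(top_k_structural_diversity(adj, 1))  # [('a', 3)]
```

Because vertices are visited in decreasing order of the bound, the loop can stop as soon as the bound of the next vertex no longer exceeds the current \(k\)-th best score. A tighter bound would allow earlier termination, which is the role the abstract assigns to its refined upper bound.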

Notes

  1. https://www.kddcup2012.org.

  2. http://dblp.uni-trier.de/xml/.

  3. http://www.netcom-analyzer.org/datasets/166.

Acknowledgments

This work is supported by the Hong Kong Research Grants Council (RGC) General Research Fund (GRF) Project Nos. CUHK 411211, 418512, and 14209314; the Chinese University of Hong Kong Direct Grant Nos. 4055015 and 4055048; NSFC Grant No. 61402292; and Natural Science Foundation of SZU Grant No. 201438. Lu Qin is supported by ARC DE140100999.

Author information

Corresponding author

Correspondence to Hong Cheng.

About this article

Cite this article

Huang, X., Cheng, H., Li, RH. et al. Top-K structural diversity search in large networks. The VLDB Journal 24, 319–343 (2015). https://doi.org/10.1007/s00778-015-0379-0

  • DOI: https://doi.org/10.1007/s00778-015-0379-0
