Skip to main content

Towards Neighborhood Window Analytics over Large-Scale Graphs

  • Conference paper
  • First Online:
Database Systems for Advanced Applications (DASFAA 2016)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 9643))

Included in the following conference series:

Abstract

Information networks are often modeled as graphs, where the vertices are associated with attributes. In this paper, we study neighborhood window analytics, namely k-hop window query, that aims to capture the properties of a local community involving the k-hop neighbors (defined on the graph structures) of each vertex. We develop a novel index, Dense Block Index (DBIndex), to facilitate efficient processing of k-hop window queries. Extensive experimental studies conducted over both real and synthetic datasets with hundreds of millions of vertices and edges show that our proposed solutions are four orders of magnitude faster in query performance than the non-index algorithm, and are superior over the state-of-the-art solution in terms of both scalability and efficiency.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

  1. 1.

    Available at http://snap.stanford.edu/data/index.html, which is used in [12].

  2. 2.

    Other variants of k-hop window for directed graphs are possible; e.g., a vertex u is in \(W_{kh}(v)\) iff there is a \(\alpha \)-hop directed path from u to v where \(\alpha \leqslant k\).

  3. 3.

    Note that a simpler variation of our optimization problem has been proven to be NP-hard [16].

  4. 4.

    Note that although we could have avoided deriving W(v) a second time if we had materialized all the derived windows the first time, our approach is designed to avoid the space complexity of materializing all the windows in memory at the cost of computing each W(v) twice. We present an optimization later in this section to avoid the recomputation cost on k-hop window query.

  5. 5.

    http://aws.amazon.com/ec2/pricing/.

  6. 6.

    http://snap.stanford.edu/snap/index.html.

  7. 7.

    As in [12], for each dataset, EAGR is run for 10 iterations in the index construction.

  8. 8.

    Degree means average degree of the graph. The generated graph is of Erdos-Renyi model .

References

  1. Briscoe, E.J., Appling, D.S., Mappus IV, R.L., Hayes, H.: Determining credibility from social network structure. In: ICASNAM 2013, pp. 1418–1424. ACM (2013)

    Google Scholar 

  2. Broder, A.Z., Glassman, S.C., Manasse, M.S., Zweig, G.: Syntactic clustering of the web. Comput. Netw. ISDN Syst. 29(8), 1157–1166 (1997)

    Article  Google Scholar 

  3. Campanario, J.M.: Empirical study of journal impact factors obtained using the classical two-year citation window versus a five-year citation window. Scientometrics 87(1), 189–204 (2011)

    Article  Google Scholar 

  4. Chen, C., Yan, X., Zhu, F., Han, J., Yu, P.S.: Graph OLAP: towards online analytical processing on graphs. In: ICDM 2008, pp. 103–112 (2008)

    Google Scholar 

  5. Cheng, J., Shang, Z., Cheng, H., Wang, H., Yu, J.X.: K-reach: who is in your small world. VLDB 5(11), 1292–1303 (2012)

    Google Scholar 

  6. Dai, L., Luo, J.-D., Fu, X., Li, Z.: Predicting offline behaviors from online features: an ego-centric dynamical network approach. In: HotSocial 2012, pp. 17–24 (2012)

    Google Scholar 

  7. Everett, M., Borgatti, S.P.: Ego network betweenness. Soc. Netw. 27(1), 31–38 (2005)

    Article  Google Scholar 

  8. Fan, Q., Wang, Z., Chan, C.Y., Tan, K.L.: Supporting window analytics over large-scale dynamic graphs, CORR (2015). arxiv:1510.07104

  9. Huebsch, R., Garofalakis, M., Hellerstein, J.M., Stoica, I.: Sharing aggregate computation for distributed queries. In: SIGMOD 2007, pp. 485–496 (2007)

    Google Scholar 

  10. Ma, H.H., Gustafson, S., Moitra, A., Bracewell, D.: Ego-centric network sampling in viral marketing applications. In: Ting, I.-H., Wu, H.-J., Ho, T.-H. (eds.) Mining and Analyzing Social Networks. SCI, vol. 288, pp. 35–51. Springer, Heidelberg (2010)

    Chapter  Google Scholar 

  11. Ma, N., Guan, J., Zhao, Y.: Bringing pagerank to the citation analysis. Inf. Process. Manage. 44(2), 800–810 (2008)

    Article  Google Scholar 

  12. Mondal, J., Deshpande, A.: Eagr: supporting continuous ego-centric aggregate queries over large dynamic graphs. In: SIGMOD 2014, pp. 1335–1346 (2014)

    Google Scholar 

  13. Moustafa, W.E., Deshpande, A., Getoor, L.: Ego-centric graph pattern census. In: ICDE 2012, pp. 234–245 (2012)

    Google Scholar 

  14. Navlakha, S., Rastogi, R., Shrivastava, N.: Graph summarization with bounded error. In: SIGMOD 2008, pp. 419–432 (2008)

    Google Scholar 

  15. Trigoni, N., Yao, Y., Demers, A., Gehrke, J., Rajaraman, R.: Multi-query optimization for sensor networks. In: Prasanna, V.K., Iyengar, S.S., Spirakis, P.G., Welsh, M. (eds.) DCOSS 2005. LNCS, vol. 3560, pp. 307–321. Springer, Heidelberg (2005)

    Chapter  Google Scholar 

  16. Vassilevska, V., Pinar, A.: Finding nonoverlapping dense blocks of a sparse matrix. Lawrence Berkeley National Laboratory, Livermore (2004)

    Google Scholar 

  17. Wang, Z., Fan, Q., Wang, H., Tan, K.-L., Agrawal, D., El Abbadi, A.: Pagrol: parallel graph OLAP over large-scale attributed graphs. In: ICDE 2014, pp. 496–507 (2014)

    Google Scholar 

  18. Yan, X., He, B., Zhu, F., Han, J.: Top-k aggregation queries over large networks. In: ICDE 2010, pp. 377–380 (2010)

    Google Scholar 

  19. Zhao, P., Li, X., Xin, D., Han, J.: Graph cube: on warehousing and OLAP multidimensional networks. In: SIGMOD 2011, pp. 853–864 (2011)

    Google Scholar 

Download references

Acknowledgment

Qi Fan is supported by NGS Scholarship. This work is supported by the MOE/NUS grant R-252-000-500-112 and AWS in Education Grant award.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Qi Fan .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Fan, Q., Wang, Z., Chan, CY., Tan, KL. (2016). Towards Neighborhood Window Analytics over Large-Scale Graphs. In: Navathe, S., Wu, W., Shekhar, S., Du, X., Wang, S., Xiong, H. (eds) Database Systems for Advanced Applications. DASFAA 2016. Lecture Notes in Computer Science(), vol 9643. Springer, Cham. https://doi.org/10.1007/978-3-319-32049-6_13

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-32049-6_13

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-32048-9

  • Online ISBN: 978-3-319-32049-6

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics