Abstract
Graph sampling has recently become a research focus in graph-related fields. Because complex networks are scale-free, meaning that node degrees follow a power-law distribution, most existing graph sampling algorithms tend to over-sample either high-degree or low-degree nodes. As a result, the degrees of the sampled nodes differ significantly across the sample. In this paper, we propose the concept of an approximate degree distribution and use it to devise a stratified sampling strategy for complex networks. We also develop two graph sampling algorithms that combine a node selection method with this stratified strategy. Experimental results show that our sampling algorithms preserve several properties of different graphs and are more accurate than other algorithms. Further, we prove that the proposed algorithms are superior to off-the-shelf algorithms in terms of the unbiasedness of the degrees and more efficient than the state-of-the-art FFS and ES-i algorithms.
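To make the idea of degree-based stratification concrete, the following is a minimal sketch and not the algorithms proposed in the paper. It assumes an undirected graph stored as an adjacency dictionary, groups nodes into strata by floor(log2(degree)) as a crude stand-in for an approximate degree distribution, and draws the same fraction of nodes from each stratum. The function names (`degree_strata`, `stratified_node_sample`) and the logarithmic binning rule are illustrative assumptions.

```python
import random
from collections import defaultdict


def degree_strata(adj):
    """Group nodes by floor(log2(degree)); isolated nodes fall into stratum -1.

    The logarithmic binning is an assumption chosen because power-law
    degrees span several orders of magnitude.
    """
    strata = defaultdict(list)
    for node, neighbours in adj.items():
        degree = len(neighbours)
        # int.bit_length() - 1 equals floor(log2(degree)) for degree >= 1,
        # and gives -1 for degree 0.
        strata[degree.bit_length() - 1].append(node)
    return dict(strata)


def stratified_node_sample(adj, fraction, seed=0):
    """Sample roughly `fraction` of the nodes from every degree stratum.

    At least one node is taken from each non-empty stratum so that rare
    high-degree nodes are not lost.
    """
    rng = random.Random(seed)
    sample = []
    for _, nodes in sorted(degree_strata(adj).items()):
        k = max(1, round(fraction * len(nodes)))
        sample.extend(rng.sample(nodes, min(k, len(nodes))))
    return sample


if __name__ == "__main__":
    # Toy star graph: hub 0 has degree 8, leaves 1..8 have degree 1.
    adj = {0: list(range(1, 9))}
    for leaf in range(1, 9):
        adj[leaf] = [0]
    print(stratified_node_sample(adj, fraction=0.5))
```

On the toy star graph, the hub forms its own stratum and is always selected, while half of the leaves are drawn at random. A uniform node sample of the same size could easily miss the hub.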
Acknowledgements
This work was supported by the National Natural Science Foundation of China (Grant Nos. 61462012, 61562010, U1531246), the Guizhou University Graduate Innovation Fund (Grant No. 2017078), the Innovation Team of the Data Analysis and Cloud Service of Guizhou Province (Grant No. [2015]53), and the Science and Technology Project of the Department of Science and Technology in Guizhou Province (Grant No. LH [2016]7427).
Copyright information
© 2019 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Zhu, J., Li, H., Chen, M., Dai, Z., Zhu, M. (2019). Enhancing Stratified Graph Sampling Algorithms Based on Approximate Degree Distribution. In: Silhavy, R. (eds) Artificial Intelligence and Algorithms in Intelligent Systems. CSOC2018 2018. Advances in Intelligent Systems and Computing, vol 764. Springer, Cham. https://doi.org/10.1007/978-3-319-91189-2_20
DOI: https://doi.org/10.1007/978-3-319-91189-2_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-91188-5
Online ISBN: 978-3-319-91189-2
eBook Packages: Intelligent Technologies and Robotics (R0)