Journal of Combinatorial Optimization, Volume 28, Issue 3, pp 588–599

Improvement of path analysis algorithm in social networks based on HBase

  • Yan Qiang
  • Bo Pei
  • Weili Wu
  • Juanjuan Zhao
  • Xiaolong Zhang
  • Yue Li
  • Lidong Wu

Abstract

When a social network reaches hundreds of millions of users, analyzing the data in social network services becomes very important. Understanding how nodes interconnect in large graphs is an essential problem in many fields. To find the connecting nodes between two nodes or two groups of source nodes in huge graphs, we propose a parallelized data-mining algorithm that finds the shortest path between nodes in a social network stored in the HBase distributed key/value store. The algorithm computes shortest paths among different nodes in the network in a parallel environment. We first analyze the social network model with this algorithm, and then optimize the output from the cloud platform using the intermediary degrees and degree centrality algorithms. Finally, we validate the efficiency of the proposed algorithm on a simulated social network. The experimental results indicate that our algorithm improves the efficiency of parallel breadth-first search (BFS).
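
As a rough illustration of the ideas named in the abstract, the following is a minimal single-machine sketch of frontier-by-frontier BFS for shortest paths, plus degree centrality. It is not the authors' HBase implementation: the in-memory dictionary `example` is a hypothetical stand-in for a key/value table (row key = node id, value = neighbor list), and the function names are assumptions made for this example. In a parallel setting, each frontier would be expanded by many workers at once; here the expansion is sequential.

```python
from typing import Dict, List, Optional

# Hypothetical stand-in for a key/value table:
# row key = node id, value = list of neighbor ids.
Graph = Dict[str, List[str]]


def bfs_shortest_path(graph: Graph, source: str, target: str) -> Optional[List[str]]:
    """Return one shortest path (fewest hops) from source to target, or None."""
    if source == target:
        return [source]
    parent: Dict[str, Optional[str]] = {source: None}
    frontier = [source]
    while frontier:
        next_frontier: List[str] = []
        # Expand the whole frontier before moving one hop further out.
        for node in frontier:
            for nbr in graph.get(node, []):
                if nbr in parent:
                    continue  # already reached by a path that is no longer
                parent[nbr] = node
                if nbr == target:
                    return _rebuild_path(parent, target)
                next_frontier.append(nbr)
        frontier = next_frontier
    return None


def _rebuild_path(parent: Dict[str, Optional[str]], node: str) -> List[str]:
    """Follow parent pointers back to the source and reverse the result."""
    path: List[str] = []
    current: Optional[str] = node
    while current is not None:
        path.append(current)
        current = parent[current]
    path.reverse()
    return path


def degree_centrality(graph: Graph) -> Dict[str, float]:
    """Degree of each node divided by the maximum possible degree (n - 1)."""
    n = len(graph)
    if n <= 1:
        return {v: 0.0 for v in graph}
    return {v: len(set(nbrs)) / (n - 1) for v, nbrs in graph.items()}


# Small undirected example graph (illustrative data only).
example: Graph = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}

path = bfs_shortest_path(example, "a", "e")
centrality = degree_centrality(example)
```

In this toy graph, node `d` lies on every path from `a` to `e` and also has the highest degree centrality, which is the kind of connecting node the abstract refers to.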

Keywords

Social networks, HBase, Parallel BFS, The K-shortest paths, Intermediary degrees

Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  • Yan Qiang (1)
  • Bo Pei (1)
  • Weili Wu (1, 2)
  • Juanjuan Zhao (1)
  • Xiaolong Zhang (1, 3)
  • Yue Li (1)
  • Lidong Wu (2)

  1. College of Computer Science and Technology, Taiyuan University of Technology, Taiyuan, China
  2. Department of Computer Science, University of Texas at Dallas, Richardson, USA
  3. College of Information Sciences and Technology, Pennsylvania State University, State College, USA
