MapReduce-Based Complex Big Data Analytics over Uncertain and Imprecise Social Networks

  • Conference paper
  • Big Data Analytics and Knowledge Discovery (DaWaK 2017)

Abstract

In the current era of big data, advances in technology make it easy to collect and generate high volumes of valuable but complex data from various sources. A prime source of these complex big data is the social network, in which users are often linked by interdependencies such as friendships and follower-followee relationships. These interdependencies can be uncertain and imprecise. Moreover, as the social network keeps growing, individual users or businesses may want to find popular (i.e., frequently followed) groups of users so that they can follow the same groups. In this paper, we present a complex big data analytic solution that uses the MapReduce model to mine uncertain and imprecise social networks for groups of potentially popular users. Evaluation results show the efficiency and practicality of our solution in conducting complex big data analytics over uncertain and imprecise social networks.
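
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that each "follows" edge carries an existence probability, that edges are independent, and that a group counts as potentially popular when its expected number of followers reaches a threshold. The names `network`, `map_phase`, `reduce_phase`, `popular_groups`, and `min_expected_support` are all hypothetical, and the map and reduce steps run in a single process here rather than on a cluster.

```python
from collections import defaultdict
from itertools import combinations
from math import prod

# Hypothetical input (not data from the paper): each follower maps the users
# they may follow to the probability that the "follows" edge really exists.
network = {
    "alice": {"x": 0.9, "y": 0.8},
    "bob": {"x": 0.5, "y": 0.4, "z": 1.0},
    "carol": {"y": 0.6, "z": 0.7},
}


def map_phase(follower, followees, max_size=2):
    """Emit (group, probability that this follower follows the whole group)."""
    names = sorted(followees)
    for size in range(1, max_size + 1):
        for group in combinations(names, size):
            # Assumption: edges are independent, so their probabilities multiply.
            yield group, prod(followees[user] for user in group)


def reduce_phase(pairs):
    """Sum the per-follower probabilities into an expected support per group."""
    totals = defaultdict(float)
    for group, probability in pairs:
        totals[group] += probability
    return dict(totals)


def popular_groups(network, min_expected_support, max_size=2):
    """Return the groups whose expected support meets the threshold."""
    pairs = (
        pair
        for follower, followees in network.items()
        for pair in map_phase(follower, followees, max_size)
    )
    support = reduce_phase(pairs)
    return {g: s for g, s in support.items() if s >= min_expected_support}


if __name__ == "__main__":
    for group, support in sorted(popular_groups(network, 0.9).items()):
        print(group, round(support, 2))
```

Enumerating every group up to `max_size` keeps the sketch short but grows combinatorially with the number of followees, so a practical solution would need to prune candidate groups.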

Notes

  1. http://snap.stanford.edu/data/

  2. http://aws.amazon.com/ec2/

Acknowledgement

This project is partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) and the University of Manitoba.

Author information

Corresponding author

Correspondence to Carson Kai-Sang Leung.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Braun, P., Cuzzocrea, A., Jiang, F., Leung, C.K.-S., Pazdor, A.G.M. (2017). MapReduce-Based Complex Big Data Analytics over Uncertain and Imprecise Social Networks. In: Bellatreche, L., Chakravarthy, S. (eds) Big Data Analytics and Knowledge Discovery. DaWaK 2017. Lecture Notes in Computer Science, vol 10440. Springer, Cham. https://doi.org/10.1007/978-3-319-64283-3_10

  • DOI: https://doi.org/10.1007/978-3-319-64283-3_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-64282-6

  • Online ISBN: 978-3-319-64283-3

  • eBook Packages: Computer Science (R0)
