Abstract
A smart world can be considered as a convergence of the physical world, cyber world, social world, and thinking world. In these four worlds, huge amounts of valuable data are generated and gathered at a rapid rate from a broad range of data sources. Although the quality of these big data depends on their degrees of uncertainty, rich sets of valuable information and useful knowledge can be mined from them. This paper focuses on big data computing and mining, which aims to (a) analyze these rich sets of big data and (b) discover implicit, previously unknown, and potentially useful information and knowledge from them. In particular, we present data science solutions for discovering frequent patterns. Through our presentation, we discuss how these solutions interconnect (a) big data generated and collected from the physical world, (b) frequent pattern mining algorithms in the cyber world, (c) social interactions among individuals in the social world, and (d) user preferences and interests reflecting the users' cognitive thinking in the thinking world. We illustrate these interconnections through a discussion of mining coronavirus disease 2019 (COVID-19) data in a smart world environment. The interconnections link the physical, cyber, social, and thinking worlds together to establish a better environment for big data computing and mining in a smart world.
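As a rough illustration of what discovering frequent patterns involves, the following minimal sketch counts itemsets level by level in the style of the classic Apriori approach. It is not the specific solution presented in the chapter. The function name `frequent_itemsets`, the parameter `minsup`, and the toy records are all hypothetical and made up for illustration only; they are not real COVID-19 data.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, minsup):
    """Return {itemset: support count} for itemsets occurring in >= minsup transactions."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1: count how many transactions contain each single item.
    counts = Counter(frozenset([item]) for t in transactions for item in t)
    current = {s: n for s, n in counts.items() if n >= minsup}
    result = dict(current)
    k = 2
    while current:
        # Join step: merge pairs of frequent (k-1)-itemsets into k-itemset candidates.
        candidates = {a | b for a, b in combinations(current, 2) if len(a | b) == k}
        # Prune step: keep a candidate only if all of its (k-1)-subsets are frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        # Count step: scan the transactions once for the surviving candidates.
        counts = Counter(c for t in transactions for c in candidates if c <= t)
        current = {s: n for s, n in counts.items() if n >= minsup}
        result.update(current)
        k += 1
    return result


# Hypothetical toy records (made-up values, one set of observed items per record).
records = [
    {"fever", "cough"},
    {"fever", "cough", "fatigue"},
    {"cough", "fatigue"},
    {"fever", "cough"},
]
patterns = frequent_itemsets(records, minsup=2)
for itemset, support in sorted(patterns.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), support)
```

With a minimum support of 2, the pair {cough, fever} is reported as frequent because it appears in three of the four toy records, whereas {fatigue, fever} appears only once and is discarded. The level-wise search is chosen here only because it is short to state; tree-based or vertical mining approaches are common alternatives on larger data.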
Acknowledgements
This project is partially supported by (i) Natural Sciences and Engineering Research Council of Canada (NSERC) as well as (ii) University of Manitoba.
Copyright information
© 2021 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Leung, C.K. (2021). Big Data Computing and Mining in a Smart World. In: Lee, W., Leung, C.K., Nasridinov, A. (eds) Big Data Analyses, Services, and Smart Data. BIGDAS 2018. Advances in Intelligent Systems and Computing, vol 899. Springer, Singapore. https://doi.org/10.1007/978-981-15-8731-3_2
DOI: https://doi.org/10.1007/978-981-15-8731-3_2
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-8730-6
Online ISBN: 978-981-15-8731-3
eBook Packages: Intelligent Technologies and Robotics (R0)