Abstract
At the same time, applying resilient distributed datasets (RDDs) in cloud computing provides a suitable environment for big data analysis. In addition, combining machine learning (ML) algorithms from the edge computing paradigm with the SFUP-SP algorithm may improve local computing capabilities and speed up data analysis and user decision-making.
Data Availability
All data are available from the authors upon request.
Funding
No funding was obtained for this study.
Author information
Contributions
Conceptualization: Jimmy Ming-Tai Wu and Huiying Zhou; Methodology: Jerry Chun-Wei Lin; Formal analysis: Gautam Srivastava and Mohamed Baza; Original Draft: Jimmy Ming-Tai Wu, Huiying Zhou, and Mohamed Baza; Review & Editing: Gautam Srivastava and Jerry Chun-Wei Lin.
Ethics declarations
Competing interests
The authors have no competing interests to declare.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Wu, J.MT., Zhou, H., Lin, J.CW. et al. Mining Skyline Patterns from Big Data Environments based on a Spark Framework. J Grid Computing 21, 22 (2023). https://doi.org/10.1007/s10723-023-09653-2
DOI: https://doi.org/10.1007/s10723-023-09653-2