
A MapReduce approach for spatial co-location pattern mining via ordered-clique-growth

Distributed and Parallel Databases

Abstract

A spatial co-location pattern is a subset of spatial features whose instances are frequently located together in geographic space. Mining co-location patterns is particularly valuable for discovering spatial dependencies. Traditional co-location pattern mining algorithms become computationally expensive as data volume grows rapidly. In this paper, we explore a novel iterative framework based on parallel ordered-clique-growth for co-location pattern mining. The ordered clique extension can reuse previously processed information and be executed in parallel, which speeds up the identification of co-location instances. Based on this iterative framework, we design a MapReduce algorithm, PCPM_OC, that searches for prevalent co-location patterns in a level-wise manner. To narrow the search space of ordered cliques, two pruning techniques are proposed to filter out as many invalid clique instances as possible. We prove the completeness and correctness of PCPM_OC and discuss its complexity. Moreover, we compare PCPM_OC with two advanced MapReduce-based co-location pattern mining algorithms from multiple perspectives. Finally, extensive experiments are conducted on synthetic and real-world spatial datasets to study the performance of PCPM_OC. The experimental results demonstrate that PCPM_OC achieves a significant improvement in efficiency and shows better scalability on massive spatial data.
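To make the level-wise idea concrete, the following is a minimal single-machine sketch of growing ordered cliques and scoring the resulting patterns. It is not PCPM_OC itself: the abstract does not specify the prevalence measure, the clique ordering, or the two pruning techniques, so the sketch assumes the standard participation index from the co-location literature, orders instances by feature name, and prunes only by discarding cliques of non-prevalent patterns. The names `POINTS`, `neighbors_in_order`, `grow`, `participation_index`, and `mine`, as well as the toy coordinates, are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations
from math import dist

# Hypothetical toy data: (feature, instance id, x, y).
POINTS = [
    ("A", 1, 0.0, 0.0), ("A", 2, 5.0, 5.0),
    ("B", 1, 0.5, 0.0), ("B", 2, 5.0, 5.5),
    ("C", 1, 0.0, 0.5), ("C", 2, 9.0, 9.0),
]

def neighbors_in_order(points, d):
    """Map each instance to its neighbors whose feature sorts strictly later."""
    nbrs = defaultdict(set)
    for p, q in combinations(points, 2):
        if p[0] != q[0] and dist(p[2:], q[2:]) <= d:
            lo, hi = sorted((p, q))  # order by feature name
            nbrs[lo[:2]].add(hi[:2])
    return nbrs

def grow(cliques, nbrs):
    """Extend every ordered clique by one instance adjacent to all its members.

    Each clique is processed independently, so in a MapReduce setting this
    loop body could plausibly serve as the map step (an assumption).
    """
    grown = []
    for clique in cliques:
        candidates = set.intersection(*(nbrs[m] for m in clique))
        grown.extend(clique + (c,) for c in sorted(candidates))
    return grown

def participation_index(rows, totals):
    """Minimum, over features, of the fraction of instances that participate."""
    seen = defaultdict(set)
    for row in rows:
        for feature, iid in row:
            seen[feature].add(iid)
    return min(len(ids) / totals[f] for f, ids in seen.items())

def mine(points, d, min_prev):
    """Level-wise search: grow cliques, group by pattern, keep prevalent ones."""
    totals = Counter(p[0] for p in points)
    nbrs = neighbors_in_order(points, d)
    level = [(p[:2],) for p in points]  # size-1 cliques
    prevalent = {}
    while level:
        by_pattern = defaultdict(list)
        for clique in grow(level, nbrs):
            by_pattern[tuple(f for f, _ in clique)].append(clique)
        level = []
        for pattern, rows in by_pattern.items():
            pi = participation_index(rows, totals)
            if pi >= min_prev:
                prevalent[pattern] = pi
                level.extend(rows)  # only these cliques are grown further
    return prevalent

if __name__ == "__main__":
    print(mine(POINTS, d=1.0, min_prev=0.6))
```

Because each size-k clique is extended only with instances of later features that neighbor all of its members, every size-(k+1) clique is generated exactly once from its size-k prefix, which is one way the earlier level's work can be reused. Discarding cliques of non-prevalent patterns is safe here because the participation index never increases when a feature is added to a pattern.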





Acknowledgements

This work is supported by the National Natural Science Foundation of China (61472346, 61662086, 61762090), and the Project of Innovative Research Team of Yunnan Province (2018HC019).

Author information

Corresponding author

Correspondence to Lizhen Wang.



About this article


Cite this article

Yang, P., Wang, L. & Wang, X. A MapReduce approach for spatial co-location pattern mining via ordered-clique-growth. Distrib Parallel Databases 38, 531–560 (2020). https://doi.org/10.1007/s10619-019-07278-7


  • DOI: https://doi.org/10.1007/s10619-019-07278-7
