# Detecting global hyperparaboloid correlated clusters: a Hough-transform based multicore algorithm

## Abstract

Correlation clustering detects complex and intricate relationships in high-dimensional data by identifying groups of data points, each characterized by a different correlation among a (sub)set of features. Current correlation clustering methods generally limit themselves to linear correlations. In this paper, we introduce a method for detecting global non-linear correlated clusters, focusing on quadratic relations. We introduce a novel Hough transform for the detection of hyperparaboloids and apply it to the detection of hyperparaboloid correlated clusters in arbitrary high-dimensional data spaces. We further provide a solution for utilizing all available CPU cores on a system: we split the Hough space along a predefined axis into a number of equi-sized partitions. We show that even this simplest form of parallelization improves the runtime significantly. Non-linear correlation clustering methods such as ours can reveal valuable insights that current linear methods miss. Our empirical results on synthetic and real-world data show that the proposed method is robust against noise, jitter and irregular densities.
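The partitioning idea can be illustrated with a minimal sketch. It is not the implementation from the paper: the abstract does not state the parametrization, so the sketch assumes a simplified two-dimensional parabola family y = a·x² + c, and the names `vote_partition` and `hough_parabola` are hypothetical. A thread pool stands in for the multicore execution; a process pool could be substituted to occupy separate cores.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def vote_partition(points, a_values, c_edges):
    """Accumulate votes for one slice of the a-axis of the (a, c) Hough space."""
    acc = np.zeros((len(a_values), len(c_edges) - 1), dtype=np.int64)
    x, y = points[:, 0], points[:, 1]
    for i, a in enumerate(a_values):
        # Each point (x, y) votes for the offset c that places it on y = a*x^2 + c.
        c = y - a * x**2
        acc[i], _ = np.histogram(c, bins=c_edges)
    return acc


def hough_parabola(points, a_grid, c_edges, n_partitions=4):
    """Split the a-axis into near-equal partitions and vote on each concurrently."""
    chunks = np.array_split(a_grid, n_partitions)
    with ThreadPoolExecutor(max_workers=n_partitions) as pool:
        parts = list(pool.map(lambda ch: vote_partition(points, ch, c_edges), chunks))
    # Partitions are disjoint slices of the accumulator, so stacking restores it.
    return np.vstack(parts)


# Synthetic data (assumed for illustration): one parabolic cluster plus uniform noise.
rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, 200)
cluster = np.column_stack([x, 1.5 * x**2 + 0.6])
noise = rng.uniform(-2, 2, (50, 2))
points = np.vstack([cluster, noise])

a_grid = np.linspace(-3, 3, 25)    # candidate curvatures
c_edges = np.linspace(-4, 4, 33)   # bin edges for the offset c
acc = hough_parabola(points, a_grid, c_edges)
i, j = np.unravel_index(acc.argmax(), acc.shape)
print(a_grid[i], c_edges[j], c_edges[j + 1])
```

Because each partition owns a disjoint slice of the accumulator, the workers need no synchronization, and the result does not depend on the number of partitions.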

## Keywords

Data mining · Non-linear correlation clustering · Hough transform · Multicore
