Fast algorithms for the calculation of Kendall’s τ
Traditional algorithms for the calculation of Kendall’s τ between two datasets of n samples have a calculation time of O(n²). This paper presents a suite of algorithms with expected calculation time of O(n log n) or better, using a combination of sorting and balanced tree data structures. The literature, e.g. Dwork et al. (2001), has alluded to the existence of O(n log n) algorithms without any analysis; this paper gives explicit descriptions of such algorithms for general use, both with and without duplicate values in the data. Execution times for sample data are reduced from 3.8 hours to around 1–2 seconds for one million data pairs.
Keywords: Kendall’s Tau, Algorithm, O(n log n)
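To illustrate how sorting can bring the cost down from O(n²) to O(n log n), here is a minimal sketch for the case without duplicate values. It is not taken from the paper: it swaps in a merge-sort inversion count in place of the balanced tree structure mentioned in the abstract, and the function names `count_inversions` and `kendall_tau_no_ties` are hypothetical. The idea is to sort the pairs by x and then count how many pairs of y values are out of order; each such inversion is one discordant pair.

```python
def count_inversions(seq):
    """Return (sorted copy of seq, number of pairs i < j with seq[i] > seq[j])."""
    n = len(seq)
    if n <= 1:
        return list(seq), 0
    mid = n // 2
    left, inv_left = count_inversions(seq[:mid])
    right, inv_right = count_inversions(seq[mid:])
    merged = []
    inversions = inv_left + inv_right
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            # right[j] is smaller than every remaining element of left,
            # so it forms an inversion with each of them.
            merged.append(right[j])
            j += 1
            inversions += len(left) - i
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, inversions


def kendall_tau_no_ties(x, y):
    """Kendall's tau for paired samples, ASSUMING no duplicate values in x or y."""
    n = len(x)
    if n != len(y):
        raise ValueError("x and y must have the same length")
    if n < 2:
        raise ValueError("at least two data pairs are required")
    # Sort the pairs by x, then read off y in that order: O(n log n).
    order = sorted(range(n), key=lambda k: x[k])
    y_by_x = [y[k] for k in order]
    # Inversions in y_by_x are exactly the discordant pairs: O(n log n).
    _, discordant = count_inversions(y_by_x)
    total_pairs = n * (n - 1) // 2
    concordant = total_pairs - discordant
    return (concordant - discordant) / total_pairs
```

For example, `kendall_tau_no_ties([1, 2, 3, 4, 5], [3, 1, 2, 5, 4])` finds 3 discordant pairs out of 10 and returns 0.4. Data with duplicate values needs additional tie corrections that this sketch does not attempt.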
- Adel’son-Vel’skii, G. M. and Landis, E. M. (1962), An Algorithm for the Organization of Information. Soviet Mathematics Doklady, 3, 1259–1262.
- Dwork, C., Kumar, R., Naor, M. and Sivakumar, D. (2001), Rank Aggregation Revisited. Proc. 10th International World Wide Web Conference, 613–622.
- Knuth, D. E. (1998), The Art of Computer Programming, Volume 3: Sorting and Searching, Addison-Wesley, 2nd edition.
- Lindskog, F., McNeil, A. and Schmock, U. (2001), Kendall’s τ for Elliptical Distributions. Working paper from http://www.math.ethz.ch/~mcneil/pub_list.html
- Press, W. H., Flannery, B. P., Teukolsky, S. A. and Vetterling, W. T. (1993), Numerical Recipes, Cambridge University Press.