Fast algorithms for the calculation of Kendall’s τ
- Cite this article as:
- Christensen, D. Computational Statistics (2005) 20: 51. doi:10.1007/BF02736122
Traditional algorithms for the calculation of Kendall’s τ between two datasets of n samples have a calculation time of O(n²). This paper presents a suite of algorithms with expected calculation time of O(n log n) or better, using a combination of sorting and balanced tree data structures. The literature, e.g. Dwork et al. (2001), has alluded to the existence of O(n log n) algorithms without any analysis; this paper gives explicit descriptions of such algorithms for general use, covering the cases both with and without duplicate values in the data. Execution times for sample data are reduced from 3.8 hours to around 1–2 seconds for one million data pairs.