Efficient On-Line Nonparametric Kernel Density Estimation
Nonparametric density estimation has broad applications in computational finance, especially where high-frequency data are available. However, the technique is often intractable because of the run time needed to evaluate a density. We present a new and efficient algorithm based on multipole techniques. Given the n kernels that estimate the density, current methods sum the kernels directly and take O(n) time to perform a single density query. In an on-line algorithm, where points are continually added to the density, the cumulative O(n^2) running time for n queries makes it very costly, if not impractical, to compute the density for large n. Our new Multipole-accelerated On-line Density Estimation (MODE) algorithm is general in that it can be applied to any kernel (in arbitrary dimensions) that admits a Taylor series expansion. The running time for a density query reduces to O(log n) or even constant time, depending on the kernel chosen; hence, the cumulative running time is reduced to O(n log n) or O(n), respectively. Our results show that the MODE algorithm provides dramatic advantages over the direct approach to density evaluation. For example, we show, using a modest computing platform, that on-line density updates and queries for 1 million points in two dimensions take 8 days to compute using the direct approach versus 40 seconds with the MODE approach.
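To illustrate the contrast between direct summation and a Taylor-series approach, the following is a minimal one-dimensional sketch and not the MODE algorithm itself. It assumes a Gaussian kernel with bandwidth h and a single expansion center, and it is accurate only while the data lie within a fraction of a bandwidth of that center; a multipole method would typically organize many such expansions hierarchically. The class names DirectKDE and MomentKDE, and the parameters center and order, are hypothetical names chosen for this example. The sketch relies on the identity exp(-(u - t)^2 / 2) = exp(-u^2 / 2) * sum_k He_k(u) t^k / k!, where He_k are the probabilists' Hermite polynomials, so that the sums of t_i^k / k! over the data can be maintained on-line and a query costs O(order) regardless of n.

```python
import math


class DirectKDE:
    """Baseline: every query sums all n Gaussian kernels, O(n) per query."""

    def __init__(self, h):
        self.h = h
        self.points = []

    def add(self, x):
        self.points.append(x)

    def query(self, x):
        n = len(self.points)
        if n == 0:
            return 0.0
        norm = 1.0 / (n * self.h * math.sqrt(2.0 * math.pi))
        return norm * sum(
            math.exp(-0.5 * ((x - xi) / self.h) ** 2) for xi in self.points
        )


class MomentKDE:
    """Single-center Taylor (Hermite) expansion of the Gaussian kernel.

    Assumption: all data lie close to `center` relative to `h`, so the
    series truncated at `order` is accurate. Update and query cost
    O(order), independent of the number of points.
    """

    def __init__(self, h, center, order=12):
        self.h = h
        self.c = center
        self.p = order
        self.n = 0
        # moments[k] accumulates sum_i t_i**k / k!, with t_i = (x_i - c) / h
        self.moments = [0.0] * (order + 1)

    def add(self, x):
        t = (x - self.c) / self.h
        term = 1.0  # t**0 / 0!
        for k in range(self.p + 1):
            self.moments[k] += term
            term *= t / (k + 1)  # advance to t**(k+1) / (k+1)!
        self.n += 1

    def query(self, x):
        if self.n == 0:
            return 0.0
        u = (x - self.c) / self.h
        he_prev, he = 0.0, 1.0  # He_{-1} (unused placeholder) and He_0
        total = 0.0
        for k in range(self.p + 1):
            total += he * self.moments[k]
            # Recurrence: He_{k+1}(u) = u * He_k(u) - k * He_{k-1}(u)
            he_prev, he = he, u * he - k * he_prev
        norm = math.exp(-0.5 * u * u) / (self.n * self.h * math.sqrt(2.0 * math.pi))
        return norm * total
```

In an on-line setting one would call add(x) as each observation arrives and query(x) whenever a density value is needed. With DirectKDE the cost of a query grows with the number of points seen so far, which gives the quadratic cumulative cost described above; with MomentKDE the cost stays fixed by the truncation order, at the price of the locality assumption stated here.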