Fast and stable multivariate kernel density estimation by fast sum updating

Dr Nicolas Langrené (1), Mr Xavier Warin (2)

(1) CSIRO, Docklands, Australia

(2) EDF, Paris, France


Kernel density estimation and kernel regression are powerful but computationally expensive techniques: a direct evaluation of kernel density estimates at M evaluation points given N input sample points requires O(MN) operations, a quadratic cost that is prohibitive for large-scale problems. For this reason, approximate methods such as binning with the Fast Fourier Transform or the Fast Gauss Transform have been proposed to speed up kernel density estimation. Among these fast methods, the Fast Sum Updating approach is an attractive alternative, as it is an exact method and its speed is independent of the input sample and the bandwidth. Unfortunately, this method, based on data sorting, has for the most part been limited to the univariate case. In this talk, we revisit the fast sum updating approach and extend it in several ways. Our main contribution is to extend it to the general multivariate case, for general input data and a rectilinear evaluation grid. Other contributions include its extension to a wider class of kernels, including the triangular, cosine and Silverman kernels, and its combination with a fast approximate k-nearest-neighbours bandwidth for multivariate datasets. Our multivariate regression and density estimation tests confirm the speed, accuracy and stability of the method. We hope this work will renew interest in the fast sum updating approach and help solve large-scale practical density estimation and regression problems.
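
To illustrate the idea in the univariate case, here is a minimal Python sketch of sliding-window sum updating for the Epanechnikov kernel K(u) = 0.75 (1 - u^2) on |u| <= 1. It is not the authors' implementation: the function names, the choice of kernel and the sorted evaluation points are assumptions made for illustration. Expanding (z - x_i)^2 shows that the kernel sum at z depends only on three running sums over the sample points in [z - h, z + h], namely S0 (count), S1 (sum of x_i) and S2 (sum of x_i^2), which can be updated as the window moves instead of being recomputed.

```python
import numpy as np


def kde_epanechnikov_direct(x, z, h):
    """Reference O(M*N) evaluation of the Epanechnikov density estimate."""
    u = (z[:, None] - x[None, :]) / h
    k = 0.75 * np.clip(1.0 - u**2, 0.0, None)  # zero outside |u| <= 1
    return k.sum(axis=1) / (len(x) * h)


def kde_epanechnikov_fsu(x, z, h):
    """Sliding-window sum updating (illustrative sketch).

    Assumes z is sorted in increasing order. After sorting x, each
    sample point enters and leaves the window at most once.
    """
    xs = np.sort(x)
    n = len(xs)
    out = np.empty(len(z))
    s0 = s1 = s2 = 0.0  # running sums of 1, x, x^2 over the window
    lo = hi = 0         # the window is xs[lo:hi]
    for j, zj in enumerate(z):
        # Add the points that enter the window on the right.
        while hi < n and xs[hi] <= zj + h:
            s0 += 1.0
            s1 += xs[hi]
            s2 += xs[hi] ** 2
            hi += 1
        # Remove the points that leave the window on the left.
        while lo < hi and xs[lo] < zj - h:
            s0 -= 1.0
            s1 -= xs[lo]
            s2 -= xs[lo] ** 2
            lo += 1
        # sum_i (1 - (zj - x_i)^2 / h^2) written with S0, S1, S2.
        quad = zj * zj * s0 - 2.0 * zj * s1 + s2
        out[j] = 0.75 * (s0 - quad / h**2) / (n * h)
    return out


rng = np.random.default_rng(0)
x = rng.normal(size=2000)
z = np.linspace(-3.0, 3.0, 200)
fast = kde_epanechnikov_fsu(x, z, 0.3)
slow = kde_epanechnikov_direct(x, z, 0.3)
print(np.max(np.abs(fast - slow)))
```

In this sketch the loop over evaluation points costs O(M + N) once the sample is sorted, whatever the bandwidth. Repeatedly adding and subtracting terms in floating point can accumulate rounding error, so a production version would need more care about numerical stability than this naive update provides.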


Nicolas Langrené is a Research Scientist at Data61, CSIRO. His main research interest is in mathematical and numerical methods, focusing on applied probability and stochastic optimization, with applications in computational finance, computational statistics and energy markets. He completed his MEng in Applied Mathematics and Computer Science at Grenoble INP – ENSIMAG and the University of Grenoble Alpes in 2009, then his MSc in Statistics, Probability and Mathematical Finance and his PhD in Stochastic Control at the University of Paris Diderot, Sorbonne Paris Cité, in 2010 and 2014 respectively. Before joining CSIRO, he also worked with the investment bank Natixis and the power company EDF.


