# Very fast optimal bandwidth selection for univariate kernel density estimation

dc.contributor.author Raykar, Vikas Chandrakant
dc.contributor.author Duraiswami, Ramani
dc.date.accessioned 2006-01-03T15:15:40Z
dc.date.available 2006-01-03T15:15:40Z
dc.date.issued 2006-01-03T15:15:40Z
dc.identifier.uri http://hdl.handle.net/1903/3028
dc.description.abstract Most automatic bandwidth selection procedures for kernel density estimates require estimation of quantities involving the density derivatives. Estimation of modes and inflexion points of densities also requires derivative estimates. The computational complexity of evaluating the density derivative at M evaluation points given N sample points from the density is O(MN). In this paper we propose a computationally efficient $\epsilon$-exact approximation algorithm for univariate, Gaussian kernel based, density derivative estimation that reduces the computational complexity from O(MN) to linear order (O(N+M)). The constant depends on the desired arbitrary accuracy, $\epsilon$. We apply the density derivative evaluation procedure to estimate the optimal bandwidth for kernel density estimation, a process that is often intractable for large data sets. For example, for N = M = 409,600 points, the direct evaluation of the density derivative takes around 12.76 hours, while the fast evaluation requires only 65 seconds with an error of around $10^{-12}$. Algorithm details, error bounds, a procedure for choosing the parameters, and numerical experiments are presented. We demonstrate the speedup achieved in bandwidth selection using the "solve-the-equation plug-in method" [18]. We also demonstrate that the proposed procedure can be extremely useful for speeding up exploratory projection pursuit techniques.
dc.format.extent 655565 bytes
dc.format.mimetype application/pdf
dc.language.iso en_US
dc.relation.ispartofseries Department of Computer Science Technical Report
dc.relation.ispartofseries CS-TR-4774
dc.relation.ispartofseries UMIACS Technical Report
dc.relation.ispartofseries UMIACS-TR-2005-73
dc.subject kernel density estimation
dc.subject computational statistics
dc.subject fast Gauss transform
dc.subject projection pursuit
dc.subject approximation algorithms
dc.title Very fast optimal bandwidth selection for univariate kernel density estimation
dc.type Technical Report
dc.type Working Paper
dc.relation.isAvailableAt College of Computer, Mathematical & Physical Sciences
dc.relation.isAvailableAt Computer Science
dc.relation.isAvailableAt Digital Repository at the University of Maryland
dc.relation.isAvailableAt University of Maryland (College Park, Md.)
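
The O(MN) cost quoted in the abstract refers to direct evaluation of a Gaussian kernel density derivative estimate. As a rough illustration of that baseline only (not the fast $\epsilon$-exact algorithm the report proposes), the sketch below evaluates the standard estimator $\hat{p}^{(r)}(y) = \frac{(-1)^r}{\sqrt{2\pi}\,N h^{r+1}} \sum_{i=1}^{N} He_r\!\left(\frac{y - x_i}{h}\right) e^{-(y - x_i)^2 / (2h^2)}$, where $He_r$ is the probabilists' Hermite polynomial. The function names `hermite_prob` and `kde_derivative_direct`, and the parameter names, are hypothetical and chosen for this example; they do not come from the report.

```python
import math


def hermite_prob(r, u):
    """Probabilists' Hermite polynomial He_r(u), via He_{n+1} = u*He_n - n*He_{n-1}."""
    h_prev, h_curr = 1.0, u  # He_0 and He_1
    if r == 0:
        return h_prev
    for n in range(1, r):
        h_prev, h_curr = h_curr, u * h_curr - n * h_prev
    return h_curr


def kde_derivative_direct(samples, targets, h, r):
    """Direct O(M*N) estimate of the r-th derivative of a Gaussian KDE.

    samples: the N source points x_i
    targets: the M evaluation points y_j
    h:       kernel bandwidth (assumed > 0)
    r:       derivative order (0 gives the density estimate itself)
    """
    n = len(samples)
    scale = (-1) ** r / (math.sqrt(2.0 * math.pi) * n * h ** (r + 1))
    out = []
    for y in targets:          # M evaluation points
        total = 0.0
        for x in samples:      # N sample points, hence O(M*N) work overall
            u = (y - x) / h
            total += hermite_prob(r, u) * math.exp(-0.5 * u * u)
        out.append(scale * total)
    return out
```

The nested loop makes the quadratic cost explicit: every evaluation point visits every sample, which is why the direct method becomes impractical at the problem sizes mentioned in the abstract.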