Very fast optimal bandwidth selection for univariate kernel density estimation

dc.contributor.authorRaykar, Vikas Chandrakant
dc.contributor.authorDuraiswami, Ramani
dc.date.accessioned2006-01-03T15:15:40Z
dc.date.available2006-01-03T15:15:40Z
dc.date.issued2006-01-03T15:15:40Z
dc.description.abstractMost automatic bandwidth selection procedures for kernel density estimates require estimation of quantities involving the density derivatives. Estimation of modes and inflexion points of densities also require derivative estimates. The computational complexity of evaluating the density derivative at M evaluation points given N sample points from the density is O(MN). In this paper we propose a computationally efficient $\epsilon$-exact approximation algorithm for univariate, Gaussian kernel based, density derivative estimation that reduces the computational complexity from O(MN) to linear order (O(N+M)). The constant depends on the desired arbitrary accuracy, $\epsilon$. We apply the density derivative evaluation procedure to estimate the optimal bandwidth for kernel density estimation, a process that is often intractable for large data sets. For example for N = M = 409,600 points while the direct evaluation of the density derivative takes around 12.76 hours the fast evaluation requires only 65 seconds with an error of around $10^{-12)$. Algorithm details, error bounds, procedure to choose the parameters and numerical experiments are presented. We demonstrate the speedup achieved on the bandwidth selection using the ``solve-the-equation plug-in method'' [18]. We also demonstrate that the proposed procedure can be extremely useful for speeding up exploratory projection pursuit techniques.en
dc.format.extent655565 bytes
dc.format.mimetypeapplication/pdf
dc.identifier.urihttp://hdl.handle.net/1903/3028
dc.language.isoen_USen
dc.relation.isAvailableAtCollege of Computer, Methematical & Physical Sciencesen_us
dc.relation.isAvailableAtComputer Scienceen_us
dc.relation.isAvailableAtDigital Repository at the University of Marylanden_us
dc.relation.isAvailableAtUniversity of Maryland (College Park, Md.)en_us
dc.relation.ispartofseriesDepartment of Computer Science Technical Reporten
dc.relation.ispartofseriesCS-TR-4774en
dc.relation.ispartofseriesUMIACS Technical Reporten
dc.relation.ispartofseriesUMIACS-TR-2005-73en
dc.subjectkernel density estimationen
dc.subjectcomputational statisticsen
dc.subjectfast gauss transformen
dc.subjectprojection pursuiten
dc.subjectapproximation algorithmsen
dc.titleVery fast optimal bandwidth selection for univariate kernel density estimationen
dc.typeTechnical Reporten
dc.typeWorking Paperen

Files

Original bundle

Now showing 1 - 1 of 1
Loading...
Thumbnail Image
Name:
CS-TR-4774.pdf
Size:
640.2 KB
Format:
Adobe Portable Document Format