Skip to content
University of Maryland LibrariesDigital Repository at the University of Maryland
    • Login
    View Item 
    •   DRUM
    • College of Computer, Mathematical & Natural Sciences
    • Computer Science
    • Technical Reports of the Computer Science Department
    • View Item
    •   DRUM
    • College of Computer, Mathematical & Natural Sciences
    • Computer Science
    • Technical Reports of the Computer Science Department
    • View Item
    JavaScript is disabled for your browser. Some features of this site may not work without it.

    On the Efficiency of Nearest Neighbor Searching with Data Clustered in Lower Dimensions

    Thumbnail
    View/Open
    CS-TR-4209.ps (2.195Mb)
    No. of downloads: 271

    Auto-generated copy of CS-TR-4209.ps (311.7Kb)
    No. of downloads: 664

    Date
    2001-05-10
    Author
    Maneewongvatana, Songrit
    Mount, David M.
    Metadata
    Show full item record
    Abstract
    In nearest neighbor searching we are given a set of n data points in real d-dimensional space, R^d, and the problem is to preprocess these points into a data structure, so that given a query point, the nearest data point to the query point can be reported efficiently. Because data sets can be quite large, we are interested in data structures that use optimal O(dn) storage. Given the limitation of linear storage, the best known data structures suffer from expected-case query times that grow exponentially in d. However, it is widely regarded in practice that data sets in high dimensional spaces tend to consist of clusters residing in much lower dimensional subspaces. This raises the question of whether data structures for nearest neighbor searching adapt to the presence of lower dimensional clustering, and further how performance varies when the clusters are aligned with the coordinate axes. We analyze the popular kd-tree data structure in the form of two variants based on a modification of the splitting method, which produces cells satisfy the basic packing properties needed for efficiency without producing empty cells. We show that when data points are uniformly distributed on a k-dimensional hyperplane for k <= d, then expected number of leaves visited in such a kd-tree grows exponentially in k, but not in d. We show that the growth rate is even smaller still if the hyperplane is aligned with the coordinate axes. We present empirical studies to support our theoretical results.
    URI
    http://hdl.handle.net/1903/520
    Collections
    • Technical Reports of the Computer Science Department

    DRUM is brought to you by the University of Maryland Libraries
    University of Maryland, College Park, MD 20742-7011 (301)314-1328.
    Please send us your comments.
    Web Accessibility
     

     

    Browse

    All of DRUMCommunities & CollectionsBy Issue DateAuthorsTitlesSubjectsThis CollectionBy Issue DateAuthorsTitlesSubjects

    My Account

    LoginRegister
    Pages
    About DRUMAbout Download Statistics

    DRUM is brought to you by the University of Maryland Libraries
    University of Maryland, College Park, MD 20742-7011 (301)314-1328.
    Please send us your comments.
    Web Accessibility