Hashing Technique: A New Index Method for High Dimensional Data

dc.contributor.authorSong, Zhexuanen_US
dc.contributor.authorRoussopoulos, Nicken_US
dc.date.accessioned2004-05-31T21:08:38Z
dc.date.available2004-05-31T21:08:38Z
dc.date.created1999-09en_US
dc.date.issued1999-09-25en_US
dc.description.abstractWhen dimension goes high, sequential scan processing becomes more efficient than most index-based query. In this paper, we propose a new index method for high-dimensional data spaces. This method is based on hashing technique. The basic idea is: First find a hashing function which puts the given d-dimensional space data into a d'-dimensional buckets where d' << d. Then, we use existing index techniques to manage those buckets. We later define some properties of a good hashing function and give four hashing functions. To demonstrate the efficiency of our idea, we experimentally compared our algorithms with sequential scan and Pyramid-Techniques. The results demonstrate that this method outperforms others for skewed data set. It always beats the sequential scan by using only half of elapsed time for range query. However if the data has uniform distribution, Pyramid-Technique is still the best method.en_US
dc.format.extent202344 bytes
dc.format.mimetypeapplication/postscript
dc.identifier.urihttp://hdl.handle.net/1903/509
dc.language.isoen_US
dc.relation.isAvailableAtDigital Repository at the University of Marylanden_US
dc.relation.isAvailableAtUniversity of Maryland (College Park, Md.)en_US
dc.relation.isAvailableAtTech Reports in Computer Science and Engineeringen_US
dc.relation.isAvailableAtComputer Science Department Technical Reportsen_US
dc.relation.ispartofseriesUM Computer Science Department; CS-TR-4059en_US
dc.titleHashing Technique: A New Index Method for High Dimensional Dataen_US
dc.typeTechnical Reporten_US

Files

Original bundle
Now showing 1 - 2 of 2
No Thumbnail Available
Name:
CS-TR-4059.ps
Size:
197.6 KB
Format:
Postscript Files
Loading...
Thumbnail Image
Name:
CS-TR-4059.pdf
Size:
195.22 KB
Format:
Adobe Portable Document Format
Description:
Auto-generated copy of CS-TR-4059.ps