New Approaches to Similarity Searching in Metric Spaces

dc.contributor.advisorMount, Daviden_US
dc.contributor.authorcelik, cengizen_US
dc.contributor.departmentComputer Scienceen_US
dc.contributor.publisherDigital Repository at the University of Marylanden_US
dc.contributor.publisherUniversity of Maryland (College Park, Md.)en_US
dc.date.accessioned2006-06-14T05:46:38Z
dc.date.available2006-06-14T05:46:38Z
dc.date.issued2006-04-24en_US
dc.description.abstractThe complex and unstructured nature of many types of data, such as multimedia objects, text documents, protein sequences, requires the use of similarity search techniques for retrieval of information from databases. One popular approach for similarity searching is mapping database objects into feature vectors, which introduces an undesirable element of indirection into the process. A more direct approach is to define a distance function directly between objects. Typically such a function is taken from a metric space, which satisfies a number of properties, such as the triangle inequality. Index structures that can work for metric spaces have been shown to provide satisfactory performance, and were reported to outperform vector-based counterparts in many applications. Metric spaces also provide a more general framework, and for some domains defining a distance between objects can be accomplished more intuitively than mapping objects to feature vectors. In this thesis we will investigate new efficient methods for similarity searching in metric spaces. We will first show that current solutions to indexing in metric spaces have several drawbacks. Tree-based solutions do not provide the best tradeoffs between construction time and query performance. Tree structures are also difficult to make dynamic without further degrading their performance. There is also a family of flat structures that address some of the deficiencies of tree-based indices, but they introduce their own unique problems in terms of higher construction cost, higher space usage, and extra CPU overhead. In this thesis a new family of flat structures will be introduced, which are very flexible and simple. We will show that dynamic operations can easily be performed, and that they can be customized to work under different performance requirements. They also address many of the general drawbacks of flat structures as outlined above. A new framework, composite metrics will also be introduced, which provides a more flexible similarity searching process by allowing several metrics to be combined in one search structure. Two indexing structures will be introduced that can handle similarity queries in this setting, and it will be shown that they provide competitive query performance with respect to data structures for standard metrics.en_US
dc.format.extent713378 bytes
dc.format.mimetypeapplication/pdf
dc.identifier.urihttp://hdl.handle.net/1903/3454
dc.language.isoen_US
dc.subject.pqcontrolledComputer Scienceen_US
dc.titleNew Approaches to Similarity Searching in Metric Spacesen_US
dc.typeDissertationen_US

Files

Original bundle

Now showing 1 - 1 of 1
Loading...
Thumbnail Image
Name:
umi-umd-3278.pdf
Size:
696.66 KB
Format:
Adobe Portable Document Format