FastMap: A Fast Algorithm for Indexing, Data-Mining and
Visualization of Traditional and Multimedia Datasets

Faloutsos, Christos; Lin, King-Ip (David)

FastMap: A Fast Algorithm for Indexing, Data-Mining and Visualization of Traditional and Multimedia Datasets

dc.contributor.author	Faloutsos, Christos	en_US
dc.contributor.author	Lin, King-Ip (David)	en_US
dc.date.accessioned	2004-05-31T22:29:10Z
dc.date.available	2004-05-31T22:29:10Z
dc.date.created	1995-01	en_US
dc.date.issued	1998-10-15	en_US
dc.description.abstract	A very promising idea for fast searching in traditional and multimedia databases is to map objects into points in k-d space, using k feature-extraction functions, provided by a domain expert rJag91]. Thus. we can subsequently use highly fine-tuned spatia l access methods (SAMs), to answer several types of queries, including the 'Query By Example' type (which translates to a range query); the 'all pairs' query (which translates to a spatial join [BKSS94]); the nearest-neighbor or best-match query, etc. However, designing feature extraction functions can be hard. It is relatively easier for a domain expert to assess the similarity/distance of two objects. Given only the distance information though, it is not obvious how to map objects into points. This is exactly the topic of this paper. We describe a fast algorithm to map objects into points in some k-dimensional space (k is user-defined), such that the dissimilarities are preserved. There are two benefits from this mapping: (a) efficient retriev al, in conjunction with a SAM, as discussed before and (b) visualization and data-mining: the objects can now be plotted as points in 2-d or Sd space, revealing potential clusters, correlations among attributes and other regularities that data-mining is l ooking for. We introduce an older method from pattern recognition, namely, Multi-Dimcnsional Scaling (MDS) [Tor52]; although unsuitable for indexing, we use it as yardstick for our method. Then, we propose a much faster algorithm to solve the problem in hand, while in addition it allows for indexing. Experiments on real and synthetic data indeed show that the proposed algorithm is significantly faster than MDS, (being linear, as opposed to quadratic, on the database size N), while it manages to preserve distances an d the overall structure of the data-set. (Also cross-referenced as UMIACS-TR-94-132)	en_US
dc.format.extent	584052 bytes
dc.format.mimetype	application/postscript
dc.identifier.uri	http://hdl.handle.net/1903/680
dc.language.iso	en_US
dc.relation.isAvailableAt	Digital Repository at the University of Maryland	en_US
dc.relation.isAvailableAt	University of Maryland (College Park, Md.)	en_US
dc.relation.isAvailableAt	Tech Reports in Computer Science and Engineering	en_US
dc.relation.isAvailableAt	UMIACS Technical Reports	en_US
dc.relation.ispartofseries	UM Computer Science Department; CS-TR-3383	en_US
dc.relation.ispartofseries	UMIACS; UMIACS-TR-94-132	en_US
dc.title	FastMap: A Fast Algorithm for Indexing, Data-Mining and Visualization of Traditional and Multimedia Datasets	en_US
dc.type	Technical Report	en_US

Files

Original bundle

Now showing 1 - 2 of 2

Name:: CS-TR-3383.ps
Size:: 570.36 KB
Format:: Postscript Files

Download

Name:: CS-TR-3383.pdf
Size:: 319.36 KB
Format:: Adobe Portable Document Format
Description:: Auto-generated copy of CS-TR-3383.ps

Download

Collections

Technical Reports from UMIACS
Technical Reports of the Computer Science Department