Institute for Systems Research Technical Reports

Permanent URI for this collection: http://hdl.handle.net/1903/4376

This archive contains a collection of reports generated by the faculty and students of the Institute for Systems Research (ISR), a permanent, interdisciplinary research unit in the A. James Clark School of Engineering at the University of Maryland. ISR-based projects are conducted through partnerships with industry and government, bringing together faculty and students from multiple academic departments and colleges across the university.

Search Results

  • Efficient Retrieval of Similar Time Sequences Under Time Warping
    (1997) Yi, B.; Jagadish, H.V.; Faloutsos, Christos; ISR
    Fast similarity searching in large time-sequence databases has attracted a lot of research interest. Existing approaches all use the Euclidean distance ($L_2$) or some variation of the $L_p$ metric. $L_p$ metrics lead to efficient indexing, thanks to feature extraction (e.g., by keeping the first few DFT coefficients) and subsequent use of fast spatial access methods for the points in feature space. In this work we examine a popular, field-tested dissimilarity function, the "time warping" distance function, which permits local accelerations and decelerations in the rate of the signals or sequences. This function is natural and suitable for several applications, like matching of voice, audio and medical signals (e.g., electrocardiograms). However, from the indexing viewpoint it presents two major challenges: (a) it does not lead to any natural "features", precluding the use of spatial access methods; (b) it is quadratic ($O(len_1 \cdot len_2)$) in the lengths of the sequences involved. Here we show how to overcome both problems: for the former, we propose using a modification of the so-called "FastMap" to map sequences into points, trading off a tiny amount of "recall" (typically zero) for large gains in speed. For the latter, we provide a fast, linear test to help us quickly discard many of the false alarms that FastMap will typically introduce. Using both ideas in cascade, our proposed method achieved up to a 7.8-fold speed-up over straightforward sequential scanning, on both real and synthetic datasets.
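
The quadratic cost and the idea of a cheap linear pre-test mentioned in the abstract can be illustrated with a small Python sketch. This is not the authors' code: the function names `dtw_distance` and `range_lower_bound`, and the choice of absolute difference as the per-element cost, are assumptions made for illustration. The lower bound shown is one simple bound that is valid under that cost; it is not necessarily the linear test proposed in the report.

```python
def dtw_distance(a, b):
    """Time-warping distance via dynamic programming.

    Assumes |x - y| as the per-element cost (an illustrative choice).
    Runs in O(len(a) * len(b)) time, the quadratic cost the abstract notes.
    """
    if not a or not b:
        raise ValueError("both sequences must be non-empty")
    inf = float("inf")
    # prev[j] holds the best cost of aligning the processed prefix of a
    # with the first j elements of b; only two rows are kept in memory.
    prev = [inf] * (len(b) + 1)
    prev[0] = 0.0
    for x in a:
        curr = [inf] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            cost = abs(x - y)
            # Extend the cheapest of: repeat y, repeat x, or advance both.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[len(b)]


def range_lower_bound(a, b):
    """Linear-time lower bound on dtw_distance(a, b).

    Every element of a is matched to at least one element of b, so any
    element of a lying outside [min(b), max(b)] costs at least its
    distance to that range.
    """
    lo, hi = min(b), max(b)
    total = 0.0
    for x in a:
        if x > hi:
            total += x - hi
        elif x < lo:
            total += lo - x
    return total


def within_threshold(a, b, eps):
    """Cascade: run the cheap bound first, the full distance only if needed."""
    if range_lower_bound(a, b) > eps:
        return False  # safely discarded without the quadratic computation
    return dtw_distance(a, b) <= eps
```

For example, `dtw_distance([1, 2, 3], [1, 2, 2, 3])` is zero because the warping path can repeat the middle element, whereas the Euclidean distance is not even defined for sequences of different lengths. In `within_threshold`, a candidate whose lower bound already exceeds `eps` is rejected in linear time, which mirrors the filter-then-verify cascade the abstract describes.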