Discovering Frequent Structures using Summaries

Date
2002-05-22
Author
Ghazizadeh, Shayan
Chawathe, Sudarshan
Abstract
We study the problem of finding frequent structures in
semistructured data (represented as a directed labeled graph).
Frequent structures are graphs that are isomorphic to a large number
of subgraphs in the data graph. Frequent structures form building
blocks for visual exploration and data mining of semistructured
data. We overcome the inherent computational complexity of the
problem by using a summary data structure to prune the search space
and to provide interactive feedback. We present an experimental
study of our methods operating on real datasets. The implementation
of our methods (which is freely available) is capable of operating
on datasets that are two to three orders of magnitude larger than
those described in prior work.
(Also UMIACS-TR-2002-44)
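
The idea of using a summary to prune the search can be illustrated with a minimal sketch. It is not the authors' implementation, and it makes several simplifying assumptions that are not in the report: structures are restricted to label paths rather than general graphs, nodes (not edges) carry the labels, support is counted as the number of distinct start nodes, and the "summary" is just a table of label pairs that occur on some edge. All function and variable names are hypothetical.

```python
from collections import defaultdict


def build_summary(labels, edges):
    """Map each (source label, target label) pair to the source nodes having such an edge."""
    summary = defaultdict(set)
    for u, v in edges:
        summary[(labels[u], labels[v])].add(u)
    return summary


def frequent_label_paths(labels, edges, min_support, max_edges=3):
    """Return {label path: support}; support = number of distinct start nodes.

    Assumption: a path is only extended at its tail, so every frequent path
    has a frequent prefix and level-wise pruning is safe for this support measure.
    """
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
    summary = build_summary(labels, edges)

    # Labels that can follow a given label, read off the summary.
    next_labels = defaultdict(set)
    for la, lb in summary:
        next_labels[la].add(lb)

    # Level 1: single-edge paths come directly from the summary.
    level = {}
    for (la, lb), starts in summary.items():
        if len(starts) >= min_support:
            level[(la, lb)] = {
                s: {v for v in adj[s] if labels[v] == lb} for s in starts
            }
    result = {path: len(m) for path, m in level.items()}

    for _ in range(max_edges - 1):
        nxt = {}
        for path, ends_by_start in level.items():
            # Summary pruning: only try labels that ever follow the last label.
            for lc in next_labels[path[-1]]:
                extended = {}
                for s, ends in ends_by_start.items():
                    new_ends = {w for e in ends for w in adj[e] if labels[w] == lc}
                    if new_ends:
                        extended[s] = new_ends
                if len(extended) >= min_support:
                    nxt[path + (lc,)] = extended
        result.update({path: len(m) for path, m in nxt.items()})
        level = nxt
    return result


# Toy data graph: two movies, each with an actor who has a name; one award.
labels = {1: "movie", 2: "movie", 3: "actor", 4: "actor",
          5: "name", 6: "name", 7: "award"}
edges = [(1, 3), (2, 4), (3, 5), (4, 6), (1, 7)]
paths = frequent_label_paths(labels, edges, min_support=2)
```

On this toy graph, the pair (movie, award) occurs at only one start node, so it is dropped at the first level and never extended. Handling general subgraph patterns, as the abstract describes, would require isomorphism checks that this sketch deliberately leaves out.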