Web Archiving: Organizing Web Objects into Web Containers to Optimize Access
Date
2007-10-09
Authors
Song, Sangchul
JaJa, Joseph
Abstract
The web is becoming the preferred medium for communicating and storing
information pertaining to almost any human activity. However, it is an
ephemeral medium whose contents are constantly changing, resulting in
a permanent loss of part of our cultural and scientific heritage on a
regular basis. Archiving important web content is a very challenging
technical problem because of the web's tremendous scale, complex
structure, extremely dynamic nature, and rich, heterogeneous, and deep
content. In this paper, we consider the problem of archiving a linked
set of web objects into web containers in such a way as to minimize
the number of containers accessed during a typical browsing session.
We develop a method that makes use of the notion of PageRank and
optimized graph partitioning to enable faster browsing of archived web
contents. We include simulation results that illustrate the
performance of our scheme and compare it to the common scheme
currently used to organize web objects into web containers.
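
The idea in the abstract can be illustrated with a small sketch. It is not the authors' algorithm: the abstract does not give the details, so a simple greedy merge stands in here for optimized graph partitioning, and the baseline is assumed to be packing objects in crawl order. The function names, the toy link graph, the edge-weight rule (PageRank of the source divided by its out-degree), and the container capacity are all hypothetical choices made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict {page: [pages it links to]}."""
    pages = sorted(set(links) | {q for outs in links.values() for q in outs})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            outs = links.get(p, [])
            # A page with no outlinks spreads its rank over all pages.
            targets = outs if outs else pages
            share = damping * rank[p] / len(targets)
            for q in targets:
                new[q] += share
        rank = new
    return rank


def edge_weights(links, rank):
    """Assumed proxy for how often a link is followed: rank(source) / outdegree."""
    return {(p, q): rank[p] / len(outs) for p, outs in links.items() for q in outs}


def greedy_containers(links, rank, capacity):
    """Greedy stand-in for graph partitioning: join the endpoints of the
    heaviest links first, as long as the merged container fits."""
    container_of = {p: p for p in rank}  # each page starts in its own container
    members = {p: [p] for p in rank}
    weights = edge_weights(links, rank)
    for (p, q), _ in sorted(weights.items(), key=lambda kv: -kv[1]):
        a, b = container_of[p], container_of[q]
        if a != b and len(members[a]) + len(members[b]) <= capacity:
            for x in members[b]:
                container_of[x] = a
            members[a].extend(members.pop(b))
    return container_of


def sequential_containers(order, capacity):
    """Assumed baseline: fill containers in crawl order."""
    return {p: i // capacity for i, p in enumerate(order)}


def cross_container_weight(links, rank, container_of):
    """Total weight of links whose endpoints sit in different containers."""
    weights = edge_weights(links, rank)
    return sum(w for (p, q), w in weights.items()
               if container_of[p] != container_of[q])


# Toy site: two tightly linked groups of pages joined by two links.
links = {
    "a1": ["a2", "a3"], "a2": ["a1", "a3"], "a3": ["a1", "a2", "b1"],
    "b1": ["b2", "b3"], "b2": ["b1", "b3"], "b3": ["b1", "b2", "a1"],
}
rank = pagerank(links)
greedy = greedy_containers(links, rank, capacity=3)
sequential = sequential_containers(["a1", "b1", "a2", "b2", "a3", "b3"], capacity=3)
cross_greedy = cross_container_weight(links, rank, greedy)
cross_sequential = cross_container_weight(links, rank, sequential)
print("cross-container weight, greedy:    ", round(cross_greedy, 4))
print("cross-container weight, crawl order:", round(cross_sequential, 4))
```

A lower cross-container weight means that a browsing step is less likely to require opening another container. On this toy graph the greedy grouping keeps each tightly linked group together, whereas the interleaved crawl order splits both groups across containers.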