Comparing Strength of Locality of Reference: Popularity, Temporal Correlations, and Some Folk Theorems for the Miss Rates and Outputs of Caches

Thumbnail Image
PhD_2005-1.pdf(1.17 MB)
No. of downloads: 406
Publication or External Link
Vanichpun, Sarut
Makowski, Armand M.
The performance of demand-driven caching is known to depend on the locality of reference exhibited by the stream of requests made to the cache. In spite of numerous efforts, no consensus has been reached on how to formalize this notion, let alone on how to compare streams of requests on the basis of their locality of reference. We take on this issue with an eye towards validating operational expectations associated with the notion of locality of reference. We focus on two ``folk theorems," that is, (i) The stronger the locality of reference, the smaller the miss rate of the cache; and (ii) Good caching is expected to produce an output stream of requests exhibiting less locality of reference than the input stream of requests. These two folk theorems are explored in the context of demand-driven caching for the two main contributors of locality of reference, namely popularity and temporal correlations. We first focus exclusively on popularity by considering the situation where there are no temporal correlations in the stream of requests, as would be the case under the Independent Reference Model (IRM). As we propose to measure strength of locality of reference in a stream of requests through the skewness of its popularity distribution, we introduce the notion of majorization as a means for capturing this degree of skewness. We show that these folk theorems hold for caches operating under a large class of replacement policies, the so-called Random On-demand Replacement Algorithms (RORA), which includes the optimal policy $A_0$ and the random policy. However, counterexamples prove that this is not always the case under the (popular) Least-Recently-Used (LRU) and CLIMB policies. In such cases, conjectures are offered (and supported by simulations) as to when the folk theorems would hold under the LRU or CLIMB caching, given that the IRM input has a Zipf-like popularity pmf. To compare the strength of temporal correlations in streams of requests, we define the notion of Temporal Correlations (TC) ordering based on the so-called supermodular ordering, a concept of positive dependence which has been successfully used for comparing dependence structures in sequences of random variables. We explore how the TC ordering captures the strength of temporal correlations in several Web request models, namely the higher-order Markov chain model (HOMM), the partial Markov chain model (PMM) and the Least-Recently-Used stack model (LRUSM). We establish the folk theorem to the effect that the stronger the strength of temporal correlations, the smaller the miss rate for the PMM under certain assumptions on the caching policy. Conjectures and simulations are offered as to when this folk theorem would hold under the HOMM and under the LRUSM. In addition, the validity of this folk theorem for general request streams under the Working Set algorithm is studied. Lastly, we investigate how the majorization and TC orderings can be translated into comparisons of three well-known locality of reference metrics, namely the working set size, the inter-reference time and the stack distance.