Tuesday, October 2, 2012

Design considerations for Distributed Caching on the Internet - white paper


Abstract
In this paper, we describe the design and implementation of
an integrated architecture for cache systems that scale to hundreds or thousands of caches with thousands to millions of
users. Rather than simply try to maximize hit rates, we take
an end-to-end approach to improving response time by also
considering hit times and miss times. We begin by studying
several Internet caches and workloads, and we derive three
core design principles for large scale distributed caches: (1)
minimize the number of hops to locate and access data on
both hits and misses, (2) share data among many users and
scale to many caches, and (3) cache data close to clients.
Our strategies for addressing these issues are built around a
scalable, high-performance data-location service that tracks
where objects are replicated. We describe how to construct
such a service and how to use this service to provide direct
access to remote data and push-based data replication. We
evaluate our system through trace-driven simulation and find
that these strategies together provide response time speedups
of 1.27 to 2.43 compared to a traditional three-level cache
hierarchy for a range of trace workloads and simulated environments.
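To make the data-location service concrete, the sketch below shows one way a per-node hint directory might map object URLs to nearby replicas so that a miss can go directly to the closest known copy instead of climbing a hierarchy. The class name, distance table, and example caches are invented for illustration; this is a minimal sketch of the idea, not the paper's implementation.

```python
# Minimal sketch of a hint-based data-location lookup (illustrative only).
# A cache keeps a local table of hints: object URL -> set of caches believed
# to hold a replica. On a miss, it contacts the closest hinted replica
# directly rather than walking up a cache hierarchy.

from typing import Dict, Optional, Set

class HintDirectory:
    def __init__(self, distance: Dict[str, float]):
        # distance[cache_id] = latency estimate from this node (assumed known)
        self.distance = distance
        self.hints: Dict[str, Set[str]] = {}  # url -> caches believed to hold it

    def record_replica(self, url: str, cache_id: str) -> None:
        self.hints.setdefault(url, set()).add(cache_id)

    def closest_replica(self, url: str) -> Optional[str]:
        candidates = self.hints.get(url, set())
        if not candidates:
            return None  # no hint: fall back to the origin server
        return min(candidates, key=lambda c: self.distance.get(c, float("inf")))

# Example: the hint lets the local cache skip intermediate hops on a miss.
hints = HintDirectory(distance={"cache-A": 5.0, "cache-B": 40.0})
hints.record_replica("http://example.com/logo.gif", "cache-B")
hints.record_replica("http://example.com/logo.gif", "cache-A")
print(hints.closest_replica("http://example.com/logo.gif"))  # -> "cache-A"
```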


Related Work
Hierarchical caching has been examined in the context of file systems [5, 37, 20]. Muntz and Honeyman [32] concluded that the additional hops in such a system often more than offset improvements in hit rate, and they characterized the extra level of cache as a “delay server.” We reach similar conclusions in the context of Internet caching, leading to our design principle of minimizing the number of hops on a hit or miss.
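The “delay server” effect can be illustrated with a simple expected-response-time calculation: every extra level adds lookup latency to all requests that pass through it, so its additional hits must outweigh that cost. The latencies and hit rates below are made up for illustration and are not measurements from the paper.

```python
# Back-of-the-envelope model of expected response time in a cache hierarchy.
# Latencies and hit rates are invented for illustration only.

def expected_response_time(levels):
    """levels: list of (cumulative_lookup_latency_ms, hit_rate_at_this_level).
    A request pays the lookup latency of every level it visits; whatever
    misses everywhere falls through to the origin server."""
    remaining = 1.0
    total = 0.0
    for latency, hit_rate in levels:
        total += remaining * hit_rate * latency
        remaining *= (1.0 - hit_rate)
    origin_latency = levels[-1][0] + 200.0  # misses pay all hops plus the origin
    total += remaining * origin_latency
    return total

two_level = [(5.0, 0.35), (60.0, 0.20)]                   # proxy, regional cache
three_level = [(5.0, 0.35), (60.0, 0.20), (120.0, 0.05)]  # plus a national cache
print(expected_response_time(two_level))    # the extra level adds hops to every miss...
print(expected_response_time(three_level))  # ...and can raise the average response time
```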
Several researchers have proposed improving the scalability of a data hierarchy by splitting responsibilities according
to a hash function [12, 44]. This approach may work well
for distributing load across a set of caches that are near one
another and near their clients, but in larger systems where
clients are closer to some caches than others, the hash function will prevent the system from exploiting locality.
Several studies have examined push caching and prefetching in the context of web workloads [22, 23, 34]. These systems all used more elaborate history information to predict
future references than the algorithm we examine. Because
large, shared caches do a good job at satisfying references
to popular objects, we explore prefetching strategies that will
work well for the remaining large number of objects about
whose access patterns little is known. Kroeger et al. [27] examined the limits of performance for caching and prefetching. They found that the rate of change of data and the rate of accesses to new data and new servers limit achievable performance.
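As a rough illustration of push-based replication for objects with little per-object history, the sketch below pushes a freshly fetched object to peer caches so that later requests in other regions hit locally. The names and the push-on-first-fetch policy are assumptions for this example, not the paper's exact algorithm.

```python
# Illustrative push-caching sketch: when an object is fetched in one region,
# a replica is proactively pushed to caches serving other regions, without
# relying on per-object access history. Policy and names are invented here.

class PushCache:
    def __init__(self, region, peers):
        self.region = region
        self.peers = peers   # caches in other regions
        self.store = {}      # url -> object bytes

    def fetch(self, url, origin):
        if url not in self.store:
            self.store[url] = origin[url]                # miss: go to the origin
            for peer in self.peers:
                peer.receive_push(url, self.store[url])  # push replica to peers
        return self.store[url]

    def receive_push(self, url, data):
        self.store.setdefault(url, data)  # future local requests now hit

origin = {"http://example.com/a": b"payload"}
east, west = PushCache("east", []), PushCache("west", [])
east.peers, west.peers = [west], [east]
east.fetch("http://example.com/a", origin)    # miss at east, pushed to west
assert "http://example.com/a" in west.store   # west now serves it locally
```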
Conclusions
Although caching is increasingly used in the Internet to reduce network traffic and the load on web servers, it has been
less successful in reducing response time observed by clients.
We examine several environments and workloads and conclude that this may be because traditional hierarchical caches
violate several basic design principles for distributed caching
on the Internet. To address these problems, we have constructed
a hint hierarchy that supports direct access and push. Overall,
our techniques provide speedups of 1.27 to 2.43 compared to
a traditional cache hierarchy.



