Amble, Meghana Mukund (2010-12). Content-aware Caching and Traffic Management in Content Distribution Networks. Master's Thesis.
The rapid increase of content delivery over the Internet has lead to the proliferation of content distribution networks (CDNs). Management of CDNs requires algorithms for request routing, content placement, and eviction in such a way that user delays are small. Our objective in this work is to design feasible algorithms that solve this trio of problems. We abstract the system of front-end source nodes and back-end caches of the CDN in the likeness of the input and output nodes of a switch. In this model, queues of requests for different pieces of content build up at the source nodes, which route these requests to a cache that contains the content. For each request that is routed to a cache, a corresponding data file is transmitted back to the source across links of finite capacity. Caches are of finite size, and the content of the caches can be refreshed periodically. A requested but missing item is fetched to the cache from the media vault of the CDN. In case of a lack of adequate space at the cache, an existing, unrequested item may be evicted from the cache in order to accommodate a new item. Every such cache refresh or media vault access incurs a finite cost. Hence the refresh periodicity allowed to the system represents our system cost. In order to obtain small user delays, our algorithms must consider the lengths of the request queues that build up at the nodes. Stable policies ensure the finiteness of the request queues, while good polices also lead to short queue lengths. We first design a throughput-optimal algorithm that solves the routing-placement eviction problem using instantaneous system state information. The design yields insight into the impact of different cache refresh and eviction policies on queue length. We use this and construct throughput optimal algorithms that engender short queue lengths. We then propose a regime of algorithms which remedies the inherent problem of wastage of capacity. We also develop heuristic variants, and we study their performance. We illustrate the potential of our approach and validate all our claims and results through simulations on different CDN topologies.