Latency Analysis for Distributed Coded Storage Systems
Additional Document Info
Modern communication and computation systems often consist of large networks of unreliable nodes. Still, it is well known that such systems can provide aggregate reliability via redundancy. While duplication may increase the load on a system, it can lead to significant performance improvement when combined with the judicious management of extra system resources. Prime examples of this abstract paradigm include multi-path routing across communication networks, content access from multiple caches in delivery networks, and master/slave computations on compute clusters. Several recent contributions in the area establish bounds on the performance of redundant systems, characterizing the latency-redundancy tradeoff under specific load profiles. Following a similar line of research, this paper introduces new analytical bounds and approximation techniques for the latency-redundancy tradeoff for a range of system loads and a class of symmetric redundancy schemes, under the assumption of Poisson arrivals, exponential service-rates, and fork-join scheduling policy. The proposed approach can be employed to efficiently approximate the latency distribution of a queueing system at equilibrium. Various metrics can subsequently be derived for this system, including the mean and variance of the sojourn time, and the tail decay rate of the stationary distribution. This paper also establishes the stability region in terms of arrival rates for redundant systems with certain symmetries. Finally, it offers selection guidelines for design parameters to provide latency guarantees based on the proposed approximations. Findings are substantiated by numerical results.