Synchronized Progress in Interconnection Networks (SPIN) : A New Theory for Deadlock Freedom

abstract

2018 IEEE. One of the most fundamental design challenges in any interconnection network is that of routing deadlocks. A deadlock is a cyclic dependence between buffers that renders forward progress impossible. Deadlocks are a necessary evil and almost every on-chip/HPC network today avoids it either via routing restrictions across physical channels (Daily's Theory) or with at least one escape virtual channel (Duato's Theory). This ensures that a cyclic dependence between buffers is never created in the first place. Moreover, each solution is tied to a specific topology, requiring an updated policy if the topology were to change. Alternately, solutions have also been proposed to reserve certain resources (buffers) and allocate them only upon detection of a deadlock, thereby breaking the dependence chain and recovering from the deadlock. Unfortunately, all these approaches fundamentally lead to a loss in available bandwidth due to routing restrictions or buffer resource usage restrictions. In this work, we challenge the theoretical notion of viewing deadlocks as a lack of routing resource (buffers) problem that every solution to date is based on. We argue that a deadlock can in fact be considered as a lack of coordination between distributed entities. We prove that orchestrating a forward movement of every flit in the deadlocked ring at exactly the same time, which we call a spin, can guarantee forward progress and eventually lead to deadlock resolution with a bounded number of spins. We name this novel theory as SPIN (Synchronized Progress in Interconnection Networks). SPIN eliminates the need for virtual channels to achieve deadlock freedom thereby enabling fully adaptive routing with only one buffer per message class. We illustrate this capability by designing FAvORS, a novel truly one VC fully-adaptive routing algorithm. We also present a low-cost distributed implementation of SPIN and compare it against state-of-the-art deadlock avoidance/recovery schemes. SPIN provides up to 80% higher throughput, 52% lower area and 50% lower power for an on-chip 64-core mesh, and up to 83% higher throughput, 53% lower area and 55% lower power for an off-chip 1024-node dragon-fly.

name of conference

2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA)

authors

Gratz, Paul

published proceedings

2018 ACM/IEEE 45TH ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA)

author list (cited authors)

Ramrakhyani, A., Gratz, P. V., & Krishna, T.

citation count

23

complete list of authors

Ramrakhyani, Aniruddh||Gratz, Paul V||Krishna, Tushar

publication date

June 2018

publisher

Institute of Electrical and Electronics Engineers (IEEE) Publisher

published in

Proceedings / Annual International Symposium on Computer Architecture. International Symposium on Computer Architecture Journal

Synchronized Progress in Interconnection Networks (SPIN) : A New Theory for Deadlock Freedom Conference Paper

Overview

abstract

name of conference

authors

published proceedings

author list (cited authors)

citation count

complete list of authors

publication date

publisher

published in

Research

keywords

Identity

Digital Object Identifier (DOI)

International Standard Book Number (ISBN) 13

Additional Document Info

start page

end page

Other

URL