FAULT-TOLERANT DISTRIBUTED SUBCUBE MANAGEMENT SCHEME FOR HYPERCUBE MULTICOMPUTER SYSTEMS

abstract

  • This paper proposes a fault-tolerant distributed subcube management scheme for hypercube multicomputer systems. Gracefully degradable subcube management is supported by a data structure, called the distributed subcube table (DST), and a fault-tolerant broadcast protocol, called the reliably synchronized broadcast (RSB). In an $n$-dimensional hypercube, the DST is the collection of $2^n$ local subcube tables (LSTs), $\mathrm{DST} = \{\mathrm{LST}_0, \mathrm{LST}_1, \ldots, \mathrm{LST}_{2^n-1}\}$, where $\mathrm{LST}_x$ is a bit-mapped table assigned to $N_x$, a fault-free node whose address is $x$. $\mathrm{LST}_x$, $\forall x$, is $n+1$ bits long, and it records the status (free/busy) of certain subcubes adjacent to $N_x$. The RSB diagnoses and avoids faults during interprocessor communication to prevent faulty nodes from being allocated for job execution. In addition to possessing a fault-tolerant design, our scheme can also achieve performance comparable to or better than that of existing centralized schemes, as verified by extensive simulation. © 1995 IEEE
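
The table layout described in the abstract can be illustrated with a minimal sketch. The abstract does not say which subcube each of the n + 1 bits covers, so the slot numbering below is purely an assumption, and the names `LocalSubcubeTable`, `build_dst`, and `neighbors` are hypothetical, not taken from the paper. The sketch shows only the shape of the structure: one (n + 1)-bit table per fault-free node, with nodes addressed by n-bit integers.

```python
from dataclasses import dataclass


def neighbors(x: int, n: int) -> list[int]:
    """Addresses adjacent to node x in an n-dimensional hypercube.

    Two nodes are adjacent when their addresses differ in exactly one bit.
    """
    return [x ^ (1 << d) for d in range(n)]


@dataclass
class LocalSubcubeTable:
    """Hypothetical LST for one node: an (n + 1)-bit bitmap.

    Assumption: a set bit means "busy" and a clear bit means "free".
    Which subcube each slot refers to is NOT specified in the abstract.
    """

    n: int
    bits: int = 0

    def _check(self, slot: int) -> None:
        if not 0 <= slot <= self.n:
            raise IndexError(f"slot must be in 0..{self.n}")

    def mark_busy(self, slot: int) -> None:
        self._check(slot)
        self.bits |= 1 << slot

    def mark_free(self, slot: int) -> None:
        self._check(slot)
        self.bits &= ~(1 << slot)

    def is_free(self, slot: int) -> bool:
        self._check(slot)
        return (self.bits >> slot) & 1 == 0


def build_dst(n: int, faulty: frozenset[int] = frozenset()) -> dict[int, LocalSubcubeTable]:
    """Build a DST: one LST per fault-free node address in 0 .. 2**n - 1."""
    return {x: LocalSubcubeTable(n) for x in range(2 ** n) if x not in faulty}
```

For a 3-dimensional hypercube, `build_dst(3)` yields eight tables of four bits each; passing a set of faulty addresses leaves those nodes without a table, mirroring the abstract's statement that tables are assigned to fault-free nodes.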

published proceedings

  • IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS

author list (cited authors)

  • CHEN, Y. L., & LIU, J. C.

citation count

  • 2

complete list of authors

  • CHEN, YL; LIU, JC

publication date

  • January 1995