Probabilistic analysis on mesh network fault tolerance
Academic Article

Overview

Research

Identity

Additional Document Info

View All

Overview

abstract

Mesh networks are among the most important interconnection network topologies for large multicomputer systems. Mesh networks perform poorly in tolerating faults in the view of worst-case analysis. On the other hand, such worst cases occur very rarely in realistic situations. In this paper, we study the fault tolerance of 2-D and 3-D mesh networks under a more realistic model in which each network node has an independent failure probability. We first observe that if the node failure probability is fixed, then the connectivity probability of these mesh networks can be arbitrarily small when the network size is sufficiently large. Thus, it is practically important for multicomputer system manufacture to determine the upper bound for node failure probability when the probability of network connectivity and the network size are given. We develop a novel technique to formally derive lower bounds on the connectivity probability for 2-D and 3-D mesh networks. Our study shows that these mesh networks of practical size can tolerate a large number of faulty nodes thus are reliable enough for multicomputer systems. For example, it is formally proved that as long as the node failure probability is bounded by 0.5 %, a 3-D mesh network of up to a million nodes remains connected with a probability larger than 99 %. 2006 Elsevier Inc. All rights reserved.