Timing Failure Analysis of Commercial CPUs Under Operating Stress Conference Paper

abstract

  • The timing margin of an operating physical device suffers from crosstalk, power supply voltage fluctuation, and temperature variation, among other factors. This problem is increasingly pronounced with deep-submicron technology. A conservative testing, binning and marketing policy alleviates the reliability concerns, but at a loss of realizable device performance. This paper presents a methodology for a more practical estimation of the timing margin through analytical and empirical analysis of noise sources. First, the sources of noise are modeled. Then physical experiments are conducted to measure time-to-failure of the target CPUs under stress. The accelerated test results are used to parameterize the models and empirically determine the device timing margin under realistic operating conditions. The results indicate that the actual safe-operating region for a set of tested microprocessors is significantly wider than that reported in manufacturers' specifications for new devices. © 2006 IEEE.

name of conference

  • 2006 21st IEEE International Symposium on Defect and Fault Tolerance in VLSI Systems

published proceedings

  • 2010 IEEE 25th International Symposium on Defect and Fault Tolerance in VLSI Systems

author list (cited authors)

  • Chang, S., & Choi, G.

citation count

  • 0

complete list of authors

  • Chang, Sanghoan; Choi, Gwan

publication date

  • January 2006