Many of the permutations realized by interconnection networks in parallel processing systems and digital arithmetic circuits fall in the class of bit-permute-complement (BPC) permutations. This paper presents a methodology for routing this class of permutations in VLSI under various I/O, area, and time trade-offs. The resulting VLSI designs can route a BPC permutation of size N using a chip with N/Q I/O pins, O(N^2/Q^2) area, and O(wQ) time, where w is the word length of the permuted elements and 1 ≤ Q ≤ N/w.
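To make the class concrete, the following minimal Python sketch shows what a BPC permutation does to element indices: the destination address is obtained by permuting the bits of the source address and complementing a fixed subset of them. The function names (`bpc_permute`, `bpc_table`), the `bit_perm` encoding, and the convention that complementing is applied after the bit permutation are illustrative assumptions, not notation from the paper, and the sketch models only the index mapping, not the VLSI routing scheme.

```python
def bpc_permute(index, n_bits, bit_perm, complement_mask):
    """Return the destination index of `index` under a BPC permutation.

    Assumed conventions (hypothetical, for illustration only):
    - bit_perm[i] = j means source address bit i moves to bit position j.
    - complement_mask has a 1 in every destination bit position that is
      inverted after the bits have been permuted.
    """
    dest = 0
    for i in range(n_bits):
        bit = (index >> i) & 1        # extract source bit i
        dest |= bit << bit_perm[i]    # place it at its destination position
    return dest ^ complement_mask     # complement the selected bits


def bpc_table(n_bits, bit_perm, complement_mask):
    """List the destination of every index 0 .. 2**n_bits - 1."""
    return [bpc_permute(x, n_bits, bit_perm, complement_mask)
            for x in range(1 << n_bits)]


# Three familiar permutations on N = 8 elements (3 address bits):
bit_reversal = bpc_table(3, [2, 1, 0], 0b000)     # reverse the address bits
perfect_shuffle = bpc_table(3, [1, 2, 0], 0b000)  # rotate address bits left
vector_reversal = bpc_table(3, [0, 1, 2], 0b111)  # complement every bit

print(bit_reversal)     # [0, 4, 2, 6, 1, 5, 3, 7]
print(perfect_shuffle)  # [0, 2, 4, 6, 1, 3, 5, 7]
print(vector_reversal)  # [7, 6, 5, 4, 3, 2, 1, 0]
```

Bit reversal, the perfect shuffle, and vector reversal are each described here by one bit permutation and one complement mask, which illustrates why a single routing methodology for the BPC class covers many commonly used interconnection patterns.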
Conference: Proceedings Sixth International Parallel Processing Symposium