Fast Subword Permutation Instructions Using Omega and Flip Network Stages
Abstract:
This paper proposes a new way of efficiently doing arbitrary n-bit permutations in programmable processors modeled on the theory of omega and flip networks. The new omflip instruction we introduce can perform any permutation of n subwords in log n instructions, with the subwords ranging from half-words down to single bits. Each omflip instruction can be done in a single cycle, with very efficient hardware implementation. The omflip instruction enhances a programmable processor’s capability for handling multimedia and security applications which use subword permutations extensively.
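As a rough illustration of the two building blocks named in the abstract, the sketch below models one omega stage and one flip stage in Python. It follows the common textbook definitions (an omega stage is a perfect shuffle followed by conditional swaps of adjacent pairs, and a flip stage is its mirror image) and is not the instruction encoding from the paper. The function names, the list-based data representation, and the control-bit convention (1 means swap) are all assumptions made for illustration.

```python
def perfect_shuffle(items):
    """Interleave the first half of the list with the second half."""
    half = len(items) // 2
    out = []
    for a, b in zip(items[:half], items[half:]):
        out.extend([a, b])
    return out


def inverse_shuffle(items):
    """Undo perfect_shuffle: even positions first, then odd positions."""
    return items[0::2] + items[1::2]


def exchange(items, control):
    """Swap adjacent pair i when control[i] is truthy (assumed convention)."""
    out = list(items)
    for i, swap in enumerate(control):
        if swap:
            out[2 * i], out[2 * i + 1] = out[2 * i + 1], out[2 * i]
    return out


def omega_stage(items, control):
    """One omega stage: perfect shuffle, then controlled pair swaps."""
    return exchange(perfect_shuffle(items), control)


def flip_stage(items, control):
    """One flip stage: controlled pair swaps, then inverse shuffle."""
    return inverse_shuffle(exchange(items, control))


subwords = list("abcdefgh")
control = [1, 0, 0, 1]
shuffled = omega_stage(subwords, control)
restored = flip_stage(shuffled, control)
```

With the same control bits, the flip stage undoes the omega stage, so `restored` equals the original `subwords`. This mirror-image relationship is a property of the model above, not a claim about how the paper chains its stages.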
Citations
| 1206 | Introduction to Parallel Algorithms and Architectures: Arrays – Leighton - 1992 |
| 494 | Applied Cryptography: Protocols, Algorithms and Source Code in C – Schneier - 1995 |
| 192 | MMX technology extension to the Intel architecture – Peleg, Weiser - 1996 |
| 125 | Subword parallelism with MAX-2 – Lee - 1996 |
| 77 | Accelerating Multimedia with Enhanced Microprocessors – Lee - 1995 |
| 35 | Bit permutation instructions for accelerating software cryptography – Shi, Lee |
| 12 | Subword Permutation Instructions for Two-Dimensional Multimedia Processing – Lee - 2000 |
| 7 | Fast Subword Permutation Instructions Based on Butterfly networks – Yang, Vachharajani, et al. - 2000 |
| 2 | libdes DES implementation – Young - 1997 |

