Results 1-10 of 58
A scalable and unified multiplier architecture for finite fields GF(p) and GF(2^m)
 In Cryptographic Hardware and Embedded Systems - CHES 2000, LNCS
, 2000
Cited by 61 (13 self)
We describe a scalable and unified architecture for a Montgomery multiplication module which operates in both types of finite fields, GF(p) and GF(2^m). The unified architecture requires only slightly more area than that of the multiplier architecture for the field GF(p). The multiplier is scalable, which means that a fixed-area multiplication module can handle operands of any size, and the word size can be selected based on the area and performance requirements. We utilize the concurrency in the Montgomery multiplication operation by employing a pipelining design methodology. We also describe a scalable and unified adder module to carry out concomitant operations in our implementation of the Montgomery multiplication. The upper limit on the precision of the scalable and unified Montgomery multiplier is dictated only by the available memory to store the operands and internal results, and the module is capable of performing infinite-precision Montgomery multiplication in both types of finite fields. Key Words: Prime fields, binary extension fields, multiplication, Montgomery multiplication, scalability, hardware implementation.
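The idea of one datapath serving both field types can be illustrated in software. The following is a minimal bit-serial sketch, not the architecture from the paper: the function name `dual_field_mont_mul` and the `binary_field` flag are hypothetical, and the only difference between the two modes is whether addition propagates carries (GF(p)) or is a carry-free XOR (GF(2^m)). It assumes an odd modulus below 2^n in GF(p) mode, and a degree-n irreducible polynomial encoded as an integer in GF(2^m) mode.

```python
def dual_field_mont_mul(a, b, modulus, n, binary_field):
    """Bit-serial Montgomery product.

    GF(p) mode:   returns a * b * 2^(-n) mod modulus (modulus odd, a, b < modulus < 2^n).
    GF(2^m) mode: returns a * b * x^(-n) mod modulus, where integers encode
                  polynomials over GF(2) and modulus has degree n.
    """
    s = 0
    for i in range(n):
        # Add b when bit i of a is set; XOR is addition without carries.
        if (a >> i) & 1:
            s = s ^ b if binary_field else s + b
        # Make the low bit zero by adding the modulus, so the shift is exact.
        if s & 1:
            s = s ^ modulus if binary_field else s + modulus
        s >>= 1
    # Only the integer case can end above the modulus.
    if not binary_field and s >= modulus:
        s -= modulus
    return s


# GF(p) example with p = 239, n = 8
print(dual_field_mont_mul(123, 45, 239, 8, False))
# GF(2^8) example with the polynomial x^8 + x^4 + x^3 + x + 1 (0x11B)
print(hex(dual_field_mont_mul(0x57, 0x83, 0x11B, 8, True)))
```

In both modes the result carries an extra factor of 2^(-n) or x^(-n), so operands are normally converted to the Montgomery domain first and converted back at the end.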
A Scalable Architecture for Modular Multiplication Based on Montgomery's Algorithm
 IEEE TRANSACTIONS ON COMPUTERS
, 2003
Cited by 44 (2 self)
This paper presents a scalable architecture for the computation of modular multiplication, based on the Montgomery multiplication (MM) algorithm. A word-based version of MM is presented and used to explain the main concepts in the hardware design. The proposed multiplier is able to work with any precision of the input operands, limited only by memory or control constraints. Its architecture gives enough freedom to select the word size and the degree of parallelism to be used, according to the available area and/or desired performance. Design trade-offs are analyzed in order to identify adequate hardware configurations for a given area or bandwidth requirement.
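A word-based formulation can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions rather than the paper's exact algorithm: the function name `mont_mul_words`, the list-of-words representation, and the parameter names `n` (operand bits) and `w` (word size) are all hypothetical. It assumes an odd modulus `m < 2^n` and operands `x, y < m`, scans `x` one bit at a time, and processes `y`, `m`, and the running sum one `w`-bit word at a time.

```python
def mont_mul_words(x, y, m, n, w):
    """Return x * y * 2^(-n) mod m using w-bit words for y, m, and the sum."""
    e = -(-(n + 1) // w)            # words needed to hold n + 1 bits
    mask = (1 << w) - 1
    y_words = [(y >> (w * j)) & mask for j in range(e)]
    m_words = [(m >> (w * j)) & mask for j in range(e)]
    s = [0] * e                     # running sum, least significant word first

    for i in range(n):
        xi = (x >> i) & 1
        # Decide from the lowest word alone whether m must be added
        # to make the sum even before the right shift.
        q = (s[0] + xi * y_words[0]) & 1
        carry = 0
        for j in range(e):
            t = s[j] + xi * y_words[j] + q * m_words[j] + carry
            s[j] = t & mask
            carry = t >> w
        # Shift the multi-word sum right by one bit, word by word.
        for j in range(e - 1):
            s[j] = (s[j] >> 1) | ((s[j + 1] & 1) << (w - 1))
        s[e - 1] = (s[e - 1] >> 1) | (carry << (w - 1))

    result = sum(word << (w * j) for j, word in enumerate(s))
    return result - m if result >= m else result


# 8-bit operands processed in 4-bit words
print(mont_mul_words(123, 45, 239, 8, 4))
```

Because only `e` changes with the operand length, the same inner loop handles any precision, which is the sense in which such a design is scalable; a smaller `w` trades more iterations for a narrower datapath.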
High-Radix Design of a Scalable Modular Multiplier
 in Cryptographic Hardware and Embedded Systems — CHES 2001, Ç. K. Koç and C. Paar, Eds. 2001, Lecture Notes in Computer Science
, 2001
Cited by 30 (8 self)
This paper describes an algorithm and architecture based on an extension of a scalable radix-2 architecture proposed in a previous work. The algorithm is proven to be correct and the hardware design is discussed in detail. Experimental results are shown to compare a radix-8 implementation with a radix-2 design. The scalable Montgomery multiplier can be adjusted to fit constrained areas while still being able to work on any given precision of the operands. Similar to some systolic implementations, this design avoids the high load on signals that broadcast to several components, making the delay independent of the operands' precision.
How to Maximize the Potential of FPGA Resources for Modular Exponentiation
 In CHES Workshop
, 2007
Cited by 19 (0 self)
This paper describes a modular exponentiation processing method and circuit architecture that can extract the maximum performance from FPGA resources. The proposed modular exponentiation architecture comprises three main techniques. The first improves the Montgomery multiplication algorithm in order to maximize the performance of the multiplication unit in the FPGA. The second improves and balances the circuit delay. The third ensures fast, scalable use of the effective FPGA resources. We propose a circuit architecture that can handle multiple data lengths using the same circuits. In addition, our architecture can perform fast operations using small-scale resources; in particular, it can complete a 512-bit modular exponentiation in 0.26 ms on the XC4VF1210SF363, which has the fewest logic resources in the Virtex-4 series of FPGAs. The design is also very compact, using approximately 4000 slices. Moreover, 1024-, 1536-, and 2048-bit modular exponentiations can be processed in the same circuit thanks to its scalability.
Scalable VLSI Architecture for GF(p) Montgomery Modular Inverse Computation
 ISVLSI 2002: IEEE COMPUTER SOCIETY ANNUAL SYMPOSIUM ON VLSI
, 2002
Cited by 13 (8 self)
Modular inverse computation is needed in several public key cryptographic applications. In this work, we present two VLSI hardware implementations for computing the Montgomery modular inverse. The implementations are based on the same inversion algorithm; however, one is fixed (fully parallel) and the other is scalable. The scalable design is a novel modification of the fixed hardware that makes it occupy a smaller area while operating at better or similar speed. Both hardware designs are compared based on their speed and area. The area of the scalable design is on average 42% smaller than that of the fixed one. The delay of the designs, however, depends on the actual data size and the maximum numbers the hardware can handle. As the actual data size approaches the hardware limit, the speedup of the scalable hardware over the fixed one decreases, but its delay remains practical.
Versatile Montgomery multiplier architectures
, 2002
Cited by 11 (2 self)
Several algorithms for Public Key Cryptography (PKC), such as RSA, Diffie-Hellman, and Elliptic Curve Cryptography, require modular multiplication of very large operands (sizes from 160 to 4096 bits) as their core arithmetic operation. To perform this operation reasonably fast, general purpose processors are not always the best choice. This is why specialized hardware, in the form of cryptographic co-processors, becomes more attractive. Based upon the analysis of recent publications on hardware design for modular multiplication, this M.S. thesis presents a new architecture that is scalable with respect to word size and pipelining depth. To our knowledge, this is the first time a word-based algorithm for Montgomery's method is realized using high-radix bit-parallel multipliers working with two different types of finite fields (unified architecture for GF(p) and GF(2^n)). Previous approaches have relied mostly on bit-serial multiplication in combination with massive pipelining, or radix-8 multiplication with the limitation to a single type of finite field. Our approach is centered around the notion that the optimal delay in bit-parallel multipliers grows with logarithmic complexity with respect to the operand size n, O(log_{3/2} n), while the delay of bit-serial implementations grows with linear complexity, O(n). Our design has been implemented in VHDL, simulated, and synthesized in 0.5 μm CMOS technology. The synthesized netlist has been verified in back-annotated timing simulations and analyzed in terms of performance and area consumption.
Efficient Hardware Implementation of Finite Fields with Applications to Cryptography
 Acta Appl Math (2006) 93: 75–118
, 2006
Cited by 9 (0 self)
The paper presents a survey of the most common hardware architectures for finite field arithmetic especially suitable for cryptographic applications. We discuss architectures for three types of finite fields and their special versions popularly used in cryptography: binary fields, prime fields, and extension fields. We summarize algorithms and hardware architectures for finite field multiplication, squaring, addition/subtraction, and inversion for each of these fields. Since implementations in hardware can focus either on high speed or on area-time efficiency, a careful choice of the appropriate set of architectures has to be made depending on the performance requirements and available area.
An efficient and scalable radix-4 modular multiplier design using recoding techniques
 Proc Asilomar Conf. Signals, Systems, and Computers
, 2003
Cited by 9 (0 self)
This paper presents the algorithm and architecture of a scalable radix-4 Montgomery multiplier. A straightforward implementation of a radix-4 design based on previously published techniques results in a poor solution. In this paper we present an algorithm and architecture for the scalable radix-4 multiplier that makes use of two types of digit recoding in order to generate an efficient solution. The word-by-word algorithm used in the multiplier gives the designer the freedom to select the level of parallelism according to the available area. Experimental results are shown to demonstrate that the proposed radix-4 Montgomery multiplier design has a better area/performance trade-off than previous radix-2 and radix-8 scalable designs.
Hardware Implementation of a Montgomery Modular Multiplier in a Systolic Array
Cited by 9 (2 self)
This paper describes a hardware architecture for the modular multiplication operation which is efficient for bit lengths suitable for both commonly used types of Public Key Cryptography (PKC), i.e., ECC and RSA cryptosystems. The challenge of current PKC implementations is to deal with long numbers (160-2048 bits) in order to achieve both system efficiency and security. RSA, still the most popular PKC, has at its root the modular exponentiation operation. Modular exponentiation consists of repeated modular multiplications, which are also the basic operation for ECC protocols. The solution proposed in this work uses a systolic array implementation and can be used for arbitrary precisions. We also present modular exponentiation based on Montgomery's Multiplication Method (MMM).
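How modular exponentiation reduces to repeated Montgomery multiplications can be shown with a short software sketch. This is an illustrative left-to-right square-and-multiply routine, not the systolic design from the paper; the names `mont_mul` and `mont_exp` are hypothetical, the modulus is assumed odd and greater than 1, and the constant R^2 mod m is obtained with Python's built-in `pow` as a shortcut for what hardware would precompute.

```python
def mont_mul(a, b, m, n):
    """Bit-serial Montgomery product: a * b * 2^(-n) mod m (m odd, a, b < m < 2^n)."""
    s = 0
    for i in range(n):
        s += ((a >> i) & 1) * b
        if s & 1:          # make the sum even so the halving is exact
            s += m
        s >>= 1
    return s - m if s >= m else s


def mont_exp(base, exponent, m):
    """Return base^exponent mod m using only Montgomery multiplications in the loop."""
    n = m.bit_length()
    r2 = pow(2, 2 * n, m)                 # R^2 mod m with R = 2^n (assumed precomputed)
    x_bar = mont_mul(base % m, r2, m, n)  # base in the Montgomery domain: base * R mod m
    acc = mont_mul(1, r2, m, n)           # 1 in the Montgomery domain: R mod m
    for bit in bin(exponent)[2:]:         # scan exponent bits, most significant first
        acc = mont_mul(acc, acc, m, n)    # square
        if bit == "1":
            acc = mont_mul(acc, x_bar, m, n)  # multiply
    return mont_mul(acc, 1, m, n)         # leave the Montgomery domain


print(mont_exp(5, 117, 239) == pow(5, 117, 239))
```

The conversions into and out of the Montgomery domain cost one multiplication each, so their overhead is small next to the roughly 1.5 multiplications per exponent bit spent in the loop.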
New hardware architectures for Montgomery modular multiplication algorithm
 IEEE Transactions on Computers
, 2011
Cited by 9 (1 self)
Montgomery modular multiplication is one of the fundamental operations used in cryptographic algorithms, such as RSA ...