Results 1 - 10 of 22
Selecting Elliptic Curves for Cryptography: An Efficiency and Security Analysis
Cited by 9 (2 self)
Abstract. We select a set of elliptic curves for cryptography and analyze our selection from a performance and security perspective. This analysis complements recent curve proposals that suggest (twisted) Edwards curves by also considering the Weierstrass model. Working with both Montgomery-friendly and pseudo-Mersenne primes allows us to consider more possibilities, which improves the overall efficiency of base field arithmetic. Our Weierstrass curves are backwards compatible with current implementations of prime order NIST curves, while providing improved efficiency and stronger security properties. We choose algorithms and explicit formulas to demonstrate that our curves support constant-time, exception-free scalar multiplications, thereby offering high practical security in cryptographic applications. Our implementation shows that variable-base scalar multiplication on the new Weierstrass curves at the 128-bit security level is about 1.4 times faster than the recent implementation record on the corresponding NIST curve. For practitioners who are willing to use a different curve model and sacrifice a few bits of security, we present a collection of twisted Edwards curves with particularly efficient arithmetic that are up to 1.43, 1.26 and 1.24 times faster than the new Weierstrass curves at the 128-, 192- and 256-bit security levels, respectively. Finally, we discuss how these curves behave in a real-world protocol by considering different scalar multiplication scenarios in the transport layer security (TLS) protocol.
Kummer strikes back: new DH speed records
 In Cryptology ePrint Archive, Report 2014/134
, 2014
Cited by 7 (2 self)
Abstract. This paper introduces high-security constant-time variable-base-point Diffie–Hellman soft ...
Montgomery Multiplication Using Vector Instructions
, 2013
Cited by 3 (0 self)
Abstract. In this paper we present a parallel approach to compute interleaved Montgomery multiplication. This approach is particularly suitable to be computed on 2-way single instruction, multiple data platforms as can be found on most modern computer architectures in the form of vector instruction set extensions. We have implemented this approach for tablet devices which run the x86 architecture (Intel Atom Z2760) using SSE2 instructions as well as devices which run on the ARM platform (Qualcomm MSM8960, NVIDIA Tegra 3 and 4) using NEON instructions. When instantiating modular exponentiation with this parallel version of Montgomery multiplication we observed a performance increase of more than a factor of 1.5 compared to the sequential implementation in OpenSSL for the classical arithmetic logic unit on the Atom platform for 2048-bit moduli. Key words: Montgomery multiplication, SIMD, software implementation, vector instructions
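For readers unfamiliar with the underlying operation, the following is a minimal sketch of plain (sequential, non-interleaved) Montgomery multiplication on Python integers. It is not the parallel SIMD algorithm of the paper; the function names and the choice R = 2^k with k equal to the bit length of the modulus are assumptions made for illustration only.

```python
def montgomery_setup(n, k):
    """Return n_prime with n * n_prime = -1 (mod 2**k); n must be odd."""
    r = 1 << k
    return (-pow(n, -1, r)) % r  # modular inverse via pow needs Python 3.8+


def mont_mul(a, b, n, k, n_prime):
    """Return a * b * R^(-1) mod n, where R = 2**k > n and 0 <= a, b < n."""
    mask = (1 << k) - 1
    t = a * b
    m = ((t & mask) * n_prime) & mask  # chosen so that t + m*n is divisible by R
    u = (t + m * n) >> k               # exact division by R, result is below 2n
    return u - n if u >= n else u      # single conditional subtraction


def mod_mul(a, b, n):
    """Compute a * b mod n by converting to and from Montgomery form."""
    k = n.bit_length()
    n_prime = montgomery_setup(n, k)
    a_m = (a << k) % n                       # a * R mod n
    b_m = (b << k) % n                       # b * R mod n
    c_m = mont_mul(a_m, b_m, n, k, n_prime)  # a * b * R mod n
    return mont_mul(c_m, 1, n, k, n_prime)   # strip the factor R
```

The point of the representation is that the reduction uses only multiplications, masks and shifts by k bits, with no division by n; the conversions in and out are amortized over a long chain of multiplications such as a modular exponentiation.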
Twisted Hessian curves
, 2015
Cited by 3 (0 self)
This paper presents new speed records for arithmetic on a large family of elliptic curves with cofactor 3: specifically, 8.77M per bit for 256-bit variable-base single-scalar multiplication when curve parameters are chosen properly. This is faster than the best results known for cofactor 1, showing for the first time that points of order 3 are useful for performance and narrowing the gap to the speeds of curves with cofactor 4.
Curve41417: Karatsuba revisited
Cited by 3 (0 self)
Abstract. This paper introduces constant-time ARM Cortex-A8 ECDH software that (1) is faster than the fastest ECDH option in the latest version of OpenSSL but (2) achieves a security level above 2^200 using a prime above 2^400. For comparison, this OpenSSL ECDH option is not constant-time and has a security level of only 2^80. The new speeds are achieved in a quite different way from typical prime-field ECC software: they rely on a synergy between Karatsuba’s method and choices of radix smaller than the CPU word size.
FourQ: four-dimensional decompositions on a Q-curve over the Mersenne prime
Cited by 2 (2 self)
Abstract. We introduce FourQ, a high-security, high-performance elliptic curve that targets the 128-bit security level. At the highest arithmetic level, cryptographic scalar multiplications on FourQ can use a four-dimensional Gallant-Lambert-Vanstone decomposition to minimize the total number of elliptic curve group operations. At the group arithmetic level, FourQ admits the use of extended twisted Edwards coordinates and can therefore exploit the fastest known elliptic curve addition formulas over large prime characteristic fields. Finally, at the finite field level, arithmetic is performed modulo the extremely fast Mersenne prime p = 2^127 − 1. We show that this powerful combination facilitates scalar multiplications that are significantly faster than all prior works. On Intel’s Haswell, Ivy Bridge and Sandy Bridge architectures, our software computes a variable-base scalar multiplication in 59,000, 71,000 cycles and 74,000 cycles, respectively; and, on the same platforms, our software computes a Diffie-Hellman shared secret in 92,000, 110,000 cycles and 116,000 cycles, respectively. These results show that, in practice, FourQ is around four to five times faster than the original NIST P-256 curve and between two and three times faster than curves that are currently under consideration as NIST alternatives, such as Curve25519.
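To illustrate why a Mersenne modulus is cheap to reduce by, here is a minimal sketch of reduction modulo p = 2^127 − 1 using only shifts, masks and additions. It is not taken from the FourQ software; the name reduce_p127 and the input bound (a product of two already reduced values, hence below 2^254) are assumptions for this example.

```python
P127 = (1 << 127) - 1  # the Mersenne prime 2^127 - 1


def reduce_p127(x):
    """Reduce 0 <= x < 2**254 modulo 2**127 - 1 without any division."""
    # Since 2^127 = 1 (mod p), the high part can be folded onto the low part.
    x = (x & P127) + (x >> 127)  # now x <= 2^128 - 2
    x = (x & P127) + (x >> 127)  # now x <= p
    return 0 if x == P127 else x  # p itself represents zero


def mul_p127(a, b):
    """Multiply two field elements given as integers in [0, p)."""
    return reduce_p127(a * b)
```

Two folds suffice under the stated bound: the first leaves at most one carry bit above position 127, and the second absorbs it.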
Faster software for fast endomorphisms
Cited by 2 (1 self)
Abstract. GLV curves (Gallant et al.) have performance advantages over standard elliptic curves, using half the number of point doublings for scalar multiplication. Despite their introduction in 2001, implementations of the GLV method have yet to permeate widespread software libraries. Furthermore, side-channel vulnerabilities, specifically cache-timing attacks, remain unpatched in the OpenSSL code base since the first attack in 2009 (Brumley and Hakala) even still after the most recent attack in 2014 (Benger et al.). This work reports on the integration of the GLV method in OpenSSL for curves from 160 to 256 bits, as well as deploying and evaluating two side-channel defenses. Performance gains are up to 51%, and with these improvements GLV curves are now the fastest elliptic curves in OpenSSL for these bit sizes.
Point compression for the trace zero subgroup over a small degree extension field
, 2014
Cited by 2 (2 self)
VLSI implementation of double-base scalar multiplication on a twisted Edwards curve with an efficiently computable endomorphism. Cryptology ePrint Archive: Report 2015/421
, 2015
Cited by 1 (0 self)
Abstract. The verification of an ECDSA signature requires a double-base scalar multiplication, an operation of the form k·G + l·Q where G is a generator of a large elliptic curve group of prime order n, Q is an arbitrary element of said group, and k, l are two integers in the range of [1, n − 1]. We introduce in this paper an area-optimized VLSI design of a Prime-Field Arithmetic Unit (PFAU) that can serve as a loosely-coupled or tightly-coupled hardware accelerator in a system-on-chip to speed up the execution of double-base scalar multiplication. Our design is optimized for twisted Edwards curves with an efficiently computable endomorphism that allows one to reduce the number of point doublings by some 50% compared to a conventional implementation. An example for such a special curve is −x^2 + y^2 = 1 + x^2y^2 over the 207-bit prime field Fp with p = 2^207 − 5131. The PFAU prototype we describe in this paper features a (16 × 16)-bit multiplier and has an overall silicon area of 5821 gates when synthesized with a 0.13µ standard-cell library. It can be clocked with a frequency of up to 50 MHz and is capable of performing a constant-time multiplication in the mentioned 207-bit prime field in only 198 clock cycles. A complete double-base scalar multiplication has an execution time of some 365k cycles and requires the precomputation of 15 points. Our design supports many trade-offs between performance and RAM requirements, which is a highly desirable property for future Internet-of-Things (IoT) applications.
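The operation k·G + l·Q can be computed with a single shared chain of doublings, often called Shamir's trick. The sketch below shows only that basic idea for a generic group supplied as an addition function; it is not the design described in the abstract (which additionally uses an endomorphism and 15 precomputed points), it is not constant-time, and the toy group of integers modulo 1009 under addition is a stand-in assumed purely so the example runs without curve arithmetic.

```python
def double_base_mul(k, l, G, Q, add, identity):
    """Compute k*G + l*Q with one doubling per scalar bit (Shamir's trick)."""
    GQ = add(G, Q)  # the single precomputed point G + Q
    acc = identity
    nbits = max(k.bit_length(), l.bit_length())
    for i in range(nbits - 1, -1, -1):
        acc = add(acc, acc)  # doubling shared by both scalars
        kb = (k >> i) & 1
        lb = (l >> i) & 1
        if kb and lb:
            acc = add(acc, GQ)
        elif kb:
            acc = add(acc, G)
        elif lb:
            acc = add(acc, Q)
    return acc


# Hypothetical toy group: integers modulo TOY_N under addition.
TOY_N = 1009


def toy_add(a, b):
    return (a + b) % TOY_N


result = double_base_mul(123, 456, 5, 7, toy_add, 0)
```

Compared with computing k·G and l·Q separately and adding the results, the interleaved loop performs roughly half as many doublings, which is the saving that double-base methods build on.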