The Most Efficient Known Addition Chains for Field Element & Scalar Inversion for the Most Popular & Most Unpopular Elliptic Curves

May 31st, 2017

June 13th, 2017: I added the addition chain for secp256k1 field element inversion and corrected the authorship of the secp256k1 scalar inversion addition chain, based on feedback by Peter Dettman.

September 3rd, 2017: I added the addition chain for Curve25519 scalar inversion.

Introduction
Curve25519 Field Element Inversion
P-256 (secp256r1) Field Element Inversion
P-384 (secp384r1) Field Element Inversion
secp256k1 (Bitcoin's Curve) Field Element Inversion
secp256k1 (Bitcoin's Curve) Scalar Inversion
P-256 (secp256r1) Scalar Inversion
P-384 (secp384r1) Scalar Inversion
Curve25519 Scalar Inversion
Verifying the Addition Chains

Introduction

Given 0 < x < p where p is prime, we can calculate the modular multiplicative inverse of x, denoted x⁻¹ (mod p), by calculating x^p − 2 (mod p). As Donald Knuth explained in Section 4.6.3 of The Art of Computer Programming, Volume 2: Seminumerical Algorithms (3rd Edition), given an addition chain for an exponent e, we can calculate x^e (mod p) by using squaring a² (mod p) as the doubling operation and multiplication ab (mod p) as the addition operation. The correctness & efficiency of such a computation depends (only) on the correctness & efficiency of the implementations of modular squaring & modular multiplication and the correctness & efficiency of the addition chain used for the exponentiation.

Assuming we've already verified & optimized the modular multiplication & squaring implementations, we'll focus here on finding the most efficient addition chains useful for inversion in commonly-used elliptic curves. The curves we are interested in here are Curve25519, P-256 (secp256r1), P-384 (secp384r1), and secp256k1 (the curve used in Bitcoin). The notable characteristic of this use of addition chains is that the exponent is a well-known constant, so we can construct the addition chain offline using as much effort as we are willing to expend. In contrast, most other practical uses of addition chains require that the addition chain be constructed online for a previously-unknown value and so the cost of constructing the chain has to be added to the cost of evaluating the chain; that complication makes that problem quite different from what's discussed below.

We have no proofs that the addition chains presented here are optimal. In fact it is almost certainly the case that at least a couple of the ones presented here are not optimal. Your help is requested in finding better ones. Please email me at brian@briansmith.org with (links to) better addition chains and/or links to tools and algorithms for finding better addition chains for these very large values. I will publish updates here as better chains are found. Similarly, any help in finding a better accounting of the history of the search for optimal addition chains for these curves would be greatly appreciated. Let me know if there are other curves that are ubiquitous that should have inversion addition chains documented here.

All of the addition chains documented here are written in the notation developed in Addition Chains as Polymorphic Higher-order Functions. In particular, this notation is executable Haskell code that forms the module ECCInversionAdditionChains in the file ECCInversionAdditionChains.lhs. With this, alongside the module AdditionChainComputation in AdditionChainComputation.lhs, we can automatically verify the correctness of these addition chains and measure the efficiency of each. This code only uses library functions from the base package and it does not enable any language extensions. It was tested in GHCi 8.0.1.

We use some naming conventions to make the code easier to read. A variable named with a leading underscore has the value represented by interpreting the part of the name after the underscore as a binary number. For example, _101 represents the number 5 since 101₂ = 5. We will deal with numbers that consist of a long series of binary 1 digits, and we'll name these xn where n is the number of ones. For example, x3 represents 111₂ which is 7 in decimal notation; as another example, x16 represents 1111111111111111₂ which is 65535 (FFFF₁₆). With that in mind, let's consider an example step

_110111 = (nth double 4 `andThen` add _111) _11.

This says that 55 (_110111) is constructed from 3 (_11) by doubling it four times (nth double 4) and then (andThen) adding 7 (_111); that is, 55 = 3 ⋅ 2 ⋅ 2 ⋅ 2 ⋅ 2 + 7.

module ECCInversionAdditionChains(
  curve25519FieldInverseExponent, curve25519FieldInverseExponentValue,

  p256FieldInverseSquaredExponent, p256FieldInverseSquaredExponentValue,
  p384FieldInverseSquaredExponent, p384FieldInverseSquaredExponentValue,
  inverseCubedFromInverseSquared, inverseFromInverseSquared,

  secp256k1ScalarInverseExponent, secp256k1ScalarInverseExponentValue,
  p256ScalarInverseExponent, p256ScalarInverseExponentValue,
  p384ScalarInverseExponent, p384ScalarInverseExponentValue,
  curve25519ScalarInverseExponent, curve25519ScalarInverseExponentValue,

  invalid,
) where
import AdditionChainComputation
import Numeric.Natural

Curve25519 Field Element Inversion

Scalar multiplication on the Curve25519 curve require inverting a Z coordinate when converting a non-affine point representation to an affine point representation. This involves calculating z⁻¹ = z^{q − 2} (mod q), where q is Curve25519's field characteristic, 2²⁵⁵ − 19. Daniel. J. Bernstein described an addition chain for that in Curve25519: new Diffie-Hellman speed records as “a straightforward sequence of 254 squarings and 11 multiplications.” The addition chain shown here is the one described, reverse-engineered from the function curve25519_athlon_recip in the code bundle curve25519-20050915.tar.gz he published alongside the paper. Both the code and the paper are available from the Curve25519 page on his website.

Hexadecimal representation of q − 2:: 7fffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffeb
Binary representation of q − 2:: 111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111101011

-- https://cr.yp.to/ecdh/curve25519-20060209.pdf
curve25519FieldInverseExponentValue :: Natural
curve25519FieldInverseExponentValue = q - 2 where q = 2^255 - 19

curve25519FieldInverseExponent double add one =
  let
    _1    = one
    _10   = (nth double   1                    ) _1
    _1001 = (nth double   2 `andThen` add _1   ) _10
    _1011 = (                         add _10  ) _1001
    x5    = (nth double   1 `andThen` add _1001) _1011
    x10   = (nth double   5 `andThen` add x5   ) x5
    x20   = (nth double  10 `andThen` add x10  ) x10
    x40   = (nth double  20 `andThen` add x20  ) x20
    x50   = (nth double  10 `andThen` add x10  ) x40
    x100  = (nth double  50 `andThen` add x50  ) x50
  in        (nth double 100 `andThen` add x100   `andThen`
             nth double  50 `andThen` add x50    `andThen`
             nth double   5 `andThen` add _1011) x100
---------------------------------------------------------
-- Total length: 265 =  254 doubles +  11 adds

P-256 (secp256r1) Field Element Inversion

P-256 implementations usually use Jacobian coordinates and then convert the Jacobian coordinates to affine coordinates at the end of the computation. This conversion requires calculating z⁻² (mod q) for the X coordinate. When the affine value of the Y coordinate is needed (which is usually, but not always, the case) then z⁻³ (mod q) is also needed.

Shay Gueron and Vlad Krasnov published an efficient addition chain in the OpenSSL patch that accompanied the paper Fast Prime Field Elliptic Curve Cryptography with 256 Bit Primes that is quite similar in overall structure to the one for Curve25519 above. They noted that when the Y coordinate isn't needed, we can compute z⁻² = z^q − 3 (mod q) in one fewer multiplication than is required in their addition chain for z⁻¹ (mod q). The last step of the addition chain for z⁻¹ (mod q) is a multiplication by z, which means that before that multiplication, we must have had z⁻² (mod q) since z⁻² ⋅ z = z⁻¹ (mod q).

inverseFromInverseSquared add one inverseSquared =
  add one inverseSquared

They didn't actually implement this optimization in their OpenSSL patch.

Actually the optimization of directly computing z⁻² (mod q) using the addition chain for q − 3 is useful even when z⁻³ (mod q) is also needed, as we can easily compute z⁻⁴ = (z⁻²)² (mod q) and then compute z⁻³ = z⁻⁴ ⋅ z (mod q):

inverseCubedFromInverseSquared double add one inverseSquared =
  (double `andThen` add one) inverseSquared

Also, I found an addition chain for the earlier part of p − 3 that further reduces the number of adds by one. The resulting addition chains save one double (squaring) and two adds (multiplications) when the Y coordinate is not needed, and it saves two adds (multiplications) when the Y coordinate is needed, relative to the one implemented by Gueron & Krasnov for OpenSSL.

Hexadecimal representation of q − 3:: ffffffff00000001000000000000000000000000fffffffffffffffffffffffc
Binary representation of q − 3:: 1111111111111111111111111111111100000000000000000000000000000001000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111100

-- http://csrc.nist.gov/groups/ST/toolkit/documents/dss/NISTReCur.pdf
p256FieldInverseSquaredExponentValue =
  (q - 2) - 1 where q = 2^256 - 2^224 + 2^192 + 2^96 - 1

p256FieldInverseSquaredExponent double add one =
  let
    x1  = one
    x2  = (nth double        1  `andThen` add x1 ) x1
    x3  = (nth double        1  `andThen` add x1 ) x2
    x6  = (nth double        3  `andThen` add x3 ) x3
    x12 = (nth double        6  `andThen` add x6 ) x6
    x15 = (nth double        3  `andThen` add x3 ) x12
    x30 = (nth double       15  `andThen` add x15) x15
    x32 = (nth double        2  `andThen` add x2 ) x30
  in      (nth double (31 +  1) `andThen` add x1   `andThen`
           nth double (96 + 32) `andThen` add x32  `andThen`
           nth double       32  `andThen` add x32  `andThen`
           nth double       30  `andThen` add x30  `andThen`
           nth double        2                   ) x32
-----------------------------------------------------------
-- Total length: 266   =   255 doubles  +  11 adds

P-384 (secp384r1) Field Element Inversion

I could not find any addition chains for field element inversion for the P-384 curve, so I made my own using the same ideas as the one used for P-256. This addition chain calculates the square of the inverse; the inverse and the cube of the inverse can be recovered using the same logic presented for P-256 above.

Hexadecimal representation of q − 3:: fffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffeffffffff0000000000000000fffffffc
Binary representation of q − 3:: 111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111011111111111111111111111111111111000000000000000000000000000000000000000000000000000000000000000011111111111111111111111111111100

-- http://csrc.nist.gov/groups/ST/toolkit/documents/dss/NISTReCur.pdf
p384FieldInverseSquaredExponentValue =
  (q - 2) - 1 where q = 2^384 - 2^128 - 2^96 + 2^32 - 1

p384FieldInverseSquaredExponent double add one =
  let
    x1   = one
    x2   = (nth double        1  `andThen` add x1  ) x1
    x3   = (nth double        1  `andThen` add x1  ) x2
    x6   = (nth double        3  `andThen` add x3  ) x3
    x12  = (nth double        6  `andThen` add x6  ) x6
    x15  = (nth double        3  `andThen` add x3  ) x12
    x30  = (nth double       15  `andThen` add x15 ) x15
    x60  = (nth double       30  `andThen` add x30 ) x30
    x120 = (nth double       60  `andThen` add x60 ) x60
  in       (nth double      120  `andThen` add x120  `andThen`
            nth double       15  `andThen` add x15   `andThen`
            nth double  (1 + 30) `andThen` add x30   `andThen`
            nth double        2  `andThen` add x2    `andThen`
            nth double (64 + 30) `andThen` add x30   `andThen`
            nth double        2                    ) x120
--------------------------------------------------------------
-- Total length: 396   =    383 doubles  +  13 adds

secp256k1 (Bitcoin's Curve) Field Element Inversion

Peter Dettman contributed an optimized addition chain to the libsecp256k1 project back in January of 2014 for secp256k1 field element inversion (and modular square root). His addition chain directly calculates the inverse z⁻¹ = z^{q − 2} (mod q). The addition chain here is that one, modified only by omitting the final multiplication by z (mod q) so that it calculates z⁻² = z^{q − 3} (mod q), following the same logic as the addition chains for P-256 and P-384 field element inversion presented above.

Hexadecimal representation of q − 3:: fffffffffffffffffffffffffffffffffffffffffffffffffffffffefffffc2c
Binary representation of q − 3:: 1111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111011111111111111111111110000101100

-- http://www.secg.org/sec2-v2.pdf
secp256k1FieldInverseSquaredExponentValue =
  (q - 2) - 1 where q = 2^256 - 2^32 - 2^9 - 2^8 - 2^7 - 2^6 - 2^4 - 1

secp256k1FieldInverseSquaredExponent double add one =
  let
    x1  = one
    x2  = (nth double        1  `andThen` add x1    ) x1
    x3  = (nth double        1  `andThen` add x1    ) x2
    x11 = (nth double        3  `andThen` add x3      `andThen`
           nth double        3  `andThen` add x3      `andThen`
           nth double        2  `andThen` add x2    ) x3
    x22 = (nth double       11  `andThen` add x11   ) x11
    x44 = (nth double       22  `andThen` add x22   ) x22
    x88 = (nth double       44  `andThen` add x44   ) x44
  in      (nth double       88  `andThen` add x88     `andThen`
           nth double       44  `andThen` add x44     `andThen`
           nth double        3  `andThen` add x3      `andThen`
           nth double  (1 + 22) `andThen` add x22     `andThen`
           nth double  (4 +  1) `andThen` add x1      `andThen`
           nth double  (1 +  2) `andThen` add x2      `andThen`
           nth double        2                      ) x88
---------------------------------------------------------
-- Total length: 269   =   255 doubles  +  14 adds

secp256k1 (Bitcoin's Curve) Scalar Inversion

ECDSA signing for the secp256k1 curve requires scalar inversion, i.e. computing k⁻¹ = k^{n − 2} (mod n) where n is the curve's group order. Pieter Wuille implemented an addition chain for scalar inversion for this curve based (AFAICT) on the same idea as Peter Dettman's addition chain for field element inversion above, using sliding windows 1, 11₂, and 1111₂. Recently in 2017 Peter Dettman improved it by also making use of the windows 101₂ and 111₂. When I stumbled across that change I suspected that 3-bit windows were unlikely to be optimal. I then improved it by making use of all the previously-unused four-bit windows 1001₂, 1011₂, and 1101₂, resulting in the addition chain here. That improvement relied on constructing all odd numbers between 3 and 13 by constructing 1 + 1 = 2 and then 1 + 2 = 3, and then repeatedly adding 2, which is a well-known technique that is often overlooked. Because so little effort has been expended on optimizing this so far, I think it would probably be relatively easy to further improve it.

Hexadecimal representation of n − 2:: fffffffffffffffffffffffffffffffebaaedce6af48a03bbfd25e8cd036413f
Binary representation of n − 2:: 1111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111010111010101011101101110011100110101011110100100010100000001110111011111111010010010111101000110011010000001101100100000100111111

-- http://www.secg.org/sec2-v2.pdf
secp256k1ScalarInverseExponentValue :: Natural
secp256k1ScalarInverseExponentValue =
  n - 2 where
  n = 0xfffffffffffffffffffffffffffffffebaaedce6af48a03bbfd25e8cd0364141

secp256k1ScalarInverseExponent double add one =
  let
    _1    = one
    _10   =  nth double      1                       _1
    _11   =                               add _10    _1
    _101  =                               add _10    _11
    _111  =                               add _10    _101
    _1001 =                               add _10    _111
    _1011 =                               add _10    _1001
    _1101 =                               add _10    _1011
    x6    = (nth double      2  `andThen` add _1011) _1101
    x8    = (nth double      2  `andThen` add _11  ) x6
    x14   = (nth double      6  `andThen` add x6   ) x8
    x28   = (nth double     14  `andThen` add x14  ) x14
    x56   = (nth double     28  `andThen` add x28  ) x28
  in        (nth double     56  `andThen` add x56    `andThen`
             nth double     14  `andThen` add x14    `andThen`
             nth double      3  `andThen` add _101   `andThen`
             nth double (1 + 3) `andThen` add _111   `andThen`
             nth double (1 + 3) `andThen` add _101   `andThen`
             nth double (1 + 4) `andThen` add _1011  `andThen`
             nth double      4  `andThen` add _1011  `andThen`
             nth double (1 + 3) `andThen` add _111   `andThen`
             nth double (2 + 3) `andThen` add _111   `andThen`
             nth double (2 + 4) `andThen` add _1101  `andThen`
             nth double (1 + 3) `andThen` add _101   `andThen`
             nth double      3  `andThen` add _111   `andThen`
             nth double (1 + 4) `andThen` add _1001  `andThen`
             nth double (3 + 3) `andThen` add _101   `andThen`
             nth double (7 + 3) `andThen` add _111   `andThen`
             nth double (1 + 3) `andThen` add _111   `andThen`
             nth double (1 + 8) `andThen` add x8     `andThen`
             nth double (1 + 4) `andThen` add _1001  `andThen`
             nth double (2 + 4) `andThen` add _1011  `andThen`
             nth double      4  `andThen` add _1101  `andThen`
             nth double (3 + 2) `andThen` add _11    `andThen`
             nth double (2 + 4) `andThen` add _1101  `andThen`
             nth double (6 + 4) `andThen` add _1101  `andThen`
             nth double      4  `andThen` add _1001  `andThen`
             nth double (5 + 1) `andThen` add _1     `andThen`
             nth double (2 + 6) `andThen` add x6   ) x56
-------------------------------------------------------------
-- Total length: 290   =   253 doubles  +  37 adds

P-256 (secp256r1) Scalar Inversion

ECDSA signing for the P-256 curve also requires scalar inversion, which again is the computation of k⁻¹ = k^{n − 2} (mod n) where n is the curve's group order. Vlad Krasnov published C code in April 2015 that implements an addition chain that used the standard bit duplication technique for the easy part at the beginning and then fixed windows (where each window is one 4-bit hex digit) for the hard part at the end. In 2015 I improved his addition chain by switching from fixed windows to sliding windows and then further improved it by looking for longer windows that were a net benefit to construct. In 2016 and again in 2017 I improved it further. Recently after improving the addition chain for secp256k1's n − 2, I realized that the technique of constructing small odd multiples of k by repeatedly adding 2 is also useful, to a limited extent, for this curve's n − 2.

Hexadecimal representation of n − 2:: ffffffff00000000ffffffffffffffffbce6faada7179e84f3b9cac2fc63254f
Binary representation of n − 2:: 1111111111111111111111111111111100000000000000000000000000000000111111111111111111111111111111111111111111111111111111111111111110111100111001101111101010101101101001110001011110011110100001001111001110111001110010101100001011111100011000110010010101001111

-- http://csrc.nist.gov/groups/ST/toolkit/documents/dss/NISTReCur.pdf
p256ScalarInverseExponentValue =
  n - 2
  where
  n = 115792089210356248762697446949407573529996955224135760342422259061068512044369

p256ScalarInverseExponent double add one =
  let
    _1      = one
    _10     = (nth double      1                       ) _1
    _11     =                               add _10      _1
    _101    =                               add _10      _11
    _111    =                               add _10      _101
    _1010   =  nth double      1                         _101
    _1111   =                               add _101     _1010
    _10101  = (nth double      1  `andThen` add _1     ) _1010
    _101010 =  nth double      1                         _10101
    _101111 =                               add _101     _101010
    x6      =                               add _10101   _101010
    x8      = (nth double      2  `andThen` add _11    ) x6
    x16     = (nth double      8  `andThen` add x8     ) x8
    x32     = (nth double     16  `andThen` add x16    ) x16
  in          (nth double (32+32) `andThen` add x32      `andThen`
               nth double     32  `andThen` add x32      `andThen`
               nth double      6  `andThen` add _101111  `andThen`
               nth double (2 + 3) `andThen` add _111     `andThen`
               nth double (2 + 2) `andThen` add _11      `andThen`
               nth double (1 + 4) `andThen` add _1111    `andThen`
               nth double      5  `andThen` add _10101   `andThen`
               nth double (1 + 3) `andThen` add _101     `andThen`
               nth double      3  `andThen` add _101     `andThen`
               nth double      3  `andThen` add _101     `andThen`
               nth double (2 + 3) `andThen` add _111     `andThen`
               nth double (3 + 6) `andThen` add _101111  `andThen`
               nth double (2 + 4) `andThen` add _1111    `andThen`
               nth double (1 + 1) `andThen` add _1       `andThen`
               nth double (4 + 1) `andThen` add _1       `andThen`
               nth double (2 + 4) `andThen` add _1111    `andThen`
               nth double (2 + 3) `andThen` add _111     `andThen`
               nth double (1 + 3) `andThen` add _111     `andThen`
               nth double (2 + 3) `andThen` add _111     `andThen`
               nth double (2 + 3) `andThen` add _101     `andThen`
               nth double (1 + 2) `andThen` add _11      `andThen`
               nth double (4 + 6) `andThen` add _101111  `andThen`
               nth double (    2) `andThen` add _11      `andThen`
               nth double (3 + 2) `andThen` add _11      `andThen`
               nth double (3 + 2) `andThen` add _11      `andThen`
               nth double (2 + 1) `andThen` add _1       `andThen`
               nth double (2 + 5) `andThen` add _10101   `andThen`
               nth double (2 + 4) `andThen` add _1111  ) x32
-----------------------------------------------------------------
-- Total length: 292    =    254 doubles  +  38 adds

P-384 (secp384r1) Scalar Inversion

ECDSA signing for the P-384 curve also requires scalar inversion, i.e. computing k⁻¹ = k^{n − 2} (mod n) where n is the curve's group order. I could not find any previous attempts to construct an optimal addition chain for this number, so I made one based on the same ideas as the one for scalar inversion in P-256. Although I have improved it iteratively over time, I have not spent a significant effort optimizing this. Consequently, of all the addition chains described here, this would probably the easiest one to improve and it is probably the one that can be improved the most.

Hexadecimal representation of n − 2:: ffffffffffffffffffffffffffffffffffffffffffffffffc7634d81f4372ddf581a0db248b0a77aecec196accc52971
Binary representation of n − 2:: 111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111110001110110001101001101100000011111010000110111001011011101111101011000000110100000110110110010010010001011000010100111011110101110110011101100000110010110101011001100110001010010100101110001

-- http://csrc.nist.gov/groups/ST/toolkit/documents/dss/NISTReCur.pdf
p384ScalarInverseExponentValue =
  n - 2
  where n = 39402006196394479212279040100143613805079739270465446667946905279627659399113263569398956308152294913554433653942643

p384ScalarInverseExponent double add one =
  let
    _1    = one
    _10   = (nth double      1                     ) _1
    _11   =                               add _10    _1
    _101  =                               add _10    _11
    _111  =                               add _10    _101
    _1001 =                               add _10    _111
    _1011 =                               add _10    _1001
    _1101 =                               add _10    _1011
    _1111 =                               add _10    _1101
    x8    = (nth double      4  `andThen` add _1111) _1111
    x16   = (nth double      8  `andThen` add x8   ) x8
    x32   = (nth double     16  `andThen` add x16  ) x16
    x96   = (nth double     32  `andThen` add x32    `andThen`
             nth double     32  `andThen` add x32  ) x32
  in        (nth double     96  `andThen` add x96    `andThen`
             nth double      2  `andThen` add _11    `andThen`
             nth double (3 + 3) `andThen` add _111   `andThen`
             nth double (1 + 2) `andThen` add _11    `andThen`
             nth double (3 + 2) `andThen` add _11    `andThen`
             nth double (1 + 4) `andThen` add _1001  `andThen`
             nth double      4  `andThen` add _1011  `andThen`
             nth double (6 + 4) `andThen` add _1111  `andThen`
             nth double      3  `andThen` add _101   `andThen`
             nth double (4 + 1) `andThen` add _1     `andThen`
             nth double      4  `andThen` add _1011  `andThen`
             nth double      4  `andThen` add _1001  `andThen`
             nth double (1 + 4) `andThen` add _1101  `andThen`
             nth double      4  `andThen` add _1101  `andThen`
             nth double      4  `andThen` add _1111  `andThen`
             nth double (1 + 4) `andThen` add _1011  `andThen`
             nth double (6 + 4) `andThen` add _1101  `andThen`
             nth double (5 + 4) `andThen` add _1101  `andThen`
             nth double      4  `andThen` add _1011  `andThen`
             nth double (2 + 4) `andThen` add _1001  `andThen`
             nth double (2 + 1) `andThen` add _1     `andThen`
             nth double (3 + 4) `andThen` add _1011  `andThen`
             nth double (4 + 3) `andThen` add _101   `andThen`
             nth double (2 + 3) `andThen` add _111   `andThen`
             nth double (1 + 4) `andThen` add _1111  `andThen`
             nth double (1 + 4) `andThen` add _1011  `andThen`
             nth double      4  `andThen` add _1011  `andThen`
             nth double (2 + 3) `andThen` add _111   `andThen`
             nth double (1 + 2) `andThen` add _11    `andThen`
             nth double (5 + 2) `andThen` add _11    `andThen`
             nth double (2 + 4) `andThen` add _1011  `andThen`
             nth double (1 + 3) `andThen` add _101   `andThen`
             nth double (1 + 2) `andThen` add _11    `andThen`
             nth double (2 + 2) `andThen` add _11    `andThen`
             nth double (2 + 2) `andThen` add _11    `andThen`
             nth double (3 + 3) `andThen` add _101   `andThen`
             nth double (2 + 3) `andThen` add _101   `andThen`
             nth double (2 + 3) `andThen` add _101   `andThen`
             nth double      2  `andThen` add _11    `andThen`
             nth double (3 + 1) `andThen` add _1   ) x96
-------------------------------------------------------------
-- Total length: 433   =   381 doubles  +  52 adds

Curve25519 Scalar Inversion

Back-Maxwell Rangeproofs require scalar inversion, and some people are interested in implementing them using Curve25519. Curve25519 isn't a prime order curve, but it is sufficient to restrict the domain and range of the inversion to the largest (prime order) subgroup so that Fermat's Little Theorem can be used. All the implementations of this I've seen up to this point have used a simple one-bit square-and-multiply ladder that requires 251 squarings and 72 multiplications. The addition chain here requires only 250 squarings and 34 multiplications. Again, very little time has been spent optimizing this addition chain, and I expect improvements should be easy to find.

Hexadecimal representation of n − 2:: 1000000000000000000000000000000014def9dea2f79cd65812631a5cf5d3eb
Binary representation of n − 2:: 1000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000010100110111101111100111011110101000101111011110011100110101100101100000010010011000110001101001011100111101011101001111101011

curve25519ScalarInverseExponentValue =
  n - 2
  where n = 2^252 + 27742317777372353535851937790883648493

curve25519ScalarInverseExponent double add one =
  let
    _1     = one
    _10    = (nth double      1                        ) _1
    _100   = (nth double      1                        ) _10
    _11    =                                 add _10     _1
    _101   =                                 add _10     _11
    _111   =                                 add _10     _101
    _1001  =                                 add _10     _111
    _1011  =                                 add _10     _1001
    _1111  =                                 add _100    _1011
    _10000 =                                 add _1      _1111
  in         (nth double (123 + 3) `andThen` add _101   `andThen`
              nth double (  2 + 2) `andThen` add _11    `andThen`
              nth double (  1 + 4) `andThen` add _1111  `andThen`
              nth double (  1 + 4) `andThen` add _1111  `andThen`
              nth double (      4) `andThen` add _1001  `andThen`
              nth double (      2) `andThen` add _11    `andThen`
              nth double (  1 + 4) `andThen` add _1111  `andThen`
              nth double (  1 + 3) `andThen` add _101   `andThen`
              nth double (  3 + 3) `andThen` add _101   `andThen`
              nth double (      3) `andThen` add _111   `andThen`
              nth double (  1 + 4) `andThen` add _1111  `andThen`
              nth double (  2 + 3) `andThen` add _111   `andThen`
              nth double (  2 + 2) `andThen` add _11    `andThen`
              nth double (  1 + 4) `andThen` add _1011  `andThen`
              nth double (  2 + 4) `andThen` add _1011  `andThen`
              nth double (  6 + 4) `andThen` add _1001  `andThen`
              nth double (  2 + 2) `andThen` add _11    `andThen`
              nth double (  3 + 2) `andThen` add _11    `andThen`
              nth double (  3 + 2) `andThen` add _11    `andThen`
              nth double (  1 + 4) `andThen` add _1001  `andThen`
              nth double (  1 + 3) `andThen` add _111   `andThen`
              nth double (  2 + 4) `andThen` add _1111  `andThen`
              nth double (  1 + 4) `andThen` add _1011  `andThen`
              nth double (      3) `andThen` add _101   `andThen`
              nth double (  2 + 4) `andThen` add _1111  `andThen`
              nth double (      3) `andThen` add _101   `andThen`
              nth double (  1 + 2) `andThen` add _11)   _10000
-----------------------------------------------------------------
-- Total length: 284    =     250 doubles  +  34 adds

Verifying the Addition Chains

AdditionChainComputation exposes the denote function that allows us to evaluate the value of an addition chain. Using it, it is easy to verify that an addition chain is structured correctly. All of the addition chains here were checked this way. Here, invalid will return a list of the names of the chains that aren't valid.

invalid = [name | (name, value, chain) <- chains, invalid' value chain]
  where
  invalid' value chain = value /= denote chain
  chains =
    [("curve25519FieldInverseExponent",
      curve25519FieldInverseExponentValue,
      curve25519FieldInverseExponent),

     ("p256FieldInverseSquaredExponent",
      p256FieldInverseSquaredExponentValue,
      p256FieldInverseSquaredExponent),

     ("p384FieldInverseSquaredExponent",
      p384FieldInverseSquaredExponentValue,
      p384FieldInverseSquaredExponent),

     ("secp256k1FieldInverseSquaredExponent",
      secp256k1FieldInverseSquaredExponentValue,
      secp256k1FieldInverseSquaredExponent),

     ("secp256k1ScalarInverseExponent",
      secp256k1ScalarInverseExponentValue,
      secp256k1ScalarInverseExponent),

     ("p256ScalarInverseExponent",
      p256ScalarInverseExponentValue,
      p256ScalarInverseExponent),

     ("p384ScalarInverseExponent",
      p384ScalarInverseExponentValue,
      p384ScalarInverseExponent),

     ("curve25519ScalarInverseExponent",
      curve25519ScalarInverseExponentValue,
      curve25519ScalarInverseExponent)
    ]