An Introduction & Supplement to Knuth's Introduction to Addition Chains
(See Addition Chains as Polymorphic Higher-order Functions for a different way of working with addition chains that works for very large numbers.)
Let's learn a bit about addition chains. Addition chains are representations of positive numbers, constructed from only the value 1 and the addition operation as building blocks. The best introduction to addition chains is Section 4.6.3 of The Art of Computer Programming, Volume 2: Seminumerical Algorithms (3rd Edition) by Donald Knuth. The presentation here is intended to supplement what Knuth wrote. The nondescript names of the functions r
, d
, f
, λ
, and v
are consistent with other descriptions of addition chains, Knuth's in particular.
I wrote some Haskell code to help with our exploration. I didn't use any language extensions beyond GHCi 8.0.1's defaults, and I didn't use any library code outside the base package. I lack a proper background in algebra so I avoided abstractions like Semigroup
, Functor
, and Monad
. I didn't even use folds and maps. The result is the module AdditionChainConstruction
in the file AdditionChainConstruction.lhs. I use binary literal notation in the discussion (e.g. 0b101 == 5
), but not in the code. You can enable this at the GHCi prompt with :set -XBinaryLiterals
or with {-# LANGUAGE BinaryLiterals #-}
at the top of your source files, but it isn't required.
module AdditionChainConstruction(
Peano(..), peano_42,
DoubleAndAdd1(..), doubleAndAdd1_42, doubleAndAdd1_45,
AdditionChain(..), _45, add, double,
Denotable(denote),
ToBits(bits), bitString,
r, d, f, λ, lambda, v,
Components, Measurable,
) where
import Data.Bits
import Data.Char(intToDigit)
import Data.List(nub)
import Numeric(showIntAtBase)
import Numeric.Natural
The simplest representation of positive integers is the Peano encoding, which uses only the unit 1
and an n + 1
operation. (In other presentations of Peano numbers, the n + 1
operation is usually called succ
for successor. Here we call it PAdd1
for symmetry with what we do later. Also, the unit value for Peano numbers is often defined to be 0, but we use 1 here because 0 isn't a posiive integer.)
data Peano
= POne
| PAdd1 Peano
deriving (Show, Eq)
POne
represents 1, PAdd1 POne
represents 2, PAdd1 (PAdd1 POne)
represents 3, PAdd1 (PAdd1 (PAdd1 (PAdd1 POne)))
represents 5, etc. We can say this in code by creating a function that maps a Peano
to the natural number it represents. We'll call this function denote
since it is a sort of denotational semantics for our representations, and we'll make it generic so that we can implement it for every type of representation of positive integers we make:
class Denotable a where
denote :: a -> Natural
We can now define the meaning of every Peano
value:
instance Denotable Peano where
denote POne = 1
denote (PAdd1 n_minus_1) = (denote n_minus_1) + 1
Let's construct the Peano representation of 42:
peano_42 :: Peano
peano_42 =
PAdd1 (PAdd1 (PAdd1 (PAdd1 (PAdd1 (PAdd1 (PAdd1 (PAdd1 (PAdd1 (PAdd1 (
PAdd1 (PAdd1 (PAdd1 (PAdd1 (PAdd1 (PAdd1 (PAdd1 (PAdd1 (PAdd1 (PAdd1 (
PAdd1 (PAdd1 (PAdd1 (PAdd1 (PAdd1 (PAdd1 (PAdd1 (PAdd1 (PAdd1 (PAdd1 (
PAdd1 (PAdd1 (PAdd1 (PAdd1 (PAdd1 (PAdd1 (PAdd1 (PAdd1 (PAdd1 (PAdd1 (
PAdd1 POne))))))))))))))))))))))))))))))))))))))))
We can verify that peano_42
is the correct encoding of 42 by checking denote peano_42 == 42
.
It takes n - 1
applications of PAdd1
to represent the number n
. For example, peano_42
has POne
nested inside 41 layers of PAdd1
. Representing even a tiny number like 42 is unwieldy with the Peano encoding. It is impractical for encoding large numbers. Let's define a new representation similar to Peano
, but with an additional doubling operation n + n
that augments the Peano n + 1
operation:
data DoubleAndAdd1
= DA1One
| DA1Double DoubleAndAdd1
| DA1DoubleAndAdd1 DoubleAndAdd1
deriving (Show, Eq)
instance Denotable DoubleAndAdd1 where
denote DA1One = 1
denote (DA1Double n) = denoteDouble n
denote (DA1DoubleAndAdd1 n) = (denoteDouble n) + 1
denoteDouble n = d + d where d = denote n
Let's represent 42 as a double-and-add-1 chain:
doubleAndAdd1_42 =
let
__1 = DA1One -- given 1: 1
__2 = DA1Double __1 -- append 0: 10
__5 = DA1DoubleAndAdd1 __2 -- append 1: 101
_10 = DA1Double __5 -- append 0: 1010
_21 = DA1DoubleAndAdd1 _10 -- append 1: 10101
_42 = DA1Double _21 -- append 0: 101010
in
_42 -- 42 == 0b101010
We can verify that doubleAndAdd1_42
is a representation of 42 by checking denote doubleAndAdd1_42 == 42
.
It is safe to assume doubling is not more expensive than adding one. (In many applications of addition chains, doubling is actually cheaper than other forms of addition.) Note that a DA1DoubleAndAdd1
represents two additions, combined only to prevent an addition of 1 without the doubling that must precede it. This double-and-add-one chain only requires 5 doublings + 2 other additions = 7 additions to represent 42, which is much more efficient than the Peano encoding, which required 41 additions.
class Measurable a where
-- The number of doublings in a single component.
d' :: a -> Count
-- The number of non-doublings in a single component.
f' :: a -> Count
-- The number of additions of any sort in a single component.
r' :: Measurable a => a -> Count
r' n = d' n + f' n
type Count = Int -- Differentiate our metadata numbers.
instance Measurable Peano where
d' _ = 0 -- Peano doesn't have the concept of doubling.
f' POne = 0
f' (PAdd1 n_minus_1) = (f' n_minus_1) + 1
instance Measurable DoubleAndAdd1 where
d' DA1One = 0
d' (DA1Double _) = 1
d' (DA1DoubleAndAdd1 _) = 1
f' DA1One = 0
f' (DA1Double _) = 0
f' (DA1DoubleAndAdd1 _) = 1
It's useful to be able to see our representations of positive integers as bit strings so let's define some tools to facilitate that:
-- A representation of a positive number in terms of 'Data.Bits'.
class ToBits a where
-- Converts the representation to one that is easier to look at bitwise.
-- Ideally the result type of 'bits' would be 'Bits b => b', but it is
-- restricted to 'Natural' here for the reason mentioned below.)
bits :: a -> Natural
-- Construct a string of the binary representation of 'a'.
bitString :: ToBits a => a -> String
bitString n = showBin (bits n) ""
where
showBin x s = showIntAtBase 2 intToDigit x s -- Internet copy-pasta.
-- Haskell doesn't provide a convenient way to see the bits of a 'Natural'.
-- Make 'bitString' work for 'Natural' too, to rectify this. For example,
-- bitString (42::Natural) == "101010".
instance ToBits Natural where
bits x = x
Every component in the double-and-add-one representation appends a single binary digit to the result; every DA1Double
appends a zero and every DA1DoubleAndAdd1
appends a one:
instance ToBits DoubleAndAdd1 where
bits d = bits' 0 d
where
bits' i DA1One = bit i -- Set the most significant bit.
bits' i (DA1Double d) = (bits' (i + 1) d)
bits' i (DA1DoubleAndAdd1 d) = setBit (bits' (i + 1) d) i
Note that bits d
computes the same results as denote d
, i.e. bits d == denote d
for all d
. bits
does it with bitwise operations (directly setting bits) whereas DoubleAndAdd1
's denote
implementation does it arithmetically (using addition). When trying to find the most efficient addition chain to construct a number, it will often be useful to alternate between thinking in terms of bitwise operations and arithmetic operations.
A Peano representation of a number n has Θ(n) complexity, whereas the double-and-add-one representation has Θ(lg n) complexity. Why is the double-and-add-one representation so much more efficient? There are roughly λ(n) = ⌊lg n⌋ doublings in the double-and-add-one representation. That is, there will be one doubling for every bit in the number, except for the first bit. The ith doubling itself gets doubled λ(n) - i times, which means that such a doubling is equivalent to (roughly) 2λ(n) - i Peano PAdd1
operations. Summing this series will result in a value that's very close to n
already, which means there is not much more work to do. In fact, no more than v(n)
additions are needed in addition to the doubles, where v(n)
is the number of set bits in n
, i.e. its binary Hamming weight.
-- |The length in bits of the given number; aliased as 'lambda'.
λ :: ToBits a => a -> Count
λ n = floor (logBase 2 (fromIntegral (bits n))) -- Internet copy-pasta.
-- |The length in bits of the given number; an easy-to-type alias for 'λ'.
lambda :: ToBits a => a -> Count
lambda = λ
-- |The number of set bits in the given number; its binary Hamming weight.
v :: ToBits a => a -> Count
v n = popCount (bits n)
Doubling is the most powerful possible operation. We always start with only the value 1 (DA1One
) given to us. The only way we can get to 2 from 1 is by adding 1 to itself, i.e. doubling it. Then we have 1 and 2 available. We can construct 3 by adding 1 + 2
and we can construct 4 by doubling 2. But there's no way to use a single addition, given just 1 and 2, that will result in a value larger than 4. Thus the introduction of doubling is the biggest efficiency improvement we can possibly make.
We can still make improvements, but they will be much smaller than the introduction of doubling. The two operations we've done so far are doubling and adding one. Doubling (n + n
) is the biggest step we can take, and adding one (n + 1
) is the smallest step we can take. Are there medium-sized steps? What if, instead of just adding n + n
or n + 1
, we could add any two previously-constructed values a + b
?
It turns out that for the number 42, the double-and-add-1 chain we developed is the best we can do. But it is not the best choice for every number. Let's consider 45:
doubleAndAdd1_45 =
let
__1 = DA1One -- given 1: 1
__2 = DA1Double __1 -- append 0: 10
__5 = DA1DoubleAndAdd1 __2 -- append 1: 101
_11 = DA1DoubleAndAdd1 __5 -- append 1: 1011
_22 = DA1Double _11 -- append 0: 10110
_45 = DA1DoubleAndAdd1 _22 -- append 1: 101101
in
_45 -- 45 == 0b101101
We can verify that this is really a representation of 45 by checking denote doubleAndAdd1_45 == 45
.
The double-and-add-1 chain for 45 requires 5 doublings and 3 additions of 1. Now let's try our generalized addition idea to see how we can construct 45 more efficiently:
data AdditionChain
= One
| Add AdditionChain AdditionChain
deriving (Eq, Show)
instance Denotable AdditionChain where
denote One = 1
denote (Add a b) = (denote a) + (denote b)
Although this new structure doesn't differentiate between doubles and adds, when constructing an addition chain it will be clearer if we indicate which additions are doubles, especially since we count doubles and adds separately.
double a = Add a a
add a b = Add a b
instance Measurable AdditionChain where
d' One = 0
d' (Add a b) | a == b = 1
| otherwise = 0
f' One = 0
f' (Add a b) | a == b = 0
| otherwise = 1
Now let's use it to create an efficient representation of 45:
_45 =
let
__1 = One -- given 1: 1
__2 = double __1 -- append 0: 10
__4 = double __2 -- append 0: 100
__5 = add __4 __1 -- add 1: 101
_10 = double __5 -- append 0: 1010
_20 = double _10 -- append 0: 10100
_40 = double _20 -- append 0: 101000
_45 = add _40 __5 -- add 0b101: 101101
in
_45 -- 45 == 0b101101
As we've done before, we can verify that _45
really represents 45 by checking denote _45 == 45
.
This new representation uses the same number of doublings (5) as the double-and-add-1 representation, but only uses 2 additions instead of 3, a savings of one addition. Looking at the addition chain arithmetically, we constructed the value 5 just like before and then reuse the value 5 later in the computation, after we construct 40, to compute 45 as 40 + 5
. Looking at it bitwise, we can think of the doubling operations as each appending one bit of empty (zero-valued) space, and addition as overwriting that space. In particular, the last three doublings reserve 3 bits of space 0b000
and the final addition of 5 (0b101
) overwrites that space with 0b101
2. Since 0b101
has two bits set, adding it does the same amount of work as two additions of 1.
We can verify that _45
is one addition shorter than doubleAndAdd1_45
by verifying r _45 == (r doubleAndAdd1_45) - 1
; we can verify they have the same number of doubles by checking d _45 == d doubleAndAdd1_45
, and we can verify that _45
has one fewer non-doubling addition than doubleAndAdd1_45
checking f _45 == (f doubleAndAdd1_45) - 1
:
-- |The length of the representation; i.e. 'r n == d n + f n'.
r :: (Components a, Measurable a) => a -> Count
r n = d n + f n
-- |The total number of doubles in a representation; 'd n == r n - f n'.
d :: (Components a, Measurable a) => a -> Count
d n = sum [d' s | s <- components n]
-- |The number of non-doubles in a representation; 'f n == r n - d n'.
f :: (Components a, Measurable a) => a -> Count
f n = sum [f' s | s <- components n]
-- The set of unique components in a representation.
components :: Components a => a -> [a]
components n = nub (components' n)
class Eq a => Components a where
-- Maps a representation to a list of all its components. There may be
-- duplicates in the result if a step is used multiple times in the
-- representation.
components' :: a -> [a]
instance Components Peano where
components' POne = [POne]
components' s@(PAdd1 n_minus_1) = s:(components' POne)
instance Components DoubleAndAdd1 where
components' DA1One = [DA1One]
components' s@(DA1Double d) = s:(components' d)
components' s@(DA1DoubleAndAdd1 d) = s:(components' d)
instance Components AdditionChain where
components' One = [One]
components' s@(Add a b) =
s:(components' a) ++ components' b
Saving one addition in this example is a tiny benefit compared to how much introducing doubling helped. However, for very large numbers such as those used in public key cryptography, these savings seem to scale up approximately proportionally with the increase in size of the numbers involved.
Double-and-add-1 chains have a convenient mapping to bit strings, but general addition chains do not. This matches the situation with arithmetic on positive integers. A double step is just a left shift by one bit, and a double-and-add-1 step is just a left shift by one bit followed by setting the least significant bit. We can do this in our heads without too much trouble. In contrast, adding two arbitrary positive integers is much more complicated because we have to carry bits. Although we can add 13 + 3
in our heads, it is not easy to visualize the bitwise effects because of carries:
1101 + 11 ------ 10000
Since it is non-trivial to implement addition in terms of setting bits, let's just (ab)use the previously-noted equivalence between bits
and denote
to define the mapping from addition chains to bits:
instance ToBits AdditionChain where
-- The return type of 'bits' was artificially constrained to 'Natural'
-- above just to allow this to typecheck.
bits d = denote d
At this point the AdditionChainConstruction
can do three useful things: It can help us (manually) construct efficient addition chains, it can help us verify that an addition chain we construct for a particular value actually represents that value, and it can help us measure the complexity of our addition chains. This is all we need to start experimenting with creating efficient addition chains for whatever numbers we find interesting. From here you can read the resources recommended below and start solving these unsolved problems in the overlapping fields of computer science, additive number theory, algebra, and cryptography:
-
For a given number
n
, what is a practical algorithm for constructing an addition chain denotingn
that is more efficient than the addition chains generated forn
by any previously known algorithm for automatically constructing addition chains? -
For a given number
n
, what is a practical algorithm for verifying whether an addition chainc
is the most efficient addition chain possible forn
? -
For a given number
n
, what is a practical algorithm for constructing the most efficient addition chain possible forn
?
Recommended Reading
- Section 4.6.3 of The Art of Computer Programming, Volume 2: Seminumerical Algorithms (3rd Edition) by Donald Knuth is the best introduction to addition chains.
-
Shortest Addition Chains by Achim Flammenkamp is the most comprehensive resource regarding addition chains, with an awesome bibliography and reports on recent research.