Lazy Binary Numbers

Posted on March 21, 2019
Part 1 of a 1-part series on Binary Numbers
Tags: Agda, Haskell

Number Representations

When working with numbers in Agda, we usually use the following definition:

data N = Z | S N deriving (Eq, Ord)

instance Num N where
    Z + n = n
    S n + m = S (n + m)

    Z * m = Z
    S n * m = m + n * m
data: Set where
  zero :
  suc :

_+_ :
zero  + y = y
suc x + y = suc (x + y)

_*_ :
zero  * y = zero
suc x * y = y + (x * y)



In Haskell it’s less common, for obvious reasons:

Operation Complexity
n+mn + m 𝒪(n)\mathcal{O}(n)
n×mn \times m 𝒪(nm)\mathcal{O}(nm)

Why use them at all, then? Well, in Agda, we need them so we can prove things about the natural numbers. Machine-level integers are fast, but they’re opaque: their implementation isn’t written in Agda, and therefore it’s not available for the compiler to reason about.

In Haskell, they occasionally find uses due to their laziness. This can help in Agda as well. By lazy here I mean that operations on them don’t have to inspect the full structure before giving some output.

>>> Z < S undefined
*-zeroˡ :  x  zero * x ≡ zero
*-zeroˡ x = refl

In Haskell, as we can see, this lets us run computations without scrutinising some arguments. Agda benefits similarly: here it lets the compiler see more “obvious” facts that it may have missed otherwise.

It’s not completely lazy, though. In particular, it tends to be left-biased:

>>> undefined * Z == Z
** Exception: Prelude.undefined
*-zeroʳ :  x  x * zero ≡ zero
*-zeroʳ x = refl
-- x * zero != zero of type ℕ

Like Boolean short-circuiting operators, operations on Peano numbers will usually have to scrutinise the left-hand-side argument quite a bit before giving an output.

So, Peano numbers are good because:

  1. We can prove things about them.
  2. They’re lazy.

In this post, I’m going to look at some other number representations that maintain these two desirable properties, while improving on the efficiency somewhat.


The first option for an improved representation is binary numbers. We can represent binary numbers as a list of bits:

data Bit = O | I deriving (Eq, Show, Ord)

type B = [Bit]
data Bit : Set where O I : Bit

𝔹 : Set
𝔹 = List Bit

As we’re using these to represent natural numbers, we’ll need to define a way to convert between them:

eval :: B -> N
eval = foldr f Z
    f O xs = xs + xs
    f I xs = S (xs + xs)

inc :: B -> B
inc [] = [I]
inc (O:xs) = I : xs
inc (I:xs) = O : inc xs

fromN :: N -> B
fromN Z = []
fromN (S n) = inc (fromN n)
_⇓⟧ : 𝔹 
_⇓⟧ = foldr  { O xs  xs + xs
                ; I xs  suc (xs + xs) })

inc : 𝔹  𝔹
inc [] = I ∷ []
inc (O ∷ xs) = I ∷ xs
inc (I ∷ xs) = O ∷ inc xs

_⇑⟧ : 𝔹
⟦ zero  ⇑⟧ = []
⟦ suc n ⇑⟧ = inc ⟦ n ⇑⟧

And here we run into our first problem: redundancy. There are multiple ways to represent the same number according to the semantics defined above. We can actually prove this in Agda:

redundant : ∃₂ λ x y  x ≢ y × ⟦ x ⇓⟧ ≡ ⟦ y ⇓⟧
redundant = [] , O ∷ [] ,  ()) , refl

In English: “There are two binary numbers which are not the same, but which do evaluate to the same natural number”. (This proof was actually automatically filled in for me after writing the signature)

This represents a huge problem for proofs. It means that even simple things like x×0=0x \times 0 = 0 aren’t true, depending on how multiplication is implemented. On to our next option:


Instead of looking at the bits directly, let’s think about a binary number as a list of chunks of 0s, each followed by a 1. In this way, we simply can’t have trailing zeroes, because the definition implies that every number other than 0 ends in 1.

data Gap = Z | S Gap
type B = [Gap]
𝔹 : Set
𝔹 = List ℕ

This guarantees a unique representation. As in the representation above, it has much improved time complexities for the familiar operations:

Operation Complexity
n+mn + m 𝒪(log2n)\mathcal{O}(\log_2 n)
n×mn \times m 𝒪(log2(n+m))\mathcal{O}(\log_2 (n + m))

Encoding the zeroes as gaps also makes multiplication much faster in certain cases: multiplying by a high power of 2 is a constant-time operation, for instance.

It does have one disadvantage, and it’s to do with the increment function:

inc :: B -> B
inc = uncurry (flip (:)) . inc'
    inc' [] = ([], Z)
    inc' (x:xs) = inc'' x xs
    inc'' Z ns = fmap S (inc' ns)
    inc'' (S n) ns = (n:ns,Z)
𝔹⁺ : Set
𝔹⁺ = ℕ × 𝔹

inc : 𝔹  𝔹
inc = uncurry __ ∘ inc′
  module Inc where
    inc′ : 𝔹  𝔹⁺
    inc′ [] = 0 , []
    inc′ (x ∷ xs) = inc″ x xs

    inc″ : 𝔹  𝔹⁺
    inc″ zero ns = map₁ suc (inc′ ns)
    inc″ (suc n) ns = 0 , n ∷ ns

With all of their problems, Peano numbers performed this operation in constant time. The above implementation is only amortised constant-time, though, with a worst case of 𝒪(log2n)\mathcal{O}(\log_2 n) (same as the list-of-bits version). There are a number of ways to remedy this, the most famous being:

Skew Binary

This encoding has three digits: 0, 1, and 2. To guarantee a unique representation, we add the condition that there can be at most one 2 in the number, which must be the first non-zero digit if it’s present.

To represent this we’ll encode “gaps”, as before, with the condition that if the second gap is 0 it actually represents a 2 digit in the preceding position. That weirdness out of the way, we are rewarded with an inc implementation which is clearly 𝒪(1)\mathcal{O}(1).

inc :: B -> B
inc [] = Z : []
inc (x:[]) = Z : x : []
inc (x  : Z : xs) = S x : xs
inc (x1 : S x2 : xs) = Z : x1 : x2 : xs
inc : 𝔹  𝔹
inc [] = 0 ∷ []
inc (x ∷ []) = 0 ∷ x ∷ []
inc (x₁ ∷ zero ∷ xs) = suc x₁ ∷ xs
inc (x₁ ∷ suc x₂ ∷ xs) = 0 ∷ x₁ ∷ x₂ ∷ xs

Unfortunately, though, we’ve lost the other efficiencies! Addition and multiplication have no easy or direct encoding in this system, so we have to convert back and forth between this and regular binary to perform them.


The key problem with incrementing in the normal binary system is that it can cascade: when we hit a long string of 1s, all the 1s become 0 followed by a single 1. We can turn this problem to our advantage if we use a representation which encodes both 1s and 0s as strings of gaps. We’ll have to use a couple more tricks to ensure a unique representation, but all in all this is what we have (switching to just Agda now):

data 0≤_ (A : Set) : Set where
  0₂  : 0≤ A
  0<_ : A  0≤ A

  record 𝔹₀ : Set where
    constructor _0&_
      H₀ :
      T₀ : 𝔹₁

  record 𝔹₁ : Set where
    constructor _1&_
      H₁ :
      T₁ : 0≤  𝔹₀
open 𝔹₀ public
open 𝔹₁ public

data 𝔹⁺ : Set where
  B₀_ : 𝔹₀  𝔹⁺
  B₁_ : 𝔹₁  𝔹⁺

𝔹 : Set
𝔹 = 0≤ 𝔹⁺

inc⁺ : 𝔹  𝔹⁺
inc⁺ 0₂                               =      B₁ 0     1& 0₂
inc⁺ (0< B₀ zero  0& y 1& xs        ) =      B₁ suc y 1& xs
inc⁺ (0< B₀ suc x 0& y 1& xs        ) =      B₁ 0     1& 0< x 0& y 1& xs
inc⁺ (0< B₁ x 1& 0₂                 ) = B₀ x 0& 0     1& 0₂
inc⁺ (0< B₁ x 1& 0< zero  0& z 1& xs) = B₀ x 0& suc z 1& xs
inc⁺ (0< B₁ x 1& 0< suc y 0& z 1& xs) = B₀ x 0& 0     1& 0< y 0& z 1& xs

inc : 𝔹  𝔹
inc x = 0< inc⁺ x

Perfect! Increments are obviously 𝒪(1)\mathcal{O}(1), and we’ve guaranteed a unique representation.

I’ve been working on this type for a couple of days, and you can see my code here. So far, I’ve done the following:

Defined inc, addition, and multiplication

These were a little tricky to get right (addition is particularly hairy), but they’re all there, and maximally lazy.

Proved Homomorphism

For each one of the functions, you want them to correspond precisely to the equivalent functions on Peano numbers. Proving that fact amounts to filling in definitions for the following:

inc-homo :  x  ⟦ inc x ⇓⟧ ≡ suc ⟦ x ⇓⟧
+-homo :  x y  ⟦ x + y ⇓⟧ ≡ ⟦ x ⇓⟧ + ⟦ y ⇓⟧
*-homo :  x y  ⟦ x * y ⇓⟧ ≡ ⟦ x ⇓⟧ * ⟦ y ⇓⟧
Proved Bijection

As we went to so much trouble, it’s important to prove that these numbers form a one-to-one correspondence with the Peano numbers. As well as that, once done, we can use it to prove facts about the homomorphic functions above, by reusing any proofs about the same functions on Peano numbers. For instance, here is a proof of commutativity of addition:

+-comm :  x y  x + y ≡ y + x
+-comm x y = injective (+-homo x y ⟨ trans ⟩
.+-comm ⟦ x ⇓⟧ ⟦ y ⇓⟧ ⟨ trans ⟩
                        sym (+-homo y x))


So now that we have our nice number representation, what can we do with it? One use is as a general-purpose number type in Agda: it represents a good balance between speed and “proofiness”, and Coq uses a similar type in its standard library.

There are other, more unusual uses of such a type, though.

Data Structures

It’s a well-known technique to build a data structure out of some number representation (Hinze 1998): in fact, all of the representations above are explored in Okasaki (1999, chap. 9.2).

Logic Programming

Logic programming languages like Prolog let us write programs in a backwards kind of way. We say what the output looks like, and the unifier will figure out the set of inputs that generates it.

In Haskell, we have a very rough approximation of a similar system: the list monad.

pyth :: [(Int,Int,Int)]
pyth = do
  x <- [1..10]
  y <- [1..10]
  z <- [1..10]
  guard (x*x + y*y == z*z)
  return (x,y,z)

There are tons of inefficiencies in the above code: for us, though, we can look at one: the number representation. In the equation:

x2+y2=z2x^2 + y^2 = z^2

If we know that xx and yy are both odd, then zz must be even. If the calculation of the equation is expensive, this is precisely the kind of shortcut we’d want to take advantage of. Luckily, our binary numbers do just that: it is enough to scrutinise just the first bits of xx and yy in order to determine the first bit of the output.

After seeing that example, you may be thinking that lazy evaluation is a perfect fit for logic programming. You’re not alone! Curry (Hanus (ed.) 2016) is a lazy, functional logic programming language, with a similar syntax to Haskell. It also uses lazy binary numbers to optimise testing.

Lazy Predicates

In order for queries to be performed efficiently on binary numbers, we will also need a way to describe lazy predicates on them. A lot of these predicates are more easily expressible on the list-of-bits representation above, so we’ll be working with that representation for this bit. Not to worry, though: we can convert from the segmented representation into the list-of-bits, and we can prove that the conversion is injective:

toBits-injective :  xs ys  toBits xs ≡ toBits ys  xs ≡ ys

Here’s the curious problem: since our binary numbers are expressed least-significant-bit-first, we have to go to the end before knowing which is bigger. Luckily, we can use one of my favourite Haskell tricks, involving the ordering monoid:

data Ordering : Set where
  lt eq gt : Ordering

__ : Ordering  Ordering  Ordering
lt ∙ y = lt
eq ∙ y = y
gt ∙ y = gt

cmpBit : Bit  Bit  Ordering
cmpBit O O = eq
cmpBit O I = lt
cmpBit I O = gt
cmpBit I I = eq

compare : Bits  Bits  Ordering
compare [] [] = eq
compare [] (__) = lt
compare (__) [] = gt
compare (x ∷ xs) (y ∷ ys) = compare xs ys ∙ cmpBit x y

Thanks to laziness, this function first compares the length of the lists, and then does a lexicographical comparison in reverse only if the lengths are the same. This is exactly what we want for our numbers.

Future Posts

That’s all I have for now, but I’m interested to formalise the laziness of these numbers in Agda. Usually that’s done with coinduction: I would also like to see the relationship with exact real arithmetic.

I wonder if it can be combined with O’Connor (2016) to get some efficient proof search algorithms, or with Escardo (2014) to get more efficient exhaustive search.


Escardo, Martin. 2014. “Seemingly Impossible Constructive Proofs | Mathematics and Computation.” Mathematics and Computation.

Hanus (ed.), M. 2016. Curry: An Integrated Functional Logic Language (Vers. 0.9.0). Available at

Hinze, Ralf. 1998. Numerical Representations as Higher-Order Nested Datatypes. Institut für Informatik III, Universität Bonn.\#R5.

O’Connor, Liam. 2016. “Applications of Applicative Proof Search.” In Proceedings of the 1st International Workshop on Type-Driven Development, 43–55. TyDe 2016. New York, NY, USA: ACM. doi:10.1145/2976022.2976030.

Okasaki, Chris. 1999. Purely Functional Data Structures. Cambridge University Press.