Lazy Binary Numbers
Number Representations
When working with numbers in Agda, we usually use the following definition:
data N = Z | S N deriving (Eq, Ord)
instance Num N where
Z + n = n
S n + m = S (n + m)
Z * m = Z
S n * m = m + n * m
data ℕ : Set where
: ℕ
zero : ℕ → ℕ
suc
_+_ : ℕ → ℕ → ℕ
= y
zero + y = suc (x + y)
suc x + y
_*_ : ℕ → ℕ → ℕ
= zero
zero * y = y + (x * y) suc x * y
Haskell
Agda
In Haskell it’s less common, for obvious reasons:
Operation | Complexity |
---|---|
Why use them at all, then? Well, in Agda, we need them so we can prove things about the natural numbers. Machine-level integers are fast, but they’re opaque: their implementation isn’t written in Agda, and therefore it’s not available for the compiler to reason about.
In Haskell, they occasionally find uses due to their laziness. This can help in Agda as well. By lazy here I mean that operations on them don’t have to inspect the full structure before giving some output.
>>> Z < S undefined
True
: ∀ x → zero * x ≡ zero
*-zeroˡ = refl *-zeroˡ x
In Haskell, as we can see, this lets us run computations without scrutinising some arguments. Agda benefits similarly: here it lets the compiler see more “obvious” facts that it may have missed otherwise.
It’s not completely lazy, though. In particular, it tends to be left-biased:
>>> undefined * Z == Z
** Exception: Prelude.undefined
: ∀ x → x * zero ≡ zero
*-zeroʳ = refl
*-zeroʳ x -- x * zero != zero of type ℕ
Like Boolean short-circuiting operators, operations on Peano numbers will usually have to scrutinise the left-hand-side argument quite a bit before giving an output.
So, Peano numbers are good because:
- We can prove things about them.
- They’re lazy.
In this post, I’m going to look at some other number representations that maintain these two desirable properties, while improving on the efficiency somewhat.
List-of-Bits-Binary
The first option for an improved representation is binary numbers. We can represent binary numbers as a list of bits:
data Bit = O | I deriving (Eq, Show, Ord)
type B = [Bit]
data Bit : Set where O I : Bit
: Set
𝔹 = List Bit 𝔹
As we’re using these to represent natural numbers, we’ll need to define a way to convert between them:
eval :: B -> N
= foldr f Z
eval where
O xs = xs + xs
f I xs = S (xs + xs)
f
inc :: B -> B
= [I]
inc [] O:xs) = I : xs
inc (I:xs) = O : inc xs
inc (
fromN :: N -> B
Z = []
fromN S n) = inc (fromN n) fromN (
_⇓⟧ : 𝔹 → ℕ
⟦_⇓⟧ = foldr (λ { O xs → xs + xs
⟦; I xs → suc (xs + xs) })
zero
: 𝔹 → 𝔹
inc = I ∷ []
inc [] (O ∷ xs) = I ∷ xs
inc (I ∷ xs) = O ∷ inc xs
inc
_⇑⟧ : ℕ → 𝔹
⟦= []
⟦ zero ⇑⟧ = inc ⟦ n ⇑⟧ ⟦ suc n ⇑⟧
And here we run into our first problem: redundancy. There are multiple ways to represent the same number according to the semantics defined above. We can actually prove this in Agda:
: ∃₂ λ x y → x ≢ y × ⟦ x ⇓⟧ ≡ ⟦ y ⇓⟧
redundant = [] , O ∷ [] , (λ ()) , refl redundant
In English: “There are two binary numbers which are not the same, but which do evaluate to the same natural number”. (This proof was actually automatically filled in for me after writing the signature)
This represents a huge problem for proofs. It means that even simple things like aren’t true, depending on how multiplication is implemented. On to our next option:
List-of-Gaps-Binary
Instead of looking at the bits directly, let’s think about a binary number as a list of chunks of 0s, each followed by a 1. In this way, we simply can’t have trailing zeroes, because the definition implies that every number other than 0 ends in 1.
data Gap = Z | S Gap
type B = [Gap]
: Set
𝔹 = List ℕ 𝔹
This guarantees a unique representation. As in the representation above, it has much improved time complexities for the familiar operations:
Operation | Complexity |
---|---|
Encoding the zeroes as gaps also makes multiplication much faster in certain cases: multiplying by a high power of 2 is a constant-time operation, for instance.
It does have one disadvantage, and it’s to do with the increment function:
inc :: B -> B
= uncurry (flip (:)) . inc'
inc where
= ([], Z)
inc' [] :xs) = inc'' x xs
inc' (x
Z ns = fmap S (inc' ns)
inc'' S n) ns = (n:ns,Z) inc'' (
: Set
𝔹⁺ = ℕ × 𝔹
𝔹⁺
: 𝔹 → 𝔹
inc = uncurry _∷_ ∘ inc′
inc module Inc where
mutual
: 𝔹 → 𝔹⁺
inc′ = 0 , []
inc′ [] (x ∷ xs) = inc″ x xs
inc′
: ℕ → 𝔹 → 𝔹⁺
inc″ = map₁ suc (inc′ ns)
inc″ zero ns (suc n) ns = 0 , n ∷ ns inc″
With all of their problems, Peano numbers performed this operation in constant time. The above implementation is only amortised constant-time, though, with a worst case of (same as the list-of-bits version). There are a number of ways to remedy this, the most famous being:
Skew Binary
This encoding has three digits: 0, 1, and 2. To guarantee a unique representation, we add the condition that there can be at most one 2 in the number, which must be the first non-zero digit if it’s present.
To represent this we’ll encode “gaps”, as before, with the condition
that if the second gap is 0 it actually represents a 2 digit in
the preceding position. That weirdness out of the way, we are rewarded
with an inc
implementation which is clearly
.
inc :: B -> B
= Z : []
inc [] :[]) = Z : x : []
inc (x: Z : xs) = S x : xs
inc (x : S x2 : xs) = Z : x1 : x2 : xs inc (x1
: 𝔹 → 𝔹
inc = 0 ∷ []
inc [] (x ∷ []) = 0 ∷ x ∷ []
inc (x₁ ∷ zero ∷ xs) = suc x₁ ∷ xs
inc (x₁ ∷ suc x₂ ∷ xs) = 0 ∷ x₁ ∷ x₂ ∷ xs inc
Unfortunately, though, we’ve lost the other efficiencies! Addition and multiplication have no easy or direct encoding in this system, so we have to convert back and forth between this and regular binary to perform them.
List-of-Segments-Binary
The key problem with incrementing in the normal binary system is that it can cascade: when we hit a long string of 1s, all the 1s become 0 followed by a single 1. We can turn this problem to our advantage if we use a representation which encodes both 1s and 0s as strings of gaps. We’ll have to use a couple more tricks to ensure a unique representation, but all in all this is what we have (switching to just Agda now):
data 0≤_ (A : Set) : Set where
: 0≤ A
0₂ _ : A → 0≤ A
0<
mutual
record 𝔹₀ : Set where
constructor _0&_
inductive
field
: ℕ
H₀ : 𝔹₁
T₀
record 𝔹₁ : Set where
constructor _1&_
inductive
field
: ℕ
H₁ : 0≤ 𝔹₀
T₁ open 𝔹₀ public
open 𝔹₁ public
data 𝔹⁺ : Set where
_ : 𝔹₀ → 𝔹⁺
B₀_ : 𝔹₁ → 𝔹⁺
B₁
: Set
𝔹 = 0≤ 𝔹⁺
𝔹
: 𝔹 → 𝔹⁺
inc⁺ = B₁ 0 1& 0₂
inc⁺ 0₂ (0< B₀ zero 0& y 1& xs ) = B₁ suc y 1& xs
inc⁺ (0< B₀ suc x 0& y 1& xs ) = B₁ 0 1& 0< x 0& y 1& xs
inc⁺ (0< B₁ x 1& 0₂ ) = B₀ x 0& 0 1& 0₂
inc⁺ (0< B₁ x 1& 0< zero 0& z 1& xs) = B₀ x 0& suc z 1& xs
inc⁺ (0< B₁ x 1& 0< suc y 0& z 1& xs) = B₀ x 0& 0 1& 0< y 0& z 1& xs
inc⁺
: 𝔹 → 𝔹
inc = 0< inc⁺ x inc x
Perfect! Increments are obviously , and we’ve guaranteed a unique representation.
I’ve been working on this type for a couple of days, and you can see my code here. So far, I’ve done the following:
- Defined
inc
, addition, and multiplication -
These were a little tricky to get right (addition is particularly hairy), but they’re all there, and maximally lazy.
- Proved Homomorphism
-
For each one of the functions, you want them to correspond precisely to the equivalent functions on Peano numbers. Proving that fact amounts to filling in definitions for the following:
: ∀ x → ⟦ inc x ⇓⟧ ≡ suc ⟦ x ⇓⟧ inc-homo : ∀ x y → ⟦ x + y ⇓⟧ ≡ ⟦ x ⇓⟧ + ⟦ y ⇓⟧ +-homo : ∀ x y → ⟦ x * y ⇓⟧ ≡ ⟦ x ⇓⟧ * ⟦ y ⇓⟧ *-homo
- Proved Bijection
-
As we went to so much trouble, it’s important to prove that these numbers form a one-to-one correspondence with the Peano numbers. As well as that, once done, we can use it to prove facts about the homomorphic functions above, by reusing any proofs about the same functions on Peano numbers. For instance, here is a proof of commutativity of addition:
: ∀ x y → x + y ≡ y + x +-comm = injective (+-homo x y ⟨ trans ⟩ +-comm x y .+-comm ⟦ x ⇓⟧ ⟦ y ⇓⟧ ⟨ trans ⟩ ℕ(+-homo y x)) sym
Applications
So now that we have our nice number representation, what can we do with it? One use is as a general-purpose number type in Agda: it represents a good balance between speed and “proofiness”, and Coq uses a similar type in its standard library.
There are other, more unusual uses of such a type, though.
Data Structures
It’s a well-known technique to build a data structure out of some number representation (Hinze 1998): in fact, all of the representations above are explored in Okasaki (1999, chap. 9.2).
Logic Programming
Logic programming languages like Prolog let us write programs in a backwards kind of way. We say what the output looks like, and the unifier will figure out the set of inputs that generates it.
In Haskell, we have a very rough approximation of a similar system: the list monad.
pyth :: [(Int,Int,Int)]
= do
pyth <- [1..10]
x <- [1..10]
y <- [1..10]
z *x + y*y == z*z)
guard (xreturn (x,y,z)
There are tons of inefficiencies in the above code: for us, though, we can look at one: the number representation. In the equation:
If we know that and are both odd, then must be even. If the calculation of the equation is expensive, this is precisely the kind of shortcut we’d want to take advantage of. Luckily, our binary numbers do just that: it is enough to scrutinise just the first bits of and in order to determine the first bit of the output.
After seeing that example, you may be thinking that lazy evaluation is a perfect fit for logic programming. You’re not alone! Curry (Hanus (ed.) 2016) is a lazy, functional logic programming language, with a similar syntax to Haskell. It also uses lazy binary numbers to optimise testing.
Lazy Predicates
In order for queries to be performed efficiently on binary numbers, we will also need a way to describe lazy predicates on them. A lot of these predicates are more easily expressible on the list-of-bits representation above, so we’ll be working with that representation for this bit. Not to worry, though: we can convert from the segmented representation into the list-of-bits, and we can prove that the conversion is injective:
: ∀ xs ys → toBits xs ≡ toBits ys → xs ≡ ys toBits-injective
Here’s the curious problem: since our binary numbers are expressed least-significant-bit-first, we have to go to the end before knowing which is bigger. Luckily, we can use one of my favourite Haskell tricks, involving the ordering monoid:
data Ordering : Set where
: Ordering
lt eq gt
_∙_ : Ordering → Ordering → Ordering
= lt
lt ∙ y = y
eq ∙ y = gt
gt ∙ y
: Bit → Bit → Ordering
cmpBit = eq
cmpBit O O = lt
cmpBit O I = gt
cmpBit I O = eq
cmpBit I I
: Bits → Bits → Ordering
compare = eq
compare [] [] (_ ∷ _) = lt
compare [] (_ ∷ _) [] = gt
compare (x ∷ xs) (y ∷ ys) = compare xs ys ∙ cmpBit x y compare
Thanks to laziness, this function first compares the length of the lists, and then does a lexicographical comparison in reverse only if the lengths are the same. This is exactly what we want for our numbers.
Future Posts
That’s all I have for now, but I’m interested to formalise the laziness of these numbers in Agda. Usually that’s done with coinduction: I would also like to see the relationship with exact real arithmetic.
I wonder if it can be combined with O’Connor (2016) to get some efficient proof search algorithms, or with Escardo (2014) to get more efficient exhaustive search.