Tags: Haskell

I have packaged up the more interesting bits from the Algebras for Weighted Search paper and put it up on hackage.

You can see it here.

It contains the `HeapT`

monad, the `Monus`

class, and an implementation of Dijkstra’s algorithm, the Viterbi algorithm, and probabilistic parsing.

Check it out!

]]>The paper “Algebras for Weighted Search” has just been accepted unconditionally to ICFP. I wrote it with my supervisor, Nicolas Wu, and it covers a lot of the topics I’ve written about on this blog (including hyperfunctions and breadth-first traversals).

The preprint is available here.

]]>
Tags: Haskell

Check out this type:

`newtype a -&> b = Hyp { invoke :: (b -&> a) -> b } `

This a hyperfunction (Launchbury, Krstic, and Sauerwein 2013; 2000; 2000), and I think it’s one of the weirdest and most interesting newtypes you can write in Haskell.

The first thing to notice is that the recursion pattern is weird. For a type to refer to itself recursively on the *left* of a function arrow is pretty unusual, but on top of that the recursion is *non-regular*. That means that the recursive reference has different type parameters to its parent: `a -&> b`

is on the left-hand-side of the equals sign, but on the right we refer to `b -&> a`

.

Being weird is reason enough to write about them, but what’s really shocking about hyperfunctions is that they’re *useful*. Once I saw the definition I realised that a bunch of optimisation code I had written (to fuse away zips in particular) was actually using hyperfunctions (Ghani et al. 2005). After that, I saw them all over the place: in coroutine implementations, queues, breadth-first traversals, etc.

Anyways, since coming across hyperfunctions a few months ago I thought I’d do a writeup on them. I’m kind of surprised they’re not more well-known, to be honest: they’re like a slightly more enigmatic `Cont`

monad, with a far cooler name. Let’s get into it!

The newtype noise kind of hides what’s going on with hyperfunctions: expanding the definition out might make things slightly clearer.

```
type a -&> b = (b -&> a) -> b
= ((a -&> b) -> a) -> b
= (((b -&> a) -> b) -> a) -> b
= ((((... -> b) -> a) -> b) -> a) -> b
```

So a value of type `a -&> b`

is kind of an infinitely left-nested function type. One thing worth noticing is that all the `a`

s are in negative positions and all the `b`

s in positive. This negative and positive business basically refers to the position of arguments in relation to a function arrow: to the left are negatives, and to the right are positives, but two negatives cancel out.

```
... -> b) -> a) -> b) -> a) -> b
((((+ - + - +
```

All the things in negative positions are kind of like the things a function “consumes”, and positive positions are the things “produced”. It’s worth fiddling around with very nested function types to get a feel for this notion. For hyperfunctions, though, it’s enough to know that `a -&> b`

does indeed (kind of) take in a bunch of `a`

s, and it kind of produces `b`

s.

By the way, one of the ways to get to grips with polarity in this sense is to play around with the Cont monad, codensity monad, or selection monad (Hedges 2015). If you do, you may notice one of the interesting parallels about hyperfunctions: the type `a -&> a`

is in fact the fixpoint of the continuation monad (`Fix (Cont a)`

). Suspicious!

Before diving further into the properties of the type itself, I’d like to give some examples of how it can show up in pretty standard optimisation code.

Let’s say you wanted to write `zip`

with `foldr`

(I have already described this particular algorithm in a previous post). Not `foldr`

on the left argument, mind you, but `foldr`

on *both*. If you proceed mechanically, replacing every recursive function with `foldr`

, you can actually arrive at a definition:

```
zip :: [a] -> [b] -> [(a,b)]
zip xs ys = foldr xf xb xs (foldr yf yb ys)
where
= yk x xk
xf x xk yk = []
xb _
= (x,y) : xk yk
yf y yk x xk = [] yb _ _
```

In an untyped language, or a language with recursive types, such a definition would be totally fine. In Haskell, though, the compiler will complain with the following:

```
• Occurs check: cannot construct the infinite type:
t0 ~ a -> (t0 -> [(a, b)]) -> [(a, b)]
```

Seasoned Haskellers will know, though, that this is not a type error: no, this is a type *recipe*. The compiler is telling you what parameters it wants you to stick in the newtype:

```
newtype Zip a b =
Zip { runZip :: a -> (Zip a b -> [(a,b)]) -> [(a,b)] }
zip :: forall a b. [a] -> [b] -> [(a,b)]
zip xs ys = xz yz
where
xz :: Zip a b -> [(a,b)]
= foldr f b xs
xz where
= runZip yk x xk
f x xk yk = []
b _
yz :: Zip a b
= foldr f b ys
yz where
= Zip (\x xk -> (x,y) : xk yk)
f y yk = Zip (\_ _ -> []) b
```

And here we see the elusive hyperfunction: hidden behind a slight change of parameter order, `Zip a b`

is in fact the same as `[(a,b)] -&> (a -> [(a,b)])`

.

```
zip :: forall a b. [a] -> [b] -> [(a,b)]
zip xs ys = invoke xz yz
where
xz :: (a -> [(a,b)]) -&> [(a,b)]
= foldr f b xs
xz where
= Hyp (\yk -> invoke yk xk x)
f x xk = Hyp (\_ -> [])
b
yz :: [(a,b)] -&> (a -> [(a,b)])
= foldr f b ys
yz where
= Hyp (\xk x -> (x,y) : invoke xk yk)
f y yk = Hyp (\_ _ -> []) b
```

In another previous post I derived the following function to do a breadth-first traversal of a tree:

```
data Tree a = a :& [Tree a]
newtype Q a = Q { q :: (Q a -> [a]) -> [a] }
bfe :: Tree a -> [a]
= q (f t b) e
bfe t where
f :: Tree a -> Q a -> Q a
:& xs) fw = Q (\bw -> x : q fw (bw . flip (foldr f) xs))
f (x
b :: Q a
= Q (\k -> k b)
b
e :: Q a -> [a]
Q q) = q e e (
```

That `Q`

type there is another hyperfunction.

```
bfe :: Tree a -> [a]
= invoke (f t e) e
bfe t where
f :: Tree a -> ([a] -&> [a]) -> ([a] -&> [a])
:& xs) fw = Hyp (\bw -> x : invoke fw (Hyp (invoke bw . flip (foldr f) xs)))
f (x
e :: [a] -&> [a]
= Hyp (\k -> invoke k e) e
```

One of the problems I had with the above function was that it didn’t terminate: it could enumerate all the elements of the tree but it didn’t know when to stop. A similar program (Allison 2006; described and translated to Haskell in Smith 2009) manages to solve the problem with a counter. Will it shock you to find out this solution can also be encoded with a hyperfunction?

```
= invoke (f t e) e 1
bfe t where
f :: Tree a -> (Int -> [a]) -&> (Int -> [a])
-> (Int -> [a]) -&> (Int -> [a])
:& xs) fw =
f (x Hyp (\bw n -> x :
invoke fwHyp (invoke bw . flip (foldr f) xs)) (n+1))
(
e :: (Int -> [a]) -&> (Int -> [a])
= Hyp b
e
0 = []
b x = invoke x (Hyp b) (n-1) b x n
```

(my version here is actually a good bit different from the one in Smith 2009, but the basic idea is the same)

Hyperfunctions seem to me to be quite deeply related to coroutines. At the very least several of the types involved in coroutine implementations are actual hyperfunctions. The `ProdPar`

and `ConsPar`

types from Pieters and Schrijvers (2019) are good examples:

```
newtype ProdPar a b = ProdPar (ConsPar a b -> b)
newtype ConsPar a b = ConsPar (a -> ProdPar a b -> b)
```

`ProdPar a b`

is isomorphic to `(a -> b) -&> b`

, and `ConsPar a b`

to `b -&> (a -> b)`

, as witnessed by the following functions:

`ProdPar`

, `ConsPar`

and hyperfunctions```
fromP :: ProdPar a b -> (a -> b) -&> b
ProdPar x) = Hyp (x . toC)
fromP (
toC :: b -&> (a -> b) -> ConsPar a b
Hyp h) = ConsPar (\x p -> h (fromP p) x)
toC (
toP :: (a -> b) -&> b -> ProdPar a b
Hyp x) = ProdPar (x . fromC)
toP (
fromC :: ConsPar a b -> b -&> (a -> b)
ConsPar p) = Hyp (\h x -> p x (toP h)) fromC (
```

In fact this reveals a little about what was happening in the `zip`

function: we convert the left-hand list to a `ProdPar`

(producer), and the right-hand to a consumer, and apply them to each other.

Aside from just being kind of weird intuitively, hyperfunctions are weird *in theory*. Set-theoretically, for instance, you cannot form the set of `a -&> b`

: if you tried, you’d run into those pesky size restrictions which stop us from making things like “the set of all sets”. Haskell types, however, are not sets, precisely because we can define things like `a -&> b`

.

For slightly different reasons to the set theory restrictions, we can’t define the type of hyperfunctions in Agda. The following will get an error:

```
record _↬_ (A : Type a) (B : Type b) : Type (a ℓ⊔ b) where
inductive; constructor hyp
field invoke : (B ↬ A) → B
```

And for good reason! Agda doesn’t allow recursive types where the recursive call is in a negative position. If we turn off the positivity checker, we can write Curry’s paradox (example proof taken from here):

```
: ⊥ ↬ ⊥
yes? .invoke h = h .invoke h
yes?
: (⊥ ↬ ⊥) → ⊥
no! = h .invoke h
no! h
: ⊥
boom = no! yes? boom
```

Note that this isn’t an issue with the termination checker: the above example passes all the normal termination conditions without issue (yes, even if `↬`

is marked as `coinductive`

). It’s directly because the type itself is not positive.

Interestingly, there is a slightly different, and nearly equivalent, definition of hyperfunctions which doesn’t allow us to write the above proof:

```
record _↬_ (A : Type a) (B : Type b) : Type (a ℓ⊔ b) where
inductive; constructor hyp
field invoke : ((A ↬ B) → A) → B
```

This is basically a slightly expanded out version of the hyperfunction type, and importantly it’s *positive*. Not *strictly* positive however, since the recursive call does occur to the left of a function arrow: it’s just positive, in that it’s to the left of an even number of function arrows.

I found in a blog post by Sjöberg (2015) some interesting discussion regarding the question of this extra strictness: in Coq, allowing certain positive but not *strictly* positive types does indeed introduce an inconsistency (Coquand and Paulin 1990). However this inconsistency relies on an impredicative universe, which Agda doesn’t have. As far as I understand it, it would likely be safe to allow types like `↬`

above in Agda (Coquand 2013), although I’m not certain that with all of Agda’s newer features that’s still the case.

The connection between non-strictly-positive types and breadth-first traversals has been noticed before: Berger, Matthes, and Setzer (2019) make the argument for their inclusion in Agda and Coq using a breadth-first traversal algorithm by Hofmann (1993), which uses the following type:

```
data Rou
= Over
| Next ((Rou -> [Int]) -> [Int])
```

Now this type *isn’t* a hyperfunction (but it’s close); we’ll see soon what kind of thing it is.

So we’ve seen that hyperfunctions show up kind of incidentally through certain optimisations, and we’ve seen that they occupy a strange space in terms of their theoretical interpretation: we haven’t yet seen much about the type itself in isolation. Luckily Ed Kmett has already written the hyperfunctions package (2015), where a laundry list of instances are provided, which can tell us a little more about what hyperfunctions can actually do on their own.

The `Category`

instance gives us the following:

```
instance Category (-&>) where
id = Hyp (\k -> invoke k id)
. g = Hyp (\k -> invoke f (g . k)) f
```

We’ve actually seen the identity function a few times: we used it as the base case for recursion in the breadth-first traversal algorithms.

Composition we actually have used as well but it’s more obscured. An analogy to help clear things up is to think of hyperfunctions as a kind of *stack*. `id`

is the empty stack, and we can use the following function to push items onto the stack:

```
push :: (a -> b) -> a -&> b -> a -&> b
= Hyp (\k -> f (invoke k q)) push f q
```

Understood in this sense, composition acts like a zipping operation on stacks, since we have the following law:

`. push g q ≡ push (f . g) (p . q) push f p `

While we can’t really pop elements off the top of the stack directly, we can get close with `invoke`

, since it satisfies the following law:

` invoke (push f p) q ≡ f (invoke q p)`

Along with the `id`

implementation we have, this will let us run a hyperfunction, basically folding over the contents of the stack:

```
run :: a -&> a -> a
= invoke f id run f
```

This analogy helps us understand how the breadth-first traversals worked: the hyperfunctions are kind of like stacks with $\mathcal{O}(1)$ `push`

and `zip`

, which is precisely what you need for an efficient breadth-first traversal.

```
bfe :: Tree a -> [a]
= run . f
bfe where
:& xs) = push (x:) (zips (map f xs))
f (x
= foldr (.) id zips
```

Finally, hyperfunctions are of course monads:

```
instance Monad ((-&>) a) where
>>= f = Hyp (\k -> invoke (f (invoke m (Hyp (invoke k . (>>=f))))) k) m
```

I won’t pretend to understand what’s going on here, but it looks a little like a nested reader monad. Perhaps there’s some intuition to be gained from noticing that `a -&> a ~ Fix (Cont a)`

.

As I said in the introduction I’m kind of surprised there’s not more research out there on hyperfunctions. Aside from the excellent papers by Launchbury, Krstic, and Sauerwein (2013) there’s just not much out there. Maybe it’s that there’s not that much theoretical depth to them, but all the same there are some clear questions worth looking into.

For example: is there a hyperfunction monad transformer? Or, failing that, can you thread a monad through the type at any point, and do you get anything interesting out?

I have made a little headway on this question, while fiddling with one of the `bfe`

definitions above. Basically I wanted to remove the `Int`

counter for the terminating `bfe`

, and I wanted to use a `Maybe`

somewhere instead. I ended up generalising from `Maybe`

to any `m`

, yielding the following type:

`newtype HypM m a b = HypM { invokeM :: m ((HypM m a b -> a) -> b) }`

This does the job for the breadth-first traversal:

```
= r (f t e)
bfe t where
f :: Tree a -> HypM Maybe [a] [a] -> HypM Maybe [a] [a]
:& xs) fw = HypM (Just (\bw -> x : fromMaybe (\k -> k e) (invokeM fw) (bw . flip (foldr f) xs)))
f (x
e :: HypM Maybe [a] [a]
= HypM Nothing
e
r :: HypM Maybe [a] [a] -> [a]
= maybe [] (\k -> k r) . invokeM r
```

(In fact, when `m`

is specialised to `Maybe`

we have the same type as `Rou`

)

This type has a very practical use, as it happens, which is related to the church-encoded list monad transformer:

`newtype ListT m a = ListT { runListT :: forall b. (a -> m b -> m b) -> m b -> m b }`

Just like `-&>`

allowed us to write `zip`

on folds (i.e. using `foldr`

), `HypM`

will allow us to write `zipM`

on `ListT`

:

```
zipM :: Monad m => ListT m a -> ListT m b -> ListT m (a,b)
= ListT (\c n ->
zipM xs ys let
= pure (\yk -> yk (HypM xk) x)
xf x xk = pure (\_ -> n)
xb
= pure (\xk x -> c (x, y) (join (invokeM xk <*> yk)))
yf y yk = pure (\_ _ -> n)
yb in join (runListT xs xf xb <*> runListT ys yf yb))
```

I actually think this function could be used to seriously improve the running time of several of the functions on `LogicT`

: my reading of them suggests that `interleave`

is $\mathcal{O}(n^2)$ (or worse), but the zip above could be trivially repurposed to give a $\mathcal{O}(n)$ `interleave`

. This would also have knock-on effects on, for instance, `>>-`

and so on.

Another question is regarding the arrows of the hyperfunction. We’ve seen that a hyperfunction kind of adds “stacking” to functions, can it do the same for other arrows? Basically, does the following type do anything useful?

`newtype HypP p a b = HypP { invokeP :: p (HypP p b a) b }`

Along a similar vein, many of the breadth-first enumeration algorithms seem to use “hyperfunctions over the endomorphism monoid”. Basically, they all produce hyperfunctions of the type `[a] -&> [a]`

, and use them quite similarly to how we would use difference lists. But we know that there are Cayley transforms in other monoidal categories, for instance in the applicative monoidal category: can we construct the “hyperfunction” version of those?

Allison, Lloyd. 2006. “Circular Programs and Self-Referential Structures.” *Software: Practice and Experience* 19 (2) (October): 99–109. doi:10.1002/spe.4380190202. http://users.monash.edu/~lloyd/tildeFP/1989SPE/.

Berger, Ulrich, Ralph Matthes, and Anton Setzer. 2019. “Martin Hofmann’s Case for Non-Strictly Positive Data Types.” In *24th International Conference on Types for Proofs and Programs (TYPES 2018)*, ed by. Peter Dybjer, José Espírito Santo, and Luís Pinto, 130:22. Leibniz International Proceedings in Informatics (LIPIcs). Dagstuhl, Germany: Schloss DagstuhlLeibniz-Zentrum fuer Informatik. doi:10.4230/LIPIcs.TYPES.2018.1. http://drops.dagstuhl.de/opus/volltexte/2019/11405.

Coquand, Thierry. 2013. “[Agda] Defining Coinductive Types.” https://lists.chalmers.se/pipermail/agda/2013/006189.html.

Coquand, Thierry, and Christine Paulin. 1990. “Inductively Defined Types.” In *COLOG-88*, ed by. Per Martin-Löf and Grigori Mints, 50–66. Lecture Notes in Computer Science. Berlin, Heidelberg: Springer. doi:10.1007/3-540-52335-9_47.

Ghani, Neil, Patricia Johann, Tarmo Uustalu, and Varmo Vene. 2005. “Monadic Augment and Generalised Short Cut Fusion.” In *Proceedings of the Tenth ACM SIGPLAN International Conference on Functional Programming*, 294–305. ICFP ’05. New York, NY, USA: Association for Computing Machinery. doi:10.1145/1086365.1086403. https://doi.org/10.1145/1086365.1086403.

Hedges, Jules. 2015. “The Selection Monad as a CPS Transformation.” *arXiv:1503.06061 [cs]* (March). http://arxiv.org/abs/1503.06061.

Hofmann, Martin. 1993. “Non Strictly Positive Datatypes in System F.” https://www.seas.upenn.edu/~sweirich/types/archive/1993/msg00027.html.

Kmett, Edward. 2015. “Hyperfunctions: Hyperfunctions.” https://hackage.haskell.org/package/hyperfunctions.

Krstic, Sava, and John Launchbury. 2000. “A Category of Hyperfunctions.” http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.36.2421.

Launchbury, J., S. Krstic, and T. E. Sauerwein. 2013. “Coroutining Folds with Hyperfunctions.” *Electron. Proc. Theor. Comput. Sci.* 129 (September): 121–135. doi:10.4204/EPTCS.129.9. http://arxiv.org/abs/1309.5135.

Launchbury, John, Sava Krstic, and Timothy E. Sauerwein. 2000. *Zip Fusion with Hyperfunctions*. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.36.4961.

Pieters, Ruben P., and Tom Schrijvers. 2019. “Faster Coroutine Pipelines: A Reconstruction.” In *Practical Aspects of Declarative Languages*, ed by. José Júlio Alferes and Moa Johansson, 133–149. Lecture Notes in Computer Science. Cham: Springer International Publishing. doi:10.1007/978-3-030-05998-9_9. https://people.cs.kuleuven.be/~tom.schrijvers/portfolio/padl2019.html.

Sjöberg, Vilhelm. 2015. “Why Must Inductive Types Be Strictly Positive?” *Code and stuff*. https://vilhelms.github.io/posts/why-must-inductive-types-be-strictly-positive/.

Smith, Leon P. 2009. “Lloyd Allison’s Corecursive Queues: Why Continuations Matter.” *The Monad.Reader* 14 (14) (July): 28. https://meldingmonads.files.wordpress.com/2009/06/corecqueues.pdf.

Tags: Agda

The final version of my master’s thesis got approved recently so I thought I’d post it here for people who might be interested.

And all of the theorems in the thesis have been formalised in Agda. The code is organised to follow the structure of the pdf here.

The title of the thesis is “Finiteness in Cubical Type Theory”: basically it’s all about formalising the notion of “this type is finite” in CuTT. I also wanted to write something that could serve as a kind of introduction to some components of modern dependent type theory which didn’t go the standard length-indexed vector route.

]]>
Tags: Haskell

The Cayley monoid is well-known in Haskell (difference lists, for instance, are a specific instance of the Cayley monoid), because it gives us $O(1)$ `<>`

. What’s less well known is that it’s also important in dependently typed programming, because it gives us definitional associativity. In other words, the type `x . (y . z)`

is definitionally equal to `(x . y) . z`

in the Cayley monoid.

```
data Nat = Z | S Nat
type family (+) (n :: Nat) (m :: Nat) :: Nat where
Z + m = m
S n + m = S (n + m)
```

I used a form of the type-level Cayley monoid in a previous post to type vector reverse without proofs. I figured out the other day another way to use it to type tree flattening.

Say we have a size-indexed tree and vector:

```
data Tree (a :: Type) (n :: Nat) :: Type where
Leaf :: a -> Tree a (S Z)
(:*:) :: Tree a n -> Tree a m -> Tree a (n + m)
data Vec (a :: Type) (n :: Nat) :: Type where
Nil :: Vec a Z
(:-) :: a -> Vec a n -> Vec a (S n)
```

And we want to flatten it to a list in $O(n)$ time:

```
treeToList :: Tree a n -> Vec a n
= go xs Nil
treeToList xs where
go :: Tree a n -> Vec a m -> Vec a (n + m)
Leaf x) ks = x :- ks
go (:*: ys) ks = go xs (go ys ks) go (xs
```

Haskell would complain specifically that you hadn’t proven the monoid laws:

```
• Couldn't match type ‘n’ with ‘n + 'Z’
• Could not deduce: (n2 + (m1 + m)) ~ ((n2 + m1) + m)
```

But it seems difficult at first to figure out how we can apply the same trick as we used for vector reverse: there’s no real way for the `Tree`

type to hold a function from `Nat`

to `Nat`

.

To solve this problem we can borrow a trick that Haskellers had to use in the good old days before type families to represent type-level functions: types (or more usually classes) with multiple parameters.

```
data Tree' (a :: Type) (n :: Nat) (m :: Nat) :: Type where
Leaf :: a -> Tree' a n (S n)
(:*:) :: Tree' a n2 n3
-> Tree' a n1 n2
-> Tree' a n1 n3
```

The `Tree'`

type here has three parameters: we’re interested in the last two. The first of these is actually an argument to a function in disguise; the second is its result. To make it back into a normal size-indexed tree, we apply that function to zero:

```
type Tree a = Tree' a Z
three :: Tree Int (S (S (S Z)))
= (Leaf 1 :*: Leaf 2) :*: Leaf 3 three
```

This makes the `treeToList`

function typecheck without complaint:

```
treeToList :: Tree a n -> Vec a n
= go xs Nil
treeToList xs where
go :: Tree' a x y -> Vec a x -> Vec a y
Leaf x) ks = x :- ks
go (:*: ys) ks = go xs (go ys ks) go (xs
```

Consider the following puzzle:

Given a list of $n$ labels, list all the trees with those labels in order.

For instance, given the labels [1,2,3,4], the answer (for binary trees) is the following:

```
┌1 ┌1 ┌1 ┌1 ┌1
┤ ┤ ┌┤ ┌┤ ┌┤
│┌2 │ ┌2 ││┌2 │└2 │└2
└┤ │┌┤ │└┤ ┤ ┌┤
│┌3 ││└3 │ └3 │┌3 │└3
└┤ └┤ ┤ └┤ ┤
└4 └4 └4 └4 └4
```

This problem (the “enumeration” problem) turns out to be quite fascinating and deep, with connections to parsing and monoids. It’s also just a classic algorithmic problem which is fun to try and solve.

The most general version of the algorithm is on forests of rose trees:

```
data Rose a = a :& Forest a
type Forest a = [Rose a]
```

It’s worth having a go at attempting it yourself, but if you’d just like to see the slick solutions the following is one I’m especially proud of:

```
enumForests :: [a] -> [Forest a]
= foldrM f []
enumForests where
= zipWith ((:) . (:&) x) (inits xs) (tails xs) f x xs
```

In the rest of this post I’ll go through the intuition behind solutions like the one above and I’ll try to elucidate some of the connections to other areas of computer science.

I first came across the enumeration problem when I was writing my master’s thesis: I needed to prove (in Agda) that there were finitely many binary trees of a given size, and that I could list them (this proof was part of a larger verified solver for the countdown problem). My first few attempts were unsuccessful: the algorithm presented in the countdown paper (Hutton 2002) was not structurally recursive, and did not seem amenable to Agda-style proofs.

Instead, I looked for a type which was isomorphic to binary trees, and which might be easier to reason about. One such type is Dyck words.

A “Dyck word” is a string of balanced parentheses.

```
()()
(()())()
(())()
```

It’s (apparently) well-known that these strings are isomorphic to binary trees (although the imperative descriptions of algorithms which actually computed this isomorphism addled my brain), but what made them interesting for me was that they are a *flat* type, structured like a linked list, and as such should be reasonably straightforward to prove to be finite.

Our first task, then, is to write down a type for Dyck words. Te following is a first possibility:

```
data Paren = LParen | RParen
type Dyck = [Paren]
```

But this type isn’t correct. It includes many values which *don’t* represent balanced parentheses, i.e. the expressions `[LParen,RParen] :: Dyck`

are well-typed. To describe dyck words properly we’ll need to reach for the GADTs:

```
data DyckSuff (n :: Nat) :: Type where
Done :: DyckSuff Z
Open :: DyckSuff (S n) -> DyckSuff n
Clos :: DyckSuff n -> DyckSuff (S n)
type Dyck = DyckSuff Z
```

The first type here represents suffixes of Dyck words; a value of type `DyckSuff n`

represents a string of parentheses which is balanced except for `n`

extraneous closing parentheses. `DyckSuff Z`

, then, has no extraneous closing parens, and as such is a proper Dyck word.

```
>>> Open $ Clos $ Open $ Clos $ Done :: Dyck
()()
>>> Clos $ Open $ Clos $ Done :: DyckSuff (S Z)
)()
>>> Open $ Open $ Clos $ Open $ Clos $ Clos $ Open $ Clos $ Done :: Dyck
(()())()
>>> Open $ Open $ Clos $ Clos $ Open $ Clos $ Done :: Dyck
(())()
```

The next task is to actually enumerate these words. Here’s an $O(n)$ algorithm which does just that:

```
enumDyck :: Int -> [Dyck]
= go Zy sz Done []
enumDyck sz where
right :: Natty n -> Int -> DyckSuff n -> [Dyck] -> [Dyck]
go, zero, left,
= zero n m k . left n m k . right n m k
go n m k
Zy 0 k = (k:)
zero = id
zero _ _ _
Sy n) m k = go n m (Open k)
left (Zy _ _ = id
left
0 _ = id
right _ = go (Sy n) (m-1) (Clos k)
right n m k
>>> mapM_ print (enumDyck 3)
"()()()"
"(())()"
"()(())"
"(()())"
"((()))"
```

A variant of this function was what I needed in my thesis: I also needed to prove that it produced every possible value of the type `Dyck`

, which was not too difficult.

The difficult part is still ahead, though: now we need to convert between this type and a binary tree.

First, for the conversion algorithms we’ll actually need another GADT:

```
infixr 5 :-
data Stack (a :: Type) (n :: Nat) :: Type where
Nil :: Stack a Z
(:-) :: a -> Stack a n -> Stack a (S n)
```

The familiar length-indexed vector will be extremely useful for the next few bits of code: it will act as a stack in our stack-based algorithms. Here’s one of those algorithms now:

```
dyckToTree :: Dyck -> Tree
= go dy (Leaf :- Nil)
dyckToTree dy where
go :: DyckSuff n -> Stack Tree (S n) -> Tree
Open d) ts = go d (Leaf :- ts)
go (Clos d) (t1 :- t2 :- ts) = go d (t2 :*: t1 :- ts)
go (Done (t :- Nil) = t go
```

This might be familiar: it’s actually shift-reduce parsing dressed up with some types. The nice thing about it is that it’s completely total: all pattern-matches are accounted for here, and when written in Agda it’s clearly structurally terminating.

The function in the other direction is similarly simple:

```
treeToDyck :: Tree -> Dyck
= go t Done
treeToDyck t where
go :: Tree -> DyckSuff n -> DyckSuff n
Leaf = id
go :*: ys) = go xs . Open . go ys . Clos go (xs
```

Much of this stuff has been on my mind recently because of this (2020) video on the computerphile channel, in which Graham Hutton goes through using QuickCheck to test an interesting compiler. The compiler itself is explored more in depth in Bahr and Hutton (2015), where the algorithms developed are really quite similar to those that we have here.

The advantage of the code above is that it’s all *total*: we will never pop items off the stack that aren’t there. This is a nice addition, and it’s surprisingly simple to add: let’s see if we can add it to the compiler presented in the paper.

The first thing we need to change is we need to add a payload to our tree type: the one above is just the *shape* of a binary tree, but the language presented in the paper contains values.

```
data Expr (a :: Type) where
Val :: a -> Expr a
(:+:) :: Expr a -> Expr a -> Expr a
```

We’ll need to change the definition of `Dyck`

similarly:

```
data Code (n :: Nat) (a :: Type) :: Type where
HALT :: Code (S Z) a
PUSH :: a -> Code (S n) a -> Code n a
ADD :: Code (S n) a -> Code (S (S n)) a
```

After making it so that these data structures can now store contents, there are two other changes worth pointing out:

- The names have been changed, to match those in the paper. It’s a little clearer now that the Dyck word is a bit like code for a simple stack machine.
- The numbering on
`Code`

has changed. Now, the`HALT`

constructor has a parameter of`1`

(well,`S Z`

), where its corresponding constructor in`Dyck`

(`Done`

) had`0`

. Why is this? I am not entirely sure! To get this stuff to all work out nicely took a huge amount of trial and error, I would love to see a more principled reason why the numbering changed here.

With these definitions we can actually transcribe the `exec`

and `comp`

functions almost verbatim (from page 11 and 12 of 2015).

```
exec :: Code n Int -> Stack Int (n + m) -> Stack Int (S m)
HALT st = st
exec PUSH v is) st = exec is (v :- st)
exec (ADD is) (t1 :- t2 :- st) = exec is (t2 + t1 :- st)
exec (
comp :: Expr a -> Code Z a
= comp' e HALT
comp e where
comp' :: Expr a -> Code (S n) a -> Code n a
Val x) = PUSH x
comp' (:+: ys) = comp' xs . comp' ys . ADD comp' (xs
```

As I have mentioned, a big benefit of all of this stuff is that it can be translated into Agda readily. The real benefit of *that* is that we can show the two representations of programs are fully isomorphic. I have proven this here: the proof is surprisingly short (about 20 lines), and the rest of the code follows the Haskell stuff quite closely. I got the idea for much of the proof from this bit of code by Callan McGill (2020).

I’ll include it here as a reference.

```
open import Prelude
open import Data.Nat using (_+_)
open import Data.Vec.Iterated using (Vec; _∷_; []; foldlN; head)
private
variable
: ℕ
n
--------------------------------------------------------------------------------
-- Binary trees: definition and associated functions
--------------------------------------------------------------------------------
data Tree (A : Type a) : Type a where
_] : A → Tree A
[_*_ : Tree A → Tree A → Tree A
--------------------------------------------------------------------------------
-- Programs: definition and associated functions
--------------------------------------------------------------------------------
data Prog (A : Type a) : ℕ → Type a where
: Prog A 1
halt : A → Prog A (1 + n) → Prog A n
push : Prog A (1 + n) → Prog A (2 + n)
pull
--------------------------------------------------------------------------------
-- Conversion from a Prog to a Tree
--------------------------------------------------------------------------------
: Prog A n → Vec (Tree A) n → Tree A
prog→tree⊙ (v ∷ []) = v
prog→tree⊙ halt (push v is) st = prog→tree⊙ is ([ v ] ∷ st)
prog→tree⊙ (pull is) (t₁ ∷ t₂ ∷ st) = prog→tree⊙ is (t₂ * t₁ ∷ st)
prog→tree⊙
: Prog A zero → Tree A
prog→tree = prog→tree⊙ ds []
prog→tree ds
--------------------------------------------------------------------------------
-- Conversion from a Tree to a Prog
--------------------------------------------------------------------------------
: Tree A → Prog A (suc n) → Prog A n
tree→prog⊙ = push x
tree→prog⊙ [ x ] (xs * ys) = tree→prog⊙ xs ∘ tree→prog⊙ ys ∘ pull
tree→prog⊙
: Tree A → Prog A zero
tree→prog = tree→prog⊙ tr halt
tree→prog tr
--------------------------------------------------------------------------------
-- Proof of isomorphism
--------------------------------------------------------------------------------
: (e : Tree A) (is : Prog A (1 + n)) (st : Vec (Tree A) n) →
tree→prog→tree⊙ (tree→prog⊙ e is) st ≡ prog→tree⊙ is (e ∷ st)
prog→tree⊙ = refl
tree→prog→tree⊙ [ x ] is st (xs * ys) is st = tree→prog→tree⊙ xs _ st ;
tree→prog→tree⊙ (pull is) (xs ∷ st)
tree→prog→tree⊙ ys
: (e : Tree A) → prog→tree (tree→prog e) ≡ e
tree→prog→tree = tree→prog→tree⊙ e halt []
tree→prog→tree e
: (is : Prog A n) (st : Vec (Tree A) n) →
prog→tree→prog⊙ (prog→tree⊙ is st) ≡ foldlN (Prog A) tree→prog⊙ is st
tree→prog = refl
prog→tree→prog⊙ halt st (push i is) st = prog→tree→prog⊙ is ([ i ] ∷ st)
prog→tree→prog⊙ (pull is) (t₁ ∷ t₂ ∷ ts) = prog→tree→prog⊙ is ((t₂ * t₁) ∷ ts)
prog→tree→prog⊙
: (is : Prog A 0) → tree→prog (prog→tree is) ≡ is
prog→tree→prog = prog→tree→prog⊙ is []
prog→tree→prog is
: Prog A zero ⇔ Tree A
prog-iso .fun = prog→tree
prog-iso .inv = tree→prog
prog-iso .rightInv = tree→prog→tree
prog-iso .leftInv = prog→tree→prog prog-iso
```

Another thing I’ll mention is that all of the `exec`

functions presented are *folds*. In particular, they’re *left* folds. Here’s how we’d rewrite `exec`

to make that fact clear:

```
foldlCode :: (∀ n. a -> b n -> b (S n))
-> (∀ n. b (S (S n)) -> b (S n))
-> b m
-> Code m a -> b (S Z)
HALT = h
foldlCode _ _ h PUSH x xs) = foldlCode p a (p x h) xs
foldlCode p a h (ADD xs) = foldlCode p a (a h) xs
foldlCode p a h (
shift :: Int -> Stack Int n -> Stack Int (S n)
= x :- xs
shift x xs
reduce :: Stack Int (S (S n)) -> Stack Int (S n)
:- t2 :- st) = t2 + t1 :- st
reduce (t1
execFold :: Code Z Int -> Int
= pop . foldlCode shift reduce Nil execFold
```

I think the “foldl-from-foldr” trick could be a nice way to explain the introduction of continuations in Bahr and Hutton (2015).

It turns out that you can follow relatively straightforward rewriting steps from the Dyck-based enumeration algorithm to get to one which avoids Dyck words entirely:

```
enumTrees :: [a] -> [Expr a]
= fmap (foldl1 (flip (:+:))) . foldlM f []
enumTrees where
= [[Val v]]
f [] v = [[Val v, t1]]
f [t1] v :t2:st) v = (Val v : t1 : t2 : st) : f ((t2 :+: t1) : st) v f (t1
```

Maybe in a future post I’ll go through the derivation of this algorithm.

It turns out that the Dyck-based enumeration can be applied without much difficulty to rose trees as well:

```
data Rose a = a :& Forest a
type Forest a = [Rose a]
dyckToForest :: Dyck -> Forest ()
= go dy ([] :- Nil)
dyckToForest dy where
go :: DyckSuff n -> Stack (Forest ()) (S n) -> Forest ()
Open d) ts = go d ([] :- ts)
go (Clos d) (t1 :- t2 :- ts) = go d ((() :& t2 : t1) :- ts)
go (Done (t :- Nil) = t
go
forestToDyck :: Forest () -> Dyck
= go t Done
forestToDyck t where
go :: Forest () -> DyckSuff n -> DyckSuff n
= id
go [] :& x):xs) = go x . Open . go xs . Clos go ((()
```

And again, following relatively mechanical derivations, we arrive at an elegant algorithm:

```
enumForests :: [a] -> [Forest a]
= foldrM f []
enumForests where
= zipWith ((:) . (:&) x) (inits xs) (tails xs) f x xs
```

While researching this post I found that enumeration of trees has been studied *extensively* elsewhere: see Knuth (2006), for example, or the excellent blog post by Tychonievich (2013), or the entire field of Boltzmann sampling. This post has only scratched the surface of all of that: I hope to write much more on the topic in the future.

As I mentioned, the Agda code for this stuff can be found here, I have also put all of the Haskell code in one place here.

Bahr, Patrick, and Graham Hutton. 2015. “Calculating Correct Compilers.” *Journal of Functional Programming* 25 (e14) (September). doi:10.1017/S0956796815000180. https://nottingham-repository.worktribe.com/output/761112.

Hutton, Graham. 2002. “The Countdown Problem.” *J. Funct. Program.* 12 (6) (November): 609–616. doi:10.1017/S0956796801004300. http://www.cs.nott.ac.uk/~pszgmh/countdown.pdf.

Knuth, Donald E. 2006. *The Art of Computer Programming, Volume 4, Fascicle 4: Generating All Trees–History of Combinatorial Generation (Art of Computer Programming)*. Addison-Wesley Professional. http://www.cs.utsa.edu/~wagner/knuth/fasc4a.pdf.

McGill, Callan. 2020. “Compiler Correctness for Addition Language.” https://gist.github.com/Boarders/9d83f9cbcfaffb04cf2464588fc46df9.

Riley, Sean. 2020. “Program Correctness - Computerphile.” University of Nottingham. https://www.youtube.com/watch?v=T_IINWzQhow.

Tychonievich, Luther. 2013. “Enumerating Trees.” *Luther’s Meanderings*. https://www.cs.virginia.edu/~lat7h/blog/posts/434.html.

Part 10 of a 10-part series on Breadth-First Traversals

Tags: Haskell

We pick up the story again at the question of a breadth-first (Applicative) traversal of a rose tree (Gibbons 2015). In the last post, I finally came up with an implementation I was happy with:

```
data Tree a = a :& [Tree a]
bft :: Applicative f => (a -> f b) -> Tree a -> f (Tree b)
:& xs) = liftA2 (:&) (f x) (bftF f xs)
bft f (x
bftF :: Applicative f => (a -> f b) -> [Tree a] -> f [Tree b]
= fmap head . foldr (<*>) (pure []) . foldr f [pure ([]:)]
bftF t where
:& xs) (q : qs) = liftA2 c (t x) q : foldr f (p qs) xs
f (x
= [pure ([]:)]
p [] :xs) = fmap (([]:).) x : xs
p (x
: ks) = ((x :& xs) : y) : ys
c x k (xs where (y : ys) = k ks
```

It has the correct semantics and asymptotics.

```
=
tree 1 :&
2 :&
[ 5 :&
[ 9 :& []
[ 10 :& []]
, 6 :& []]
, 3 :& []
, 4 :&
, 7 :&
[ 11 :& []
[ 12 :& []]
, 8 :& []]]
,
>>> bft print tree
1
2
3
4
5
6
7
8
9
10
11
12
:&
() :&
[ () :&
[ () :& []
[ () :& []]
, () :& []]
, () :& []
, () :&
, () :&
[ () :& []
[ () :& []]
, () :& []]] , ()
```

But it’s quite difficult to understand, and doesn’t lend much insight into what’s going on with the whole “breadth-first” notion. The technique the function uses also isn’t reusable.

A much nicer function uses the `Phases`

Applicative (Easterly 2019):

```
bft :: Applicative f => (a -> f b) -> Tree a -> f (Tree b)
= runPhases . go
bft f where
:& xs) = liftA2 (:&) (Lift (f x)) (later (traverse go xs)) go (x
```

But this function is quadratic.

So the task for this post today is to derive a type like the `Phases`

type with a `later`

operation, but which has the appropriate performance characteristics. At the end I’ll look into what the theoretical properties of this type are.

At its core, the `Phases`

type is basically a free Applicative (Capriotti and Kaposi 2014). I’ll reimplement it here as a slightly different free Applicative (one that’s based on `liftA2`

rather than `<*>`

):

```
data Free f a where
Pure :: a -> Free f a
Lift :: (a -> b -> c) -> f a -> Free f b -> Free f c
lower :: Applicative f => Free f a -> f a
Pure x) = pure x
lower (Lift f x xs) = liftA2 f x (lower xs) lower (
```

The key with the `Phases`

type is to observe that there’s actually two possible implementations of `Applicative`

for the `Free`

type above: one which makes it the “correct” free applicative:

```
instance Applicative (Free f) where
pure = Pure
Pure x) ys = fmap (c x) ys
liftA2 c (Lift f x xs) ys = Lift (\x (y,z) -> c (f x y) z) x (liftA2 (,) xs ys) liftA2 c (
```

And then one which *zips* effects together:

```
instance Applicative f => Applicative (Free f) where
pure = Pure
Pure x) ys = fmap (c x) ys
liftA2 c (Pure y) = fmap (flip c y) xs
liftA2 c xs (Lift f x xs) (Lift g y ys) =
liftA2 c (Lift
-> c (f x xs) (g y ys))
(\(x,y) (xs,ys)
(liftA2 (,) x y) (liftA2 (,) xs ys)
```

This second instance makes the `Free`

type into not a free Applicative at all: instead it’s some kind of Applicative transformer which we can use to reorder effects. Since effects are combined only when they’re at the same point in the list, we can use it to do our breadth-first traversal.

As an aside, from this perspective it’s clear that this is some kind of `FunList`

(van Laarhoven 2009): this opens up a lot of interesting curiosities about the type, since that type in particular is quite well-studied.

Anyway, we’re able to do the `later`

operation quite simply:

```
later :: Free f a -> Free f a
= Lift (const id) (pure ()) later
```

The problem at the moment is that the Applicative instance has an $\mathcal{O}(n)$ `liftA2`

implementation: this translates into an $\mathcal{O}(n^2)$ traversal overall.

If we were working in a more simple context of just enumerating the contents of the tree, we might at this point look to something like difference lists: these use the cayley transform on the list monoid to turn the append operation from $\mathcal{O}(n)$ to $\mathcal{O}(n^2)$. It turns out that there is a similar cayley transformation for Applicative functors (Rivas and Jaskelioff 2014; Rivas, Jaskelioff, and Schrijvers 2015):

```
newtype Day f a = Day { runDay :: ∀ b. f b -> f (a, b) }
instance Functor f => Functor (Day f) where
fmap f xs = Day (fmap (first f) . runDay xs)
instance Functor f => Applicative (Day f) where
pure x = Day (fmap ((,) x))
=
liftA2 c xs ys Day (fmap (\(x,(y,z)) -> (c x y, z)) . runDay xs . runDay ys)
```

And with this type we can implement our queue of applicative effects:

```
type Queue f = Day (Free f)
runQueue :: Applicative f => Queue f a -> f a
= fmap fst . lower . flip runDay (Pure ())
runQueue
now :: Applicative f => f a -> Queue f a
= Day \case
now xs Pure x -> Lift (,) xs (Pure x)
Lift f y ys -> Lift (\(x,y) z -> (x, f y z)) (liftA2 (,) xs y) ys
later :: Applicative f => Queue f a -> Queue f a
= Day \case
later xs Pure x -> Lift (const id) (pure ()) (runDay xs (Pure x))
Lift f y ys -> Lift (\x (y,z) -> (y, f x z)) y (runDay xs ys)
```

As expected, this gives us the clean implementation of a breadth-first traversal with the right asymptotics (I think):

```
bft :: Applicative f => (a -> f b) -> Tree a -> f (Tree b)
= runQueue . go
bft f where
:& xs) = liftA2 (:&) (now (f x)) (later (traverse go xs)) go (x
```

(it’s worth pointing out that we haven’t actually used the applicative instance on the free applicative at any point: we have inlined all of the “zipping” to make it absolutely clear that everything has stayed linear).

I have yet to really dive deep on any of the theory involved in this type, I just quickly wrote up this post when I realised I was able to use the cayley transform from the mentioned papers to implement the proper breadth-first traversal. It certainly seems worth looking at more!

Capriotti, Paolo, and Ambrus Kaposi. 2014. “Free Applicative Functors.” *Electronic Proceedings in Theoretical Computer Science* 153 (June): 2–30. doi:10.4204/EPTCS.153.2. http://www.paolocapriotti.com/assets/applicative.pdf.

Easterly, Noah. 2019. “Functions and Newtype Wrappers for Traversing Trees: Rampion/Tree-Traversals.” https://github.com/rampion/tree-traversals.

Gibbons, Jeremy. 2015. “Breadth-First Traversal.” *Patterns in Functional Programming*. https://patternsinfp.wordpress.com/2015/03/05/breadth-first-traversal/.

Rivas, Exequiel, and Mauro Jaskelioff. 2014. “Notions of Computation as Monoids.” *arXiv:1406.4823 [cs, math]* (May). http://arxiv.org/abs/1406.4823.

Rivas, Exequiel, Mauro Jaskelioff, and Tom Schrijvers. 2015. “From Monoids to Near-Semirings: The Essence of MonadPlus and Alternative.” In *Proceedings of the 17th International Symposium on Principles and Practice of Declarative Programming*, 196–207. ACM. doi:10.1145/2790449.2790514. http://www.fceia.unr.edu.ar/~mauro/pubs/FromMonoidstoNearsemirings.pdf.

van Laarhoven, Twan. 2009. “A Non-Regular Data Type Challenge.” *Twan van Laarhoven’s Blog*. https://twanvl.nl/blog/haskell/non-regular1.

Tags: Agda

Recently travis-ci.org announced that they were closing down, and moving to travis-ci.com. For people who use the service, this basically means that the free component is going away, and you’ll have to pay in the future.

As a result, a lot of people are looking to move to another ci service, so I thought I’d put this short guide together on how to use GitHub actions to typecheck an Agda project and host the rendered code through GitHub pages. The system I have is quite fast: for a quite large project it takes about a minute from pushing for the action to complete.

If you just want to use the same script as me, you can see it here: the rest of this post will just be going through that script and explaining it.

First things first: in order to make an action, you need to put a YAML file in the `.github/workflows`

directory of your repository. You can have the following lines at the start:

```
name: Compile Agda and Deploy HTML
on:
push:
branches:
- master
```

This gives a name for the action (which will show up in the actions tab online for the repo), and says that the action should be run whenever there’s a push to the branch named `master`

.

We then list the “jobs” the actions does: just one for this action, called `build`

:

```
jobs:
build:
```

GitHub actions run on GitHub’s servers, the specifications of which can be seen here. For this action we won’t need anything special, so we’ll just use the following:

` runs-on: ubuntu-18.04`

Next we will have the matrix:

```
strategy:
matrix:
cubical-ref: ["v0.2"]
agda-ref: ["v2.6.1.1"]
ghc-ver: ["8.10.2"]
cabal-ver: ["3.4.0.0"]
```

I’m using this matrix as a crude system for environment variables; if this was a CI for some software I wanted to deploy, you could include multiple values for each variable here, to check that the whole thing runs properly with each.

We’re now onto the “steps” portion of the script, where we write small bash-esque script to be run. As such we have the line:

` steps:`

The first step is to cache all the cabal packages we’re going to install. Agda takes about 45 minutes to install so this step is crucial:

```
- uses: actions/cache@v2
name: Cache cabal packages
id: cache-cabal
with:
path: |
~/.cabal/packages
~/.cabal/store
~/.cabal/bin
dist-newstyle key: ${{ runner.os }}-${{ matrix.ghc-ver }}-${{ matrix.cabal-ver }}-${{ matrix.agda-ref }}
```

The `path`

field tells the action which folders to cache, the `key`

field tells it what key to store them under.

To install Agda we first need to install cabal:

```
- name: Install cabal
if: steps.cache-cabal.outputs.cache-hit != 'true'
uses: actions/setup-haskell@v1.1.3
with:
ghc-version: ${{ matrix.ghc-ver }}
cabal-version: ${{ matrix.cabal-ver }}
```

The `if`

field here allows us to skip this step if we had a cache hit previously (i.e. if Agda is already installed).

Next we need to ensure that all of the programs installed by cabal are in the path:

```
- name: Put cabal programs in PATH
run: echo "~/.cabal/bin" >> $GITHUB_PATH
```

And then we download and install Agda (along with some dependencies that aren’t installed automatically):

```
- name: Download Agda from github
if: steps.cache-cabal.outputs.cache-hit != 'true'
uses: actions/checkout@v2
with:
repository: agda/agda
path: agda
ref: ${{ matrix.agda-ref }}
- name: Install Agda
if: steps.cache-cabal.outputs.cache-hit != 'true'
run: |
cabal update
cabal install --overwrite-policy=always --ghc-options='-O2 +RTS -M6G -RTS' alex-3.2.5
cabal install --overwrite-policy=always --ghc-options='-O2 +RTS -M6G -RTS' happy-1.19.12
cd agda
mkdir -p doc
touch doc/user-manual.pdf cabal install --overwrite-policy=always --ghc-options='-O1 +RTS -M6G -RTS'
```

The strange flags to `cabal install`

here are *probably* necessary: I was running out of memory when I tried to install Agda without them. This might be fixed in future versions of Agda.

We next need to install any Agda libraries your code depends on. For instance, in my project, I use the cubical library: since Agda doesn’t have a package manager, we basically have to handle all the versioning and so on manually. Also, in order to speed up the build we have to cache the typecheck files for the library.

```
- name: Checkout cubical library
uses: actions/checkout@v2
with:
repository: agda/cubical
path: cubical
ref: ${{ matrix.cubical-ref }}
- name: Cache cubical library
uses: actions/cache@v2
id: cache-cubical
with:
path: ~/cubical-build
key: ${{ runner.os }}-${{ matrix.agda-ver }}-${{ matrix.cubical-ref }}
```

So the library is accessible as an import we need to put it in the Agda library list:

```
- name: Put cubical library in Agda library list
run: |
mkdir -p ~/.agda/
touch ~/.agda/libraries echo "$GITHUB_WORKSPACE/cubical/cubical.agda-lib" > ~/.agda/libraries
```

We then need to typecheck the library: this bit is a little tricky, since not all files in the cubical library actually typecheck.

```
- name: Compile cubical library
if: steps.cache-cubical.outputs.cache-hit != 'true'
run: |
cd $GITHUB_WORKSPACE/cubical
agda Cubical/Core/Everything.agda
agda Cubical/Foundations/Everything.agda
find Cubical/Data -type f -name "*.agda" | while read -r code ; do
agda $code
done
find Cubical/HITs -type f -name "*.agda" | while read -r code ; do
agda $code
done cp -f -r _build/ ~/cubical-build
```

Finally, if the cubical library was already typechecked then we don’t need to do any of that, and we instead just retrieve it from the cache:

```
- name: Retrieve cubical library
if: steps.cache-cubical.outputs.cache-hit == 'true'
run: |
mkdir -p cubical/_build cp -f -r ~/cubical-build/* cubical/_build
```

Finally we have to typecheck the library itself. We want to cache the output from this step as well, but importantly we want to support incremental recompilation: i.e. if we only make a small change in one file we don’t want to have to typecheck every other. We can do this with `restore-keys`

in the cache:

```
- name: Checkout main
uses: actions/checkout@v2
with:
path: main
- uses: actions/cache@v2
name: Cache main library
id: cache-main
with:
path: ~/main-build
key: html-and-tex-${{ runner.os }}-${{ matrix.agda-ver }}-${{ matrix.cubical-ref }}-${{ hashFiles('main/**') }}
restore-keys: |
html-and-tex-${{ runner.os }}-${{ matrix.agda-ver }}-${{ matrix.cubical-ref }}-
html-and-tex-${{ runner.os }}-${{ matrix.agda-ver }}-
- name: Retrieve main library
if: steps.cache-main.outputs.cache-hit == 'true'
run: cp -f -R ~/main-build/* $GITHUB_WORKSPACE/main
```

Finally, we need to make an “Everything” file: this is an Agda module which contains an import for every module in the project. Typechecking this file is faster than typechecking each file individually.

```
- name: Compile main library
if: steps.cache-main.outputs.cache-hit != 'true'
run: |
mkdir -p ~/main-build/_build
cp -f -R ~/main-build/_build $GITHUB_WORKSPACE/main/_build
rm -r ~/main-build
cd main
find . -type f \( -name "*.agda" -o -name "*.lagda" \) > FileList
sort -o FileList FileList
echo "{-# OPTIONS --cubical #-}" > Everything.agda
echo "" >> Everything.agda
echo "module Everything where" >> Everything.agda
echo "" >> Everything.agda
echo "-- This file imports every module in the project. Click on" >> Everything.agda
echo "-- a module name to go to its source." >> Everything.agda
echo "" >> Everything.agda
cat FileList | cut -c 3- \
| cut -f1 -d'.' \
| sed 's/\//\./g' \
| sed 's/^/open import /' \
>> Everything.agda
rm FileList
agda --html --html-dir=docs Everything.agda
rm Everything.agda
cd .. cp -f -R main/ ~/main-build/
```

And then we need to deploy the generated `html`

so we can see the rendered library.

```
- name: Deploy html to github pages
uses: peaceiris/actions-gh-pages@v3
with:
github_token: ${{ secrets.GITHUB_TOKEN }}
publish_dir: main/docs
```

This last step will need you to turn on the github pages setting in your repository, and have it serve from the `gh-pages`

branch.

Hopefully this script will be useful to some other people! The first time it runs it should take between 30 minutes and an hour; subsequently it takes about a minute for me.

]]>
Tags: Combinators

There are a bunch of “minimal” computational models out there: Turing machines, lambda calculus, PowerPoint (Wildenhain 2017), etc. These are radically simple languages which are nonetheless Turing complete, so theoretically “as powerful” as each other. Of those, lambda calculus is without question my favourite to actually write programs in: it’s the one which is closest to crawling out of the Turing tarpit.

In terms of implementation, though, it is *far* from simple. Lambda calculus has *variables*, which introduce huge complexity into the interpreter: especially if you want to do any kind of formal reasoning about programs, this complexity is a problem. We might want to reach for something even lower-level than lambda calculus: this is where combinator calculi come in.

You may have heard of SKI combinator calculus: it’s the “simplest” of the calculi, but it’s not actually very easy to understand, and it’s absolute murder to try use. So we’re going to start with `BCKW`

, a more obscure calculus, actually invented by Haskell Curry.

There are 4 combinators in `BCKW`

: `B`

, `C`

, `K`

, and `W`

(shocking, I know). You can think about these combinators as functions which manipulate the beginning of strings:

```
Bxyz ~> x(yz)
Cxyz ~> xzy
Kxy ~> x
Wxy ~> xyy
```

Upper case letters are combinators, lower-case are variables. Yes, yes, I know I said that combinator calculi didn’t need variables, and it doesn’t! I’m just using them here to explain how each of the combinators work. If you really want to be pedantic you can think of the lower case letters as notational placeholders meaning “any given combinator”. They won’t exist in any actual programs we write.

Let’s work with some examples to get a sense for how these combinators work.

The simplest combinator is `K`

: it’s actually equivalent to the `const`

function from Haskell. It discards its second argument, and returns the first. If you give a combinator more arguments than it usually accepts, you just keep the extra arguments in the output:

`Kxyz ~> xz`

`W`

is the next combinator: it *duplicates* its second argument.

`Wxy ~> xyy`

We always start from the *left*, applying the rule for the left-most combinator first.

```
WKxyz ~> Kxxyz ~> xyz
KWxyz ~> Wyz ~> yzz
```

Next we have `C`

: this is equivalent to the Haskell function `flip`

. It swaps the second and third arguments:

`Cxyz ~> xzy`

Here’s a small little evaluator for expressions which use `C`

, `K`

, and `W`

. You can edit the expression, and press enter to step through it.

The last combinator introduces parentheses, and it’s equivalent to function composition.

`Bxyz ~> x(yz)`

You can write parentheses yourself: implicitly, all expressions are left-associative. That means that the following are all equal:

`xyz = (xy)z = (x)yz = ((x)y)z`

But `xyz`

is *not* equal to, say, `x(yz)`

.

And here’s a puzzle to start flexing your combinator skills: one of the combinators in SKI combinator calculus is `I`

, which is the identity function.

`Ix ~> x`

Try write an expression which functions the same way as `I`

, using only the `BCKW`

combinators. Use the following evaluator to try and figure out how to do it: write an expression after `λ>`

which functions the same as `I`

.

`CK`

followed by any combinator will do the trick. So `CKB`

, `CKK`

, `CKC`

, etc.
`I = CKC`

Update 19/10/2020: A few people have pointed out (Joachim Breitner was the first) that there is a shorter solution to this problem: `WK`

. I tend to prefer solutions that don’t include `W`

, since then we’re working in a subset of the language that is both terminating and affine; although in this case the reason I didn’t mention `WK`

is that I just didn’t find it myself.
Each of the combinators we’ve defined so far work a little weird: they seem to skip over their first argument, and work on their second. Indeed, there is another, equivalent combinator calculus which doesn’t have this peculiarity:

```
Bxyz ~> x(yz)
Axy ~> y
Mx ~> xx
Txy ~> yx
```

`B`

stays the same in this calculus, but the rest of the combinators get switched out for seemingly simpler versions. `K`

goes to `A`

^{1}:

```
Axy ~> y
Kxy ~> x
```

Which isn’t a huge change. It’s the other two where we see the real difference. `W`

has been swapped out for `M`

:

```
Wxy ~> xyy
Mx ~> xx
```

As you can see `W`

basically does the same thing as `M`

, but while passing through its first argument. The difference between `T`

and `C`

is similar:

```
Cxyz ~> xzy
Txy ~> yx
```

So, first of all, it is pretty simple to show that `BCKW`

contains all of the `BAMT`

combinators. Try find a way to write `T`

using only `BCKW`

combinators (hint: you might want to use your previous answer for writing `I`

using `BCKW`

).

`BAMT`

combinators can be encoded using `BCKW`

by putting `I`

(or `CKC`

or what have you) after the corresponding `BCKW`

combinator. In other words:
```
T = CI = C(CKC)
A = KI = K(CKC)
M = WI = W(CKC)
```

It’s pretty easy to go from `BCKW`

to `BAMT`

, then. However, it’s *extremely* difficult to go the other way. Here, try to write `K`

in terms of `BAMT`

(this is quite difficult, do not expect to get it!):

```
B(TA)(BBT)
B(B(TA)B)T
```

So this is why we will stick to `BCKW`

for the time being: `BAMT`

is just too painful to use.

One of the things `BCKW`

has over `SKI`

is that each combinator represents a concrete capability. `K`

and `W`

especially: without these combinators, we can neither duplicate nor discard variables. This makes the languages without one or both of these interesting (albeit not Turing-complete).

If we say that we can’t use `W`

, we know that the will not duplicate any input. In fact, encoded appropriately, we know that the program can only decrease its size through execution. The `BCK`

system is in fact an encoding of *affine* logic, which is all the rage nowadays. Rust uses affine types to guarantee memory safety: by preventing duplication of references, you can know that whenever you’re looking at a variable you’re free to modify it, or destroy it if necessary (obviously Rust is a bit more complex than what I’ve described here, but `BCK`

is indeed the fundamental basis for the system in the same way that `SK`

can be the basis for any programming language).

If we remove `K`

as well we have a *linear* language. This is even more restrictive, but is also quite actively researched at the moment: linear types have been used to construct languages for differential privacy, for instance.

There’s one small issue with `BC`

: it doesn’t (strictly speaking) have an equivalent to `I`

. You can write an expression which is *close*, but it will only actually compute when applied to at least 3 arguments. See if you can find it.

`BCC`

Usually we add `I`

, though, to give us `BCI`

.

`S`

is the only combinator we haven’t seen yet. It’s kind of a combination of `B`

, `C`

, and `W`

:

`Sxyz ~> xz(yz)`

It does parenthesising, reordering, *and* duplication. This allows it to be powerful to be Turing complete only with the addition of `K`

. Try first to construct `I`

given only `S`

and `K`

:

`SK`

followed by any combinator will suffice.
`I = SKK = SKS`

And now construct `S`

from `BCKW`

:

`S = B(BW)(BBC) = B(B(BW)C)(BB)`

Of course, to show that `SK`

is universal we’d need to show that it contains one of the other universal systems. We won’t do that exhaustively here, but first just try to figure out `B`

and `W`

:

`B = S(KS)K`

`S = SS(SK) = SS(KI)`

The next task is to encode the `Y`

combinator. This is a combinator that evaluates to the following:

`Yf ~> f(Yf)`

As you can see, it encodes *recursion*. Like the `fix`

function in Haskell, this combinator allows us to do recursion without explicit self-reference. And, of course, we can define this combinator using the combinators we’ve seen before, since our language is Turing complete. One encoding is `BM(CBM)`

:

As you can see, `BM(CBM)`

, when applied to `f`

, yields `f(M(CBMf))`

, which is equivalent to `f(BM(CBM)f)`

(the `B`

just hasn’t been applied inside the `f`

). So this is indeed a proper recursion combinator.

Let’s try doing a little bit of programming with these combinators now.

In the lambada calculus, to encode numbers we often use the *church* numerals: that’s what we’re going to do here, too. A church numeral representing some number $n$ is a function which takes two arguments, and applies the first argument to the second $n$ times. Here are some church numerals in Haskell:

```
zero :: (a -> a) -> a -> a
zero f x = x
one :: (a -> a) -> a -> a
one f x = f x
two :: (a -> a) -> a -> a
two f x = f (f x)
three :: (a -> a) -> a -> a
three f x = f (f (f x))
```

Encoding these numerals in combinators is a little more difficult. Zero and one are obvious: they are `A`

and `I`

, respectively. Try to figure out two and three:

`WB`

`SB(WB)`

It turns out that it’s pretty easy to encode numbers in a relatively small amount of space, using a binary encoding. First, multiplication on Church numerals is simply composition: so that’s `B`

on our combinators. We already have 2 defined, so the next thing we need for a binary encoding is a successor function. And we know what *that* is, from the answer to 3!

This means we can encode normal number in $\mathcal{O}(\log n)$ space (although it still takes linear time to evaluate). The following repl allows for numbers:

We could take up even less space if we allowed for non-normal forms. 4, for instance, could be encoded like so:

`M(WB)`

But we generally prefer to keep our encodings in normal form: otherwise there’s some extra evaluation we have to pay for when we go to use them.

Once upon a time SKI combinators were used as a target for functional compilers: Miranda, Haskell’s precursor, compiled down to a set of combinators which included `SKI`

. Nowadays, Haskell is compiled to the “spineless tagless G-machine”: its compilation technique took over from combinators in the late 80s, and has been the dominant form since. Apparently the reason is that, on the current architecture of most computers, combinator-based compilation targets just aren’t fast enough. They generate too much garbage: as a result, switching to the STG yielded about a 40% speedup.

A lot of this information comes from two talks, by the way:

- An Introduction to Combinator Compilers and Graph Reduction Machines, by David Graunke (2016), which goes through a high-level history and explanation of combinator compilers and why we switched away from them. A very interesting tidbit in this talk was that some people started making custom hardware to handle combinator calculi a little better. Even more interesting is the fact that these days we have FPGAs all over the place, so maybe combinator compilers are ripe for reintroduction?
- Combinators Revisited, by Edward Kmett (2018), which goes through a little more of the details of the problems with combinator compilers, and mentions some of the places in which we’re tantalisingly close to making combinator compilation work.

So compilation to combinators was once upon a time an extremely active area of research, but it has since fallen by the wayside a little because our current hardware is unable to evaluate it efficiently. What this means for us, though, is that there’s a large body of work on how to compile lambda terms to combinators!

We use the following basic combinator set for compilation: `SKIBC`

. `S`

is really the most important one here: of course we only need it and `K`

, but we use `I`

because it dramatically simplifies the expressions we generate, and we use `B`

and `C`

because they are special cases of `S`

, as we’ll see in a second. The translation works like so:

```
\x. e1 e2 -> S (\x. e1) (\x. e2)
\x. x -> I
\x. e -> K e
```

The translation works bottom-up. We’re only interested in removing the lambdas: combinator calculus does have application, after all, so there’s nothing we need to do in that case. For that reason, the algorithm is often called “abstraction elimination”, and it’s the one the pointfree.io uses to automatically pointfree Haskell expressions.

There are three forms of abstraction: abstraction into an expression which is an application, abstraction which returns its argument, and abstraction which returns something other than its argument. In the first case, we use `S`

to pass the argument down each branch of the abstraction. In the second, we just use `I`

. And in the third case, we use `K`

to just ignore the argument. We won’t ever get `\x. \y. e`

, since the algorithm works bottom-up, so the `\y. e`

is eliminated before looking at the `\x. \y. e`

.

`B`

and `C`

work like special cases of `S`

: when we pass `x`

down both branches of the application in the first case, sometimes that work is unnecessary. Sometimes one of the branches doesn’t use the passed variable: in this case, we use `B`

or `C`

, depending on which branch ignores the variable.

```
\x. e1 e2, x ∉ e1 -> B e1 (\x. e2)
\x. e1 e2, x ∉ e2 -> C (\x. e1) e2
```

There is one issue with this approach: it produces combinator expressions which are of order $\mathcal{O}(n^3)$ larger than the corresponding lambda expression. With some tricks (like our usage of `C`

and `B`

) we can get that down to $\mathcal{O}(n^2)$, but that’s still a pretty unpleasant size increase.

The issue is that we’re basically passing the arguments as a singly-linked list, where naive access is $\mathcal{O(n^2)}$, and more sophisticated access is $\mathcal{O}(n)$.

Oleg Kiselyov wrote a paper (2018) on getting this down to $\mathcal{O}(n)$, with some memoisation. There’s also a blog post (Lynn 2018), describing how to get that conversion without memoisation in $\mathcal{O}(n \log n)$ time, and an online implementation here.

That’s all for this post! I’ll probably write more about combinators in the future: they’re an extremely interesting subject, and a lot of fun as puzzles to mess around with. One thing that I haven’t mentioned is the connection between combinators and concatenative languages: it turns out that these two things are pretty much the same thing! Maybe I’ll look at it in a future post.

Graunke, David. 2016. “An Introduction to Combinator Compilers and Graph Reduction Machines.” St. Louis. https://www.youtube.com/watch?v=GawiQQCn3bk.

Kiselyov, Oleg. 2018. “$\lambda$ to SKI, Semantically.” In *Functional and Logic Programming*, ed by. John P. Gallagher and Martin Sulzmann, 33–50. Lecture Notes in Computer Science. Cham: Springer International Publishing. doi:10.1007/978-3-319-90686-7_3. http://okmij.org/ftp/tagless-final/ski.pdf.

Kmett, Edward. 2018. “Combinators Revisited.” Wesley Conference Centre, Sydney, Australia. https://yowconference.com/talks/edward-kmett/yow-lambda-jam-2018/combinators-revisited-5919.

Lynn, Ben. 2018. “Ben Lynn’s Online Garbage: Lambda the Penultimate.” *Ben Lynn’s Online Garbage*. https://benlynn.blogspot.com/2018/11/lambda-penultimate_16.html.

Wildenhain, Tom. 2017. “On the Turing Completeness of MS PowerPoint.” http://www.andrew.cmu.edu/user/twildenh/PowerPointTM/Paper.pdf.

If you want to look up these combinators elsewhere, this is the only one you won’t be able to find: it’s much less common than

`K`

, and where I have found it people just call it`K`

, so I had to pick a different letter to distinguish it↩︎

Tags: Haskell

It’s been a while since I last wrote a post (I’ve been busy with my Master’s thesis, which is nearly done), so I thought I would quickly throw out some fun snippets of Haskell I had reason to write over the past couple of weeks.

For some reason, until recently I had been under the impression that it was impossible to fuse zips efficiently. In other words, I thought that `zip`

was like `tail`

, in that if it was implemented using only `foldr`

it would result in an asymptotic slowdown (`tail`

is normally $\mathcal{O}(1)$, implemented as a fold it’s $\mathcal{O}(n)$).

Well, it seems like this is not the case. The old zip-folding code I had looks to me now to be the correct complexity: it’s related to How To Zip Folds, by Oleg Kiselyov (although I’m using a different version of the function which can be found on the mailing list). The relevant code is as follows:

```
newtype Zip a b =
Zip { runZip :: a -> (Zip a b -> b) -> b }
zip :: [a] -> [b] -> [(a,b)]
zip xs ys = foldr xf xb xs (Zip (foldr yf yb ys))
where
= runZip yk x xk
xf x xk yk = []
xb _
= (x,y) : xk (Zip yk)
yf y yk x xk = [] yb _ _
```

There are apparently reasons for why the Prelude’s `zip`

isn’t allowed to fuse both of its arguments: I don’t fully understand them, however. (in particular the linked page says that the fused zip would have different strictness behaviour, but the version I have above seems to function properly).

This version of zip leads to some more fun solutions to folding puzzles, like this one:

Write a function that is equivalent to:

`= reverse (zip (reverse xs) (reverse ys)) zipFromEnd xs ys`

Without creating any intermediate lists.

The desired function is interesting in that, instead of lining up lists according to their first elements, it aligns them according to the *ends*.

```
>>> zipFromEnd [1,2,3] "abc"
1,'a'),(2,'b'),(3,'c')]
[(
>>> zipFromEnd [1,2,3] "abcd"
1,'b'),(2,'c'),(3,'d')]
[(
>>> zipFromEnd [1,2,3,4] "abc"
2,'a'),(3,'b'),(4,'c')] [(
```

The solution here is just to use `foldl`

, and we get the following:

```
zipFromEnd :: [a] -> [b] -> [(a,b)]
= foldl xf xb xs (Zip (foldl yf yb ys)) []
zipFromEnd xs ys where
= runZip yk x xk
xf xk x yk = zs
xb _ zs
= xk (Zip yk) ((x,y) : zs)
yf yk y x xk zs = zs yb _ _ zs
```

Another function which is a little interesting is the “zip longest” function:

```
zipLongest :: (a -> a -> a) -> [a] -> [a] -> [a]
= foldr xf xb xs (Zip (foldr yf yb ys))
zipLongest c xs ys where
= runZip yk (Just x) xk
xf x xk yk = runZip zs Nothing xb
xb zs
Nothing xk = y : xk (Zip yk)
yf y yk Just x) xk = c x y : xk (Zip yk)
yf y yk (
Nothing _ = []
yb Just x) zs = x : zs (Zip yb) yb (
```

Finally, all of these functions rely on the `Zip`

type, which is *not* strictly positive. This means that we can’t use it in Agda, and it’s tricky to reason about: I wonder what it is about functions for deforestation that tends to lead to non-strictly-positive datatypes.

The next puzzle I was interested in was finding the next lexicographic permutation of some string. In other words, given some string $s$, you need to find another string $t$ that is a permutation of $s$ such that $s < t$, and that there is no string $u$ that is a permutation of $s$ and $s < u < t$. The Wikipedia article on the topic is excellent (and clear), but again the algorithm is described in extremely imperative terms:

- Find the largest index k such that a[k] < a[k + 1]. If no such index exists, the permutation is the last permutation.
- Find the largest index l greater than k such that a[k] < a[l].
- Swap the value of a[k] with that of a[l].
- Reverse the sequence from a[k + 1] up to and including the final element a[n].

The challenge here is to write this algorithm without doing any indexing: indexing is expensive on Haskell lists, and regardless it is cleaner to express it without.

I managed to work out the following:

```
nextLexPerm :: Ord a => [a] -> Maybe [a]
= Nothing
nextLexPerm [] :xs) = go1 x xs
nextLexPerm (xwhere
= Nothing
go1 _ [] :xs) = maybe (go2 i j [] xs) (Just . (i:)) (go1 j xs)
go1 i (j
go2 i j xs ys| j <= i = Nothing
| otherwise = Just (fromMaybe (j : foldl (flip (:)) (i:xs) ys) (go3 i (j:xs) ys))
= Nothing
go3 _ _ [] :ys) = go2 i j xs ys go3 i xs (j
```

This comes from the Rosetta Code problem Circle Sort. This is a strange little sorting algorithm, where basically you compare elements on opposite sides of an array, swapping them as needed. The example given is the following:

`6 7 8 9 2 5 3 4 1`

First we compare (and swap) `6`

and `1`

, and then `7`

and `4`

, and so on, until we reach the middle. At this point we split the array in two and perform the procedure on each half. After doing this once it is not the case that the array is definitely sorted: you may have to repeat the procedure several (but finitely many) times, until no swaps are performed.

I have absolutely no idea what the practical application for such an odd algorithm would be, but it seemed like an interesting challenge to try implement it in a functional style (i.e. without indices or mutation).

The first thing we have to do is fold the list in half, so we pair up the right items. We’ve actually seen an algorithm to do this before: it’s often called the “tortoise and the hare”, and our previous use was to check if a list was a palindrome. Here’s how we implement it:

```
halve :: [a] -> [(a,a)]
= snd (go xs xs)
halve xs where
:ys) (_:_:zs) = f y (go ys zs)
go (y:ys) [_] = (ys,[])
go (_= (ys,[])
go ys []
:ys,zs) = (ys, (x,y) : zs)
f x (y
>>> halve [6,7,8,9,2,5,3,4,1]
6,1),(7,4),(8,3),(9,5)] [(
```

Notice that the `2`

in the very middle of the list is missing from the output: I’ll describe how to handle that element later on. In the above piece of code, that `2`

actually gets bound to the underscore (in `(_:ys)`

) in the second clause of `go`

.

Next we need to do the actual swapping: this is actually pretty straightforward, if we think of the algorithm functionally, rather than imperatively. Instead of swapping things in place, we are building up both halves of the new list, so the “swap” operation should simply decide which list each item goes into.

```
halve :: Ord a => [a] -> ([a],[a])
= tl (go xs xs)
halve xs where
= (lte,gt)
tl (_,lte,gt)
:ys) (_:_:zs) = swap y (go ys zs)
go (y:ys) [_] = (ys,[],[])
go (_= (ys,[],[])
go ys []
:ys,lte,gt)
swap x (y| x <= y = (ys, x : lte, y : gt)
| otherwise = (ys, y : lte, x : gt)
```

At this point we can also see what to do with the middle item: we’ll put it in the higher or lower list, depending on a comparison with the element it’s next to.

```
halve :: Ord a => [a] -> ([a],[a])
= tl (go xs xs)
halve xs where
= (lte,gt)
tl (_,lte,gt)
:ys) (_:_:zs) = swap y (go ys zs)
go (y= (ys,[],[])
go ys [] :ys) [_] = (ys,[y | e],[y | not e])
go (ywhere e = y <= head ys
:ys,lte,gt)
swap x (y| x <= y = (ys, x : lte, y : gt)
| otherwise = (ys, y : lte, x : gt)
```

Next, we can use this as a helper function in the overall recursive function.

```
circleSort :: Ord a => [a] -> [a]
= []
circleSort [] = [x]
circleSort [x] =
circleSort xs let (lte,gt) = halve xs
in circleSort lte ++ circleSort (reverse gt)
```

This function isn’t correct (yet). As we mentioned already, we need to run the circle sort procedure multiple times until no swaps occur. We can add in the tracking of swaps like so:

```
circleSort :: Ord a => [a] -> [a]
= if swapped then circleSort ks else ks
circleSort xs where
= go xs
(swapped,ks)
= (False, [])
go [] = (False, [x])
go [x] =
go xs let (s,_,lte,gt) = halve xs xs
= go lte
(sl,lte') = go (reverse gt)
(sg,gt' ) in (s || sl || sg, lte' ++ gt')
:ys) (_:_:zs) = swap y (halve ys zs)
halve (y= (False,ys,[],[])
halve ys [] :ys) [_] = (False,ys,[y | e],[y | not e])
halve (ywhere e = y <= head ys
:ys,lte,gt)
swap x (s,y| x <= y = (s ,ys, x : lte, y : gt)
| otherwise = (True,ys, y : lte, x : gt)
```

So at this point we actually have a working implementation of the function, which avoids indices as intended. It has some problems still, though. First, we call `++`

, when we could be using difference lists. Here’s the solution to that:

```
circleSort :: Ord a => [a] -> [a]
= if swapped then circleSort ks else ks
circleSort xs where
= go xs []
(swapped,ks)
= (False, zs)
go [] zs = (False, x:zs)
go [x] zs =
go xs zs let (s,_,lte,gt) = halve xs xs
= go lte gt'
(sl,lte') = go (reverse gt) zs
(sg,gt' ) in (s || sl || sg, lte')
:ys) (_:_:zs) = swap y (halve ys zs)
halve (y= (False,ys,[],[])
halve ys [] :ys) [_] = (False,ys,[y | e],[y | not e])
halve (ywhere e = y <= head ys
:ys,lte,gt)
swap x (s,y| x <= y = (s ,ys, x : lte, y : gt)
| otherwise = (True,ys, y : lte, x : gt)
```

Next we can actually rewrite the `go`

function to allow for a certain amount of tail recursion (kind of):

```
circleSort :: Ord a => [a] -> [a]
= if swapped then circleSort ks else ks
circleSort xs where
= go xs (False,[])
(swapped,ks)
= (s,ks)
go [] (s,ks) = (s,x:ks)
go [x] (s,ks) =
go xs (s,ks) let (s',_,ls,rs) = halve s xs xs
in go ls (go (reverse rs) (s',ks))
:ys) (_:_:zs) = swap y (halve s ys zs)
halve s (y= (s,ys,[],[])
halve s ys [] :ys) [_] = (s,ys,[y | e],[y | not e])
halve s (ywhere e = y <= head ys
:ys,ls,rs)
swap x (s,y| x <= y = ( s,ys,x:ls,y:rs)
| otherwise = (True,ys,y:ls,x:rs)
```

Next, we call `reverse`

: but we can avoid the reverse by passing a parameter which tells us which direction we’re walking down the list. Since the swapping logic is symmetric, we’re able to just invert some of the functions. It is a little tricky, though:

```
circleSort :: Ord a => [a] -> [a]
= if swapped then circleSort ks else ks
circleSort xs where
= go False xs (False,[])
(swapped,ks)
= (s,ks)
go d [] (s,ks) = (s,x:ks)
go d [x] (s,ks) =
go d xs (s,ks) let (s',_,ls,rs) = halve d s xs xs
in go False ls (go True rs (s',ks))
:ys) (_:_:zs) = swap d y (halve d s ys zs)
halve d s (y= (s,ys,[],[])
halve d s ys [] :ys) [_] = (s,ys,[y | e],[y | not e])
halve d s (ywhere e = y <= head ys
:ys,ls,rs)
swap d x (s,y| bool (<=) (<) d x y = ( d || s,ys,x:ls,y:rs)
| otherwise = (not d || s,ys,y:ls,x:rs)
```

So there it is! The one-pass, purely function implementation of circle sort. Very possibly the most useless piece of code I’ve ever written.

]]>
Tags: Haskell

A week or so ago I gave a presentation on purely functional data structures as part of an interview^{1}. Here are the slides:

https://doisinkidney.com/pdfs/purely-functional-data-structures-slides.pdf

The presentation is meant to be about 45 minutes long, and it’s aimed at end of first year computer science students who have done some Haskell and know a little bit about pointers.

The interview went well, by the way! All going well with my master’s I’ll be starting a PhD in Imperial in Nicolas Wu’s group this October.↩︎

Part 2 of a 2-part series on Random Access Lists

```
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE PolyKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE TypeFamilyDependencies #-}
{-# LANGUAGE ConstraintKinds #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE UndecidableInstances #-}
{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE AllowAmbiguousTypes #-}
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# OPTIONS_GHC -Wall -fno-warn-unticked-promoted-constructors #-}
module Post where
import Data.Kind
import Prelude hiding (lookup)
import GHC.TypeLits
import Control.Lens hiding (Cons, index)
```

One of the common techniques for building purely functional data structures is to base the structure on some numerical representation (Hinze 1998). Most recently, I read Swierstra (2020), where the binary numbers were used to implement a heterogeneous random-access list (effectively a generic tuple).

I’m going to look today at using the zeroless binary system to implement a similar structure, and see what the differences are.

I have talked about this representation before, so I won’t go into it in huge depth, but put simply the zeroless binary system represents a binary number as a string of `1`

s and `2`

s (i.e. no zeroes). The vast majority of the normal binary operations (addition, multiplication, etc.) can be implemented with the same broad efficiency, but this system has one key advantage in that every single number is uniquely represented. Since we’re going to use these numbers to index our data types, this is actually extremely useful.

Before we get started, we’ll first define the peculiar type of lists we’re going to use.

```
infixr 5 :-
data Plus a = a :- Star a
data Star a
= Nil
| Some (Plus a)
```

`Star a`

is isomorphic to `[a]`

, so we’ve not lost any expressive power or anything like that. The usefulness of this definition is that we have a non-empty list type built in to our list type, so we don’t have to do conversion back and forth which can be cumbersome.

Next on to the number itself:

```
data Bit = B1 | B2
type family Inc (xs :: Star Bit) = (ys :: Plus Bit) | ys -> xs where
Inc Nil = B1 :- Nil
Inc (Some (B1 :- xs)) = B2 :- xs
Inc (Some (B2 :- xs)) = B1 :- Some (Inc xs)
```

We’re straight into the type-level operations here, and there’s an interesting bit of syntax worth pointing out before we move on. `ys -> xs`

is a type family dependency: it means that we can uniquely determine `xs`

given `ys`

. This is very handy for type inference and so on, and is perhaps the main benefit of the zeroless binary numbers.

Next, we’ll build a tree indexed by these numbers. Now that we’re jumping in to indexing, we’ll need some singletons. Here’s my preferred way to do them:

```
type family The k = (s :: k -> Type) | s -> k
class Known (x :: a) where sing :: The a x
data SBit b where
SB1 :: SBit B1
SB2 :: SBit B2
type instance The Bit = SBit
instance Known B1 where sing = SB1
instance Known B2 where sing = SB2
```

The type family defines the singleton GADTs themselves. The class `Known`

is for automatically generating singleton values.

On to the tree. We’re actually going to build a *Braun* tree here, as they are actually particularly clean to implement on the type level.

```
type family Carry (x :: Bit) (xs :: Star Bit) :: Star Bit where
Carry B1 xs = xs
Carry B2 xs = Some (Inc xs)
data Tree (xs :: Star Bit) a where
Leaf :: Tree Nil a
Branch :: Forest xs a -> Tree (Some xs) a
data Forest (xs :: Plus Bit) a where
Root :: a
-> The Bit x
-> Tree (Carry x xs) a
-> Tree xs a
-> Forest (x :- xs) a
```

We first have a type family which increments a binary number if its first argument is `B2`

: this will maintain the Braun tree’s invariant.

Next, we have the tree definition itself, which is split into two mutual definitions, in much the same way as the `Star`

and `Plus`

lists previously. Next, to `cons`

something onto the tree:

```
type family Cons (x :: a) (xs :: Tree ns a) = (ys :: Forest (Inc ns) a) | ys -> x xs where
Cons x Leaf = Root x SB1 Leaf Leaf
Cons x (Branch (Root y SB1 ls rs)) = Root x SB2 (Branch (Cons y rs)) ls
Cons x (Branch (Root y SB2 ls rs)) = Root x SB1 (Branch (Cons y rs)) ls
```

You’ll notice that we can again annotate this type family with injectivity.

So far all we have is a size-indexed tree. We want a *heterogeneous* tree, meaning that we must next construct a tree *indexed* by the previous tree. In order to do this, we’ll first need singletons on the type level:

```
type family Sing (x :: a) = (y :: The a x) | y -> x
type instance Sing B1 = SB1
type instance Sing B2 = SB2
```

This kind of nonsense we’re doing here is precisely the kind of thing obsolesced by dependent types, by the way. If you’re already doing type-level heavy stuff (as we are here) the extra power afforded by full dependent types often means that hacky special cases just turn into standard functions, greatly simplifying things like the above type families.

But anyway, back to the tree:

```
data HTree (xs :: Tree ns Type) where
HLeaf :: HTree Leaf
HNode :: x
-> !(The Bit b)
-> !(HTree ls)
-> !(HTree rs)
-> HTree (Branch (Root x (Sing b) ls rs))
```

And we can `cons`

on an element in much the same way we did with the homogeneous tree:

```
infixr 5 <:
(<:) :: x -> HTree xs -> HTree (Branch (Cons x xs))
<: HLeaf = HNode x SB1 HLeaf HLeaf
x <: HNode y SB1 yl yr = HNode x SB2 (y <: yr) yl
x <: HNode y SB2 yl yr = HNode x SB1 (y <: yr) yl x
```

The real use of this data structure is quick *indexing*. As with the previous functions, we will first need to construct the type-level version of what we want to do.

```
type family Lookup (i :: Star Bit) (xs :: Tree sz a) :: a where
Lookup Nil (Branch (Root x _ _ _)) = x
Lookup (Some (B1 :- i)) (Branch (Root _ _ ls _)) = Lookup i ls
Lookup (Some (B2 :- i)) (Branch (Root _ _ _ rs)) = Lookup i rs
```

While this function is partial, the value-level one should not be: it should be provably in-bounds for lookups. As a result we’ll need a slightly complex type to represent the indices:

```
data Position (xs :: Star Bit) (ys :: Star Bit) where
P0 :: Position Nil (Some ys)
P1 :: !(Position xs (Carry y ys)) -> Position (Some (B1 :- xs)) (Some (y :- ys))
P2 :: !(Position xs ys) -> Position (Some (B2 :- xs)) (Some (y :- ys))
```

A value of type `Position xs ys`

is actually a proof that `xs`

is smaller than `ys`

, but we’re using it here just as a pointer to an entry in the tree. Here’s the actual lookup function itself.

```
lookup :: forall is (ts :: Tree sz Type). Position is sz -> HTree ts -> Lookup is ts
lookup P0 (HNode x _ _ _) = x
lookup (P1 i) (HNode _ _ ls _) = lookup i ls
lookup (P2 i) (HNode _ _ _ rs) = lookup i rs
```

Just having pointers isn’t much use: we also need a way to build them. The key function here is `push`

: this increments the index pointed to by one.

```
infixr 5 ::-
data SPlus xs where
(::-) :: The a x -> The (Star a) xs -> SPlus (x :- xs)
data SStar xs where
Nily :: SStar Nil
Somy :: The (Plus a) xs -> SStar (Some xs)
type instance The (Plus a) = SPlus
type instance The (Star a) = SStar
instance Known Nil where sing = Nily
instance Known xs => Known (Some xs) where sing = Somy sing
instance (Known x, Known xs) => Known (x :- xs) where sing = sing ::- sing
```

```
push :: Known ys => Position xs ys -> Position (Some (Inc xs)) (Some (Inc ys))
= go p sing
push p where
go :: Position xs ys -> The (Star Bit) ys -> Position (Some (Inc xs)) (Some (Inc ys))
P0 (Somy (SB1 ::- _ )) = P1 P0
go P0 (Somy (SB2 ::- _ )) = P1 P0
go P2 i) (Somy (SB1 ::- ys)) = P1 (go i ys)
go (P2 i) (Somy (SB2 ::- ys)) = P1 (go i ys)
go (P1 i) (Somy (SB1 ::- _ )) = P2 i
go (P1 i) (Somy (SB2 ::- _ )) = P2 i go (
```

Everything above is pretty much all you need for many use cases, but it’s pretty ugly stuff. To actually use this thing as a generic tuple we’ll need a lot of quality-of-life improvements.

First of all, we should use type-level lists to indicate the tuple itself:

```
type family Length (xs :: [a]) :: Star Bit where
Length '[] = Nil
Length (_ : xs) = Some (Inc (Length xs))
type family FromList (xs :: [a]) = (ys :: Tree (Length xs) a) | ys -> xs where
FromList '[] = Leaf
FromList (x : xs) = Branch (Cons x (FromList xs))
```

```
type family Tuple (xs :: [Type]) = (ys :: Type) | ys -> xs where
Tuple xs = HTree (FromList xs)
```

Because the type family here is injective, we won’t get any of the usual weird errors when we use the type `Tuple [Bool,String]`

or whatever: passing that around will function almost exactly the same as passing around the tree representation itself directly.

```
example :: Tuple [Bool,String,Int,(),String]
= True <: "True" <: 1 <: () <: "T" <: HLeaf example
```

We can fold over the tree itself (using the Braun tree folding algorithm from a previous post) if every element in the tree conforms to some class. Using this we can generate a nice string representation of the tree.

`Show`

instance. ```
type family All (c :: a -> Constraint) (xs :: Tree ns a) :: Constraint where
All c Leaf = ()
All c (Branch (Root x _ ls rs)) = (c x, All c ls, All c rs)
newtype Q2 a
= Q2
unQ2 :: (Q2 a -> Q2 a) -> (Q2 a -> Q2 a) -> a
{
}
foldrTree :: forall c xs b. All c xs => (forall x. c x => x -> b -> b) -> b -> HTree xs -> b
= unQ2 (f @c g' n' t b) id id
foldrTree g' n' t where
f :: forall c' ys b'. All c' ys => (forall x. c' x => x -> b' -> b') -> b' -> HTree ys -> Q2 b' -> Q2 b'
HNode x _ l r) xs = Q2 (\ls rs -> g x (unQ2 xs (ls . f @c' g n l) (rs . f @c' g n r)))
f g n (HLeaf _ = Q2 (\_ _ -> n)
f _ n
= Q2 (\ls rs -> unQ2 (ls (rs b)) id id)
b
instance All Show xs => Show (HTree xs) where
showsPrec _ tr = showChar '(' . go (foldrTree @Show (\x xs -> shows x : xs) [] tr)
where
go :: [ShowS] -> ShowS
= showChar ')'
go [] :xs) = x . foldr (\y ys -> showChar ',' . y . ys) (showChar ')') xs go (x
```

```
>>> example
True,"True",1,(),"T") (
```

The approach used in Swierstra (2020) had a specific goal in mind: using the heterogeneous list to implement a lookup table for evaluating lambda calculus. As such, efficiently being able to “increment” an index was vital.

If we wanted to use the type as a generic tuple, though, we would have no such requirement. Instead, we might expect all accesses to be resolved and inlined at compile-time (as in Martinez, Viera, and Pardo 2013). We also would want a nice syntax for accessing parts of the tuple.

We can accomplish all of this with some type classes, as it happens. If we replace pattern-matching on data types with typeclass resolution we can be all but guaranteed that the function calls and so on will be inlined entirely at compile-time (we also would need to add INLINE pragmas to every instance, which I haven’t done here for readability’s sake). The main class we’ll use is the following:

```
class (xs :: Star Bit) < (ys :: Star Bit) where
pull :: forall (t :: Tree ys Type). HTree t -> Lookup xs t
```

```
instance Nil < Some ys where
HNode x _ _ _) = x
pull (
instance xs < ys => Some (B1 :- xs) < Some (B1 :- ys) where
HNode _ _ ls _) = pull @xs ls
pull (
instance xs < Some (Inc ys) => Some (B1 :- xs) < Some (B2 :- ys) where
HNode _ _ ls _) = pull @xs ls
pull (
instance xs < ys => Some (B2 :- xs) < Some (y :- ys) where
HNode _ _ _ rs) = pull @xs rs
pull (
instance TypeError (Text "Index out of range") => xs < Nil where
= error "unreachable"
pull
data Peano = Z | S Peano
type family FromPeano (n :: Peano) = (m :: Star Bit) | m -> n where
FromPeano Z = Nil
FromPeano (S n) = Some (Inc (FromPeano n))
type family FromLit (n :: Nat) :: Peano where
FromLit 0 = Z
FromLit n = S (FromLit (n - 1))
get :: forall n xs (t :: Tree xs Type). FromPeano (FromLit n) < xs
=> HTree t -> Lookup (FromPeano (FromLit n)) t
= pull @(FromPeano (FromLit n)) get
```

Some other details out of the way we get the following nice interface:

```
>>> get @4 example
"T"
```

You even get a type error for out-of-range indices:

`>>> get @7 example`

```
• Index out of range
• In the expression: get @7 example
```

Or we could even add a lens interface:

```
type family Replace (i :: Star Bit) (x :: a) (xs :: Tree sz a) :: Tree sz a where
Replace Nil x (Branch (Root _ b ls rs)) = Branch (Root x b ls rs)
Replace (Some (B1 :- i)) x (Branch (Root y b ls rs)) = Branch (Root y b (Replace i x ls) rs)
Replace (Some (B2 :- i)) x (Branch (Root y b ls rs)) = Branch (Root y b ls (Replace i x rs))
class (xs :: Star Bit) <! (ys :: Star Bit) where
index :: forall (t :: Tree ys Type) b. Lens (HTree t) (HTree (Replace xs b t)) (Lookup xs t) b
instance Nil <! Some ys where
index f (HNode x b ls rs) = fmap (\x' -> HNode x' b ls rs) (f x)
instance xs <! ys => Some (B1 :- xs) <! Some (B1 :- ys) where
index f (HNode x b ls rs) = fmap (\ls' -> HNode x b ls' rs) (index @xs f ls)
instance xs <! Some (Inc ys) => Some (B1 :- xs) <! Some (B2 :- ys) where
index f (HNode x b ls rs) = fmap (\ls' -> HNode x b ls' rs) (index @xs f ls)
instance xs <! ys => Some (B2 :- xs) <! Some (y :- ys) where
index f (HNode x b ls rs) = fmap (\rs' -> HNode x b ls rs') (index @xs f rs)
instance TypeError (Text "Index out of range") => xs <! Nil where
index = error "unreachable"
ind :: forall n xs (t :: Tree xs Type) a. FromPeano (FromLit n) <! xs
=> Lens (HTree t) (HTree (Replace (FromPeano (FromLit n)) a t)) (Lookup (FromPeano (FromLit n)) t) a
= index @(FromPeano (FromLit n)) ind
```

```
>>> over (ind @1) length example
True,4,1,(),"T") (
```

The approach I’ve taken here is actually a little unusual: in both Hinze (1998) and Swierstra (2020) the tree is defined as a *nested* data type. Let’s take a look at that approach, while also switching to Agda.

```
: Set
𝔹 = List Bool
𝔹
pattern 1ᵇ = false
pattern 2ᵇ = true
data When (A : Set a) : Bool → Set a where
: When A false
O⟨⟩ _⟩ : A → When A true
I⟨
infixl 4 _×2
record _×2 (A : Set a) : Set a where
constructor _,_
field
: A
fst snd open _×2
infixr 5 ⟨_⟩+_+2×_
data Array (A : Set a) : 𝔹 → Set a where
: Array A []
O _⟩+_+2×_ : ∀ {n ns} → A → When A n → Array (A ×2) ns → Array A (n ∷ ns) ⟨
```

The cons function here is really no more complex than the previous cons:

```
: 𝔹 → List Bool
inc = 1ᵇ ∷ []
inc [] (1ᵇ ∷ xs) = 2ᵇ ∷ xs
inc (2ᵇ ∷ xs) = 1ᵇ ∷ inc xs
inc
: ∀ {ns} → A → Array A ns → Array A (inc ns)
cons = ⟨ x ⟩+ O⟨⟩ +2× O
cons x O (⟨ x₂ ⟩+ O⟨⟩ +2× xs) = ⟨ x₁ ⟩+ I⟨ x₂ ⟩ +2× xs
cons x₁ (⟨ x₂ ⟩+ I⟨ x₃ ⟩ +2× xs) = ⟨ x₁ ⟩+ O⟨⟩ +2× cons (x₂ , x₃) xs cons x₁
```

But what I’m really interested in, again, is *indexing*. In particular, I’m interested in using an actual binary number to index into this structure, rather than the weird GADT we had to use in Haskell. One of the advantages of using full dependent types is that we can write functions like the following:

```
: ∀ is → Array A xs → is < xs → A
lookup = {!!} lookup
```

In other words, we can pass the proof term separately. This can help performance a little, but mainly it’s nice to use the actual number type one intended to use along with all of the functions we might use on that term.

So let’s get writing! The first thing to define is the proof of `<`

. I’m going to define it in terms of a boolean function on the bits themselves, i.e.:

```
_<ᴮ_ : 𝔹 → 𝔹 → Bool
_<ᴮ_ = {!!}
: Bool → Set
T = ⊤
T true = ⊥
T false
_<_ : 𝔹 → 𝔹 → Set
= T (x <ᴮ y) x < y
```

This will mean the proofs themselves are easy to pass around without modification. In fact, we can go further and have the compiler *definitionally* understand that the proof of `x < y`

is proof irrelevant, with Agda’s `Prop`

.

```
record ⊤ : Prop where constructor tt
data ⊥ : Prop where
: Bool → Prop
T = ⊤
T true = ⊥
T false
_<_ : 𝔹 → 𝔹 → Prop
= T (x <ᴮ y) x < y
```

Next, the functions which compute the actual comparison.

```
_&_≲ᵇ_ : Bool → Bool → Bool → Bool
= s or y
s & false ≲ᵇ y = s and y
s & true ≲ᵇ y
_&_≲ᴮ_ : Bool → 𝔹 → 𝔹 → Bool
= s
s & [] ≲ᴮ [] (y ∷ ys) = true
s & [] ≲ᴮ (x ∷ xs) ≲ᴮ [] = false
s & (x ∷ xs) ≲ᴮ (y ∷ ys) = (s & x ≲ᵇ y) & xs ≲ᴮ ys
s &
_<ᴮ_ _≤ᴮ_ : 𝔹 → 𝔹 → Bool
_<ᴮ_ = false &_≲ᴮ_
_≤ᴮ_ = true &_≲ᴮ_
```

These functions combine the definitions of `≤`

and `<`

, and do them both at once. We pass whether the comparison is non-strict or not as the first parameter: this is worth doing since both `<`

and `≤`

can be defined in terms of each other:

```
(1ᵇ ∷ xs) < (2ᵇ ∷ ys) = xs ≤ ys
(2ᵇ ∷ xs) ≤ (1ᵇ ∷ ys) = xs < ys
...
```

Finally the function itself:

```
: ∀ {b} → When A b → A ×2 → A
sel-bit {b = 1ᵇ} _ = snd
sel-bit {b = 2ᵇ} _ = fst
sel-bit
mutual
: ∀ xs {ys} → Array A ys → xs < ys → A
index (⟨ x ⟩+ _ +2× _ ) p = x
index [] (1ᵇ ∷ is) (⟨ _ ⟩+ x +2× xs) p = index₂ is x xs p
index (2ᵇ ∷ is) (⟨ _ ⟩+ x +2× xs) p = sel-bit x (index is xs p)
index
: ∀ xs {y ys} → When A y → Array (A ×2) ys → 1ᵇ ∷ xs < y ∷ ys → A
index₂ = fst (index is xs p)
index₂ is O⟨⟩ xs p = x
index₂ [] I⟨ x ⟩ xs p (i ∷ is) I⟨ _ ⟩ xs p = snd (index₃ i is xs p)
index₂
: ∀ x xs {ys} → Array A ys → x ∷ xs ≤ ys → A
index₃ (⟨ _ ⟩+ x +2× xs) p = index₂ is x xs p
index₃ 2ᵇ is (⟨ x ⟩+ _ +2× _ ) p = x
index₃ 1ᵇ [] (i ∷ is) (⟨ _ ⟩+ x +2× xs) p = sel-bit x (index₃ i is xs p) index₃ 1ᵇ
```

I think Braun trees are a fascinating data structure with lots of interesting aspects. In practice they tend to be much slower than other comparable structures, but they’re extremely simple and have many properties which make them particularly well-suited to type-level programming.

Hinze, Ralf. 1998. *Numerical Representations as Higher-Order Nested Datatypes*. Institut für Informatik III, Universität Bonn.

Martinez, Bruno, Marcos Viera, and Alberto Pardo. 2013. “Just Do It While Compiling!: Fast Extensible Records in Haskell.” In *Proceedings of the ACM SIGPLAN 2013 Workshop on Partial Evaluation and Program Manipulation - PEPM ’13*, 77. Rome, Italy: ACM Press. doi:10.1145/2426890.2426908.

Swierstra, Wouter. 2020. “Heterogeneous Binary Random-Access Lists.” *Journal of Functional Programming* 30: e10. doi:10.1017/S0956796820000064.

Part 9 of a 10-part series on Breadth-First Traversals

Tags: Haskell

This post will be quite light on details: I’m trying to gather up all of the material in this series to be a chapter in my Master’s thesis, so I’m going to leave the heavy-duty explanations and theory for that. Once finished I will probably do a short write up on this blog.

That said, the reason I’m writing this post is that in writing my thesis I figured out a nice way to solve the problem I first wrote about in this post. I won’t restate it in its entirety, but basically we’re looking for a function with the following signature:

`bft :: Applicative f => (a -> f b) -> Tree a -> f (Tree b)`

Seasoned Haskellers will recognise it as a “traversal”. However, this shouldn’t be an ordinary traversal: that, after all, can be derived automatically by the compiler these days. Instead, the Applicative effects should be evaluated in *breadth-first* order. To put it another way, if we have a function which lists the elements of a tree in breadth-first order:

`bfs :: Tree a -> [a]`

Then we should have the following identity:

`-> ([x], x)) t = (bfs t, t) bft (\x `

Using the writer Applicative with the list monoid here as a way to talk about ordering of effects.

There are many solutions to the puzzle (see Gibbons 2015; or Easterly 2019, or any of the posts in this series), but I had found them mostly unsatisfying. They basically relied on enumerating the tree in breadth-first order, running the traversal on the intermediate list, and then rebuilding the tree. It has the correct time complexity and so on, but it would be nice to deforest the intermediate structure a little bit more.

Anyways, the function I finally managed to get is the following:

```
bft :: Applicative f => (a -> f b) -> Tree a -> f (Tree b)
:& xs) = liftA2 (:&) (f x) (bftF f xs)
bft f (x
bftF :: Applicative f => (a -> f b) -> [Tree a] -> f [Tree b]
= fmap head . foldr (<*>) (pure []) . foldr f [pure ([]:)]
bftF t where
:& xs) (q : qs) = liftA2 c (t x) q : foldr f (p qs) xs
f (x
= [pure ([]:)]
p [] :xs) = fmap (([]:).) x : xs
p (x
: ks) = ((x :& xs) : y) : ys
c x k (xs where (y : ys) = k ks
```

The `Tree`

is defined like so:

`data Tree a = a :& [Tree a]`

It has all the right properties (complexity, etc.), and if you stick tildes before every irrefutable pattern-match it is also maximally lazy.

As a bonus, here’s another small function I looked at for my thesis. It performs a topological sort of a graph.

```
type Graph a = a -> [a]
topoSort :: Ord a => Graph a -> [a] -> [a]
= fst . foldr f ([], ∅)
topoSort g where
f x (xs,s) | x ∈ s = (xs,s)
| x ∉ s = first (x:) (foldr f (xs, {x} ∪ s) (g x))
```

*Patterns in Functional Programming*. https://patternsinfp.wordpress.com/2015/03/05/breadth-first-traversal/.

Tags: Haskell

Just a short one again today!

There’s an excellent talk by Kenneth Foner at Compose from 2016 which goes through a paper by Danvy and Goldberg (2005) called “There and Back Again” (or TABA). You should watch the talk and read the paper if you’re in any way excited by the weird and wonderful algorithms we use in functional languages to do simple things like reversing a list.

The function focused on in the paper is one which does the following:

```
zipRev :: [a] -> [b] -> [(a,b)]
= zip xs (reverse ys) zipRev xs ys
```

But does it in one pass, *without* reversing the second list. It uses a not-insignificant bit of cleverness to do it, but you can actually arrive at the same solution in a pretty straightforward way by aggressively converting everything you can to a fold. The result is the following:

```
zipRev :: [a] -> [b] -> [(a,b)]
= foldl f b ys xs
zipRev xs ys where
= []
b _ :xs) = (x,y) : k xs f k y (x
```

I have written a little more on this function and the general technique before.

The talk goes through the same stuff, but takes a turn then to proving the function total: our version above won’t work correctly if the lists don’t have the same length, so it would be nice to provide that guarantee in the types somehow. Directly translating the version from the TABA paper into one which uses length-indexed vectors will require some nasty, expensive proofs, though, which end up making the whole function quadratic. The solution in the talk is to call out to an external solver which gives some extremely slick proofs (and a very nice interface). However, yesterday I realised you needn’t use a solver at all: you can type the Haskell version just fine, and you don’t even need the fanciest of type-level features.

As ever, the solution is another fold. To demonstrate this rather short solution, we’ll first need the regular toolbox of types:

```
data Nat = Z | S Nat
data Vec (a :: Type) (n :: Nat) where
Nil :: Vec a Z
(:-) :: a -> Vec a n -> Vec a (S n)
```

And now we will write a length-indexed left fold on this vector. The key trick here is that the type passed in the recursive call changes, by composition:

`newtype (:.:) (f :: b -> Type) (g :: a -> b) (x :: a) = Comp { unComp :: f (g x) }`

Safe coercions will let us use the above type safely without a performance hit, resulting in the following linear-time function:

```
foldlVec :: forall a b n. (forall m. a -> b m -> b (S m)) -> b Z -> Vec a n -> b n
Nil = b
foldlVec f b :- xs) = unComp (foldlVec (c f) (Comp (f x b)) xs)
foldlVec f b (x where
c :: (a -> b (S m) -> b (S (S m))) -> (a -> (b :.: S) m -> (b :.: S) (S m))
= coerce
c {-# INLINE c #-}
```

We can use this function to write vector reverse:

```
reverseVec :: Vec a n -> Vec a n
= foldlVec (:-) Nil reverseVec
```

Now, to write the reversing zip, we need another newtype to put the parameter in the right place, but it is straightforward other than that.

```
newtype VecCont a b n = VecCont { runVecCont :: Vec a n -> Vec (a,b) n }
revZip :: Vec a n -> Vec b n -> Vec (a,b) n
= flip $ runVecCont .
revZip
foldlVec-> VecCont (\(x :- xs) -> (x,y) :- runVecCont k xs))
(\y k VecCont (const Nil)) (
```

Danvy, Olivier, and Mayer Goldberg. 2005. “There and Back Again.” *BRICS Report Series* 12 (3). doi:10.7146/brics.v12i3.21869. https://tidsskrift.dk/brics/article/view/21869.

Part 8 of a 10-part series on Breadth-First Traversals

{-# OPTIONS --cubical --sized-types #-} module Post where open import ../code/terminating-tricky-traversals/Post.Prelude

Just a short one today. I’m going to look at a couple of algorithms for breadth-first traversals with complex termination proofs.

In a previous post I talked about breadth-first traversals over graphs, and the difficulties that cycles cause. Graphs are especially tricky to work with in a purely functional language, because so many of the basic algorithms are described in explicitly mututing terms (i.e. “mark off a node as you see it”), with no obvious immutable translation. The following is the last algoirthm I came up with:

```
bfs :: Ord a => (a -> [a]) -> a -> [[a]]
= takeWhile (not.null) (map fst (fix (f r . push)))
bfs g r where
= ([],Set.empty) : [ ([],seen) | (_,seen) <- xs ]
push xs @((l,s):qs)
f x q| Set.member x s = q
| otherwise = (x:l, Set.insert x s) : foldr f qs (g x)
```

As difficult as it is to work with graphs in a pure functional language, it’s even *more* difficult to work in a *total* language, like Agda. Looking at the above function, there are several bits that we can see right off the bat won’t translate over easily. Let’s start with `fix`

.

We shouldn’t expect to be able to write `fix`

in Agda as-is. Just look at its Haskell implementation:

```
fix :: (a -> a) -> a
= f (fix f) fix f
```

It’s obviously non total!

(this is actually a non-memoizing version of `fix`

, which is different from the usual one)

We can write a function *like* `fix`

, though, using coinduction and sized types.

record Thunk (A : Size → Type a) (i : Size) : Type a where coinductive field force : ∀ {j : Size< i} → A j open Thunk public fix : (A : Size → Type a) → (∀ {i} → Thunk A i → A i) → ∀ {j} → A j fix A f = f λ where .force → fix A f

Coinductive types are the dual to inductive types. Totality-wise, a coinductive type must be “productive”; i.e. a coinductive list can be infinitely long, but it must be provably able to evaluate to a constructor (cons or nil) in finite time.

Sized types also help us out here: they’re quite subtle, and a little finicky to use occasionally, but they are invaluable when it comes to proving termination or productivity of complex (especially higher-order) functions. The canonical example is mapping over the following tree type:

module NonTerminating where data Tree (A : Type a) : Type a where _&_ : A → List (Tree A) → Tree A {-# TERMINATING #-} mapTree : (A → B) → Tree A → Tree B mapTree f (x & xs) = f x & map (mapTree f) xs

The compiler can’t tell that the recursive call in the `mapTree`

function will only be called on subnodes of the argument: it can’t tell that it’s structurally recursive, in other words. Annoyingly, we can fix the problem by inlining `map`

.

mutual mapTree′ : (A → B) → Tree A → Tree B mapTree′ f (x & xs) = f x & mapForest f xs mapForest : (A → B) → List (Tree A) → List (Tree B) mapForest f [] = [] mapForest f (x ∷ xs) = mapTree′ f x ∷ mapForest f xs

The other solution is to give the tree a size parameter. This way, all submodes of a given tree will have smaller sizes, which will give the compiler a finite descending chain condition it can use to prove termination.

data Tree (A : Type a) (i : Size) : Type a where _&_ : A → ∀ {j : Size< i} → List (Tree A j) → Tree A i mapTree : (A → B) → Tree A i → Tree B i mapTree f (x & xs) = f x & map (mapTree f) xs

So how do we use this stuff in our graph traversal? Well first we’ll need a coinductive Stream type:

record Stream (A : Type a) (i : Size) : Type a where coinductive field head : A tail : ∀ {j : Size< i} → Stream A j open Stream public smap : (A → B) → Stream A i → Stream B i smap f xs .head = f (xs .head) smap f xs .tail = smap f (xs .tail)

And then we can use it to write our breadth-first traversal.

bfs : ⦃ _ : IsDiscrete A ⦄ → (A → List A) → A → Stream (List A) i bfs g r = smap fst (fix (Stream _) (f r ∘ push)) where push : Thunk (Stream _) i → Stream _ i push xs .head = ([] , []) push xs .tail = smap (_,_ [] ∘ snd) (xs .force) f : _ → Stream _ i → Stream _ i f x qs with (x ∈? qs .head .snd) .does ... | true = qs ... | false = λ where .head → (x ∷ qs .head .fst , x ∷ qs .head .snd) .tail → foldr f (qs .tail) (g x)

How do we convert this to a list of lists? Well, for this condition we would actually need to prove that there are only finitely many elements in the graph. We could actually use Noetherian finiteness for this: though I have a working implementation, I’m still figuring out how to clean this up, so I will leave it for another post.

A recent paper (Nipkow and Sewell 2020) provided Coq proofs for some algorithms on Braun trees (Okasaki 1997), which prompted me to take a look at them again. This time, I came up with an interesting linear-time `toList`

function, which relies on the following peculiar type:

```
newtype Q2 a
= Q2
unQ2 :: (Q2 a -> Q2 a) -> (Q2 a -> Q2 a) -> a
{ }
```

Even after coming up with the type myself, I still can’t really make heads nor tails of it. If I squint, it starts to look like some bizarre church-encoded binary number (but I have to *really* squint). It certainly seems related to corecursive queues (Smith 2009).

Anyway, we can use the type to write the following lovely `toList`

function on a Braun tree.

```
toList :: Tree a -> [a]
= unQ2 (f t b) id id
toList t where
Node x l r) xs = Q2 (\ls rs -> x : unQ2 xs (ls . f l) (rs . f r))
f (Leaf xs = Q2 (\_ _ -> [])
f
= Q2 (\ls rs -> unQ2 (ls (rs b)) id id) b
```

So can we convert it to Agda?

Not really! As it turns out, this function is even more difficult to implement than one might expect. We can’t even *write* the `Q2`

type in Agda without getting in trouble.

{-# NO_POSITIVITY_CHECK #-} record Q2 (A : Type a) : Type a where inductive field q2 : (Q2 A → Q2 A) → (Q2 A → Q2 A) → A open Q2

`Q2`

isn’t strictly positive, unfortunately.

{-# TERMINATING #-} toList : Braun A → List A toList t = f t n .q2 id id where n : Q2 A n .q2 ls rs = ls (rs n) .q2 id id f : Braun A → Q2 (List A) → Q2 (List A) f leaf xs .q2 ls rs = [] f (node x l r) xs .q2 ls rs = x ∷ xs .q2 (ls ∘ f l) (rs ∘ f r)

Apparently this problem of strict positivity for breadth-first traversals has come up before: Berger, Matthes, and Setzer (2019); Hofmann (1993).

Update 31/01/2020

Daniel Peebles (@copumpkin on twitter) replied to my tweet about this post with the following:

Interesting! Curious

howyou came up with that weird type at the end. It doesn’t exactly feel like the first thing one might reach for and it would be interesting to see some writing on the thought process that led to it

So that’s what I’m going to add here!

Let’s take the Braun tree of the numbers 1 to 15:

```
┌8
┌4┤
│ └12
┌2┤
│ │ ┌10
│ └6┤
│ └14
1┤
│ ┌9
│ ┌5┤
│ │ └13
└3┤
│ ┌11
└7┤
└15
```

Doing a normal breadth-first traversal for the first two levels is fine (1, 2, 3): it starts to fall apart at the third level (4, 6, 5, 7). Here’s the way we should traverse it: “all of the left branches, and then all of the right branches”. So, we will have a queue of trees. We take the root element of each tree in the queue, and emit it, and then we add all of the *left* children of the trees in the queue to one queue, and then all the *right* children to another, and then concatenate them into a new queue and we start again. We can stop whenever we hit an empty tree because of the structure of the Braun tree. Here’s an ascii diagram to show what’s going on:

```
┌8 | ┌8 | ┌8 | 8
┌4┤ | ┌4┤ | 4┤ |
│ └12 | │ └12 | └12 | 9
┌2┤ | 2┤ | |
│ │ ┌10 | │ ┌10 | ┌9 | 10
│ └6┤ | └6┤ | 5┤ |
│ └14 | └14 | └13 | 11
1┤ --> -----> -------->
│ ┌9 | ┌9 | ┌10 | 12
│ ┌5┤ | ┌5┤ | 6┤ |
│ │ └13 | │ └13 | └14 | 13
└3┤ | 3┤ | |
│ ┌11 | │ ┌11 | ┌11 | 14
└7┤ | └7┤ | 7┤ |
└15 | └15 | └15 | 15
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
```

If we want to do this in Haskell, we have a number of options for how we would represent queues: as ever, though, I much prefer to use vanilla lists and time the reversals so that they stay linear. Here’s what that looks like:

```
toList :: Tree a -> [a]
= f t b [] []
toList t where
Node x l r) xs ls rs = x : xs (l : ls) (r : rs)
f (Leaf _ _ _ = []
f
= foldr f b (reverse ls ++ reverse rs) [] [] b ls rs
```

Any place we see a `foldr`

being run after a reverse or a concatenation, we know that we can remove a pass (in actual fact rewrite rules will likely do this automatically for us).

```
toList :: Tree a -> [a]
= f b t [] []
toList t where
Node x l r) xs ls rs = x : xs (l : ls) (r : rs)
f (Leaf _ _ _ = []
f
= foldl (flip f) (foldl (flip f) b rs) ls [] [] b ls rs
```

Finally, since we’re building up the lists with `:`

(in a linear way, i.e. we will not use the intermediate queues more than once), and we’re immediately consuming them with a fold, we can deforest the intermediate list, replacing every `:`

with `f`

(actually, it’s a little more tricky than that, since we replace the `:`

with the *reversed* version of `f`

, i.e. the one you would pass to `foldr`

if you wanted it to act like `foldl`

. This trick is explained in more detail in this post).

```
toList :: Tree a -> [a]
= f t b id id
toList t where
Node x l r) xs ls rs = x : xs (ls . f l) (rs . f r)
f (Leaf _ _ _ = []
f
= ls (rs b) id id b ls rs
```

Once you do that, however, you run into the “cannot construct the infinite type” error. To be precise:

`• Occurs check: cannot construct the infinite type: a3 ~ (a3 -> c0) -> (a3 -> c1) -> [a2]`

And this gives us the template for our newtype! It requires some trial and error, but you can see where some of the recursive calls are, and what you eventually get is the following:

```
newtype Q2 a
= Q2
unQ2 :: (Q2 a -> Q2 a) -> (Q2 a -> Q2 a) -> [a]
{ }
```

(You can remove the list type constructor at the end, I did as I thought it made it slightly more general). And from there we get back to the `toList`

function.

*24th International Conference on Types for Proofs and Programs (TYPES 2018)*, ed by. Peter Dybjer, José Espírito Santo, and Luís Pinto, 130:22. Leibniz International Proceedings in Informatics (LIPIcs). Dagstuhl, Germany: Schloss DagstuhlLeibniz-Zentrum fuer Informatik. doi:10.4230/LIPIcs.TYPES.2018.1. http://drops.dagstuhl.de/opus/volltexte/2019/11405.

Nipkow, Tobias, and Thomas Sewell. 2020. “Proof Pearl: Braun Trees.” In *Certified Programs and Proofs, CPP 2020*, ed by. J. Blanchette and C. Hritcu, –. ACM. http://www21.in.tum.de/~nipkow/pubs/cpp20.html.

Okasaki, Chris. 1997. “Three Algorithms on Braun Trees.” *Journal of Functional Programming* 7 (6) (November): 661–666. doi:10.1017/S0956796897002876. https://www.eecs.northwestern.edu/~robby/courses/395-495-2013-fall/three-algorithms-on-braun-trees.pdf.

*The Monad.Reader* 14 (14) (July): 28. https://meldingmonads.files.wordpress.com/2009/06/corecqueues.pdf.

In dependently typed languages, it’s often important to figure out a good “low-level” representation for some concept. The natural numbers, for instance:

`data Nat = Z | S Nat`

For “real” applications, of course, these numbers are offensively inefficient, in terms of both space and time. But that’s not what I’m after here: I’m looking for a type which best describes the essence of the natural numbers, and that can be used to prove and think about them. In that sense, this representation is second to none: it’s basically the simplest possible type which *can* represent the naturals.

Let’s nail down that idea a little better. What do we mean when a type is a “good” representation for some concept.

There should be no redundancy. The type for the natural numbers above has this property: every natural number as one (and only one) canonical representative in

`Nat`

. Compare that to the following possible representation for the integers:`data Int = Neg Nat | Pos Nat`

There are two ways to represent

`0`

here: as`Pos Z`

or`Neg Z`

.Of course, you can quotient out the redundancy in Cubical Agda, or normalise on construction every time, but either of these workarounds gets your representation a demerit.

Operations should be definable simply and directly on the representation. Points docked for converting to and from some non-normalised form.

That conversion, however, can exist, and ideally should exist, in some fundamental way. You should be able to establish an efficient isomorphism with other representations of the same concept.

Properties about the type should correspond to intuitive properties about the representation. For

`Nat`

above, this means things like order: the usual order on the natural numbers again has a straightforward analogue on`Nat`

.

With that laundry list of requirements, it’s no wonder that it’s often tricky to figure out the “right” type for a concept.

In this post, I’m going to talk about a type for the rational numbers, and I’m going to try satisfy those requirements as best I can.

Our first attempt at representing the rationals might use a fraction:

`data Frac = Integer :/ Integer`

This obviously fails the redundancy property. The fractions $\frac{1}{2}$ and $\frac{2}{4}$ represent the same number, but have different underlying values.

So the type isn’t suitable as a potential representation for the rationals. That’s not to say that this type is useless: far from it! Indeed, Haskell’s Data.Ratio uses something quite like this to implement rationals.

If you’re going to deal with redundant elements, there are two broad ways to deal with it. Data.Ratio’s approach is to normalise on construction, and only export a constructor which does this. This gives you a pretty good guarantee that there won’t be any unreduced fractions lying around in you program. Agda’s standard library also uses an approach like this, although the fact that the numerator and denominator are coprime is statically verified by way of a proof carried in the type.

The other way to deal with redundancy is by quotient. In Haskell, that kind of means doing the following:

```
instance Eq Frac where
:/ xd) == (y :/ yd) = (x * yd) == (y * xd)
(x
instance Ord Frac where
compare (x :/ xd) (y :/ yd) = compare (x * yd) (y * xd)
```

We don’t have real quotient types in Haskell, but this gets the idea across: we haven’t normalised our representation internally, but as far as anyone *using* the type is concerned, they shouldn’t be able to tell the difference between $\frac{1}{2}$ and $\frac{2}{4}$.

Th `Num`

instance is pretty much just a restating of the axioms for fractions.

`Num`

instance for `Frac`

. ```
instance Num Frac where
fromInteger n = n :/ 1
:/ xd) * (y :/ yd) = (x * y) :/ (xd * yd)
(x :/ xd) + (y :/ yd) = (x * yd + y * xd) :/ (xd * yd)
(x signum (n :/ d) = signum (n * d) :/ 1
abs n = signum n * n
:/ xd) - (y :/ yd) = (x * yd - y * xd) :/ (xd * yd) (x
```

Cubical Agda, of course, *does* have real quotient types. There, the `Eq`

instance becomes a path constructor.

```
data ℚ : Type₀ where
_÷_ : (n d : ℕ) → ℚ
: ∀ xⁿ xᵈ yⁿ yᵈ →
reduce →
xⁿ ℕ* yᵈ ≡ yⁿ ℕ* xᵈ xⁿ ÷ xᵈ ≡ yⁿ ÷ yᵈ
```

But we’ll leave the Agda stuff for another post.

Now we get to the cool stuff. To reduce a fraction, we usually do something like getting the greatest common divisor of each operand. One nice way to do that is to use Euclid’s algorithm:

```
gcd :: Natural -> Natural -> Natural
gcd n m = case compare n m of
EQ -> n
LT -> gcd n (m - n)
GT -> gcd (n - m) m
```

Let’s run that function on three different inputs: $\frac{2}{3}$, $\frac{4}{6}$, and $\frac{5}{6}$.

```
gcd 2 3 => case compare 2 3 of
LT -> gcd 2 (3 - 2) => case compare 2 1 of
GT -> gcd (2 - 1) 1 => case compare 1 1 of
EQ -> 1
gcd 4 6 => case compare 4 6 of
LT -> gcd 4 (6 - 4) => case compare 4 2 of
GT -> gcd (4 - 2) 2 => case compare 2 2 of
EQ -> 2
gcd 5 6 => case compare 5 6 of
LT -> gcd 5 (6 - 5) => case compare 5 1 of
GT -> gcd (5 - 1) 1 => case compare 4 1 of
GT -> gcd (4 - 1) 1 => case compare 3 1 of
GT -> gcd (3 - 1) 1 => case compare 2 1 of
GT -> gcd (2 - 1) 1 => case compare 1 1 of
EQ -> 1
```

Those all return the right things, but that’s not what’s interesting here: look at the chain of comparison results. For the two fractions which are equivalent, their *chains* are equal.

This turns out to hold in general. Every rational number can be (uniquely!) represented as a list of bits, where each bit is a comparison result from Euclid’s algorithm.

```
data Bit = O | I
type Rational = [Bit]
abs :: Frac -> Rational
abs = unfoldr f
where
:/ d) = case compare n d of
f (n EQ -> Nothing
LT -> Just (O, n :/ (d - n))
GT -> Just (I, (n - d) :/ d)
```

And since we used `unfoldr`

, it’s easy to reverse the algorithm to convert from the representation to a pair of numbers.

```
rep :: Rational -> Frac
= foldr f (1 :/ 1)
rep where
I (n :/ d) = (n + d) :/ d
f O (n :/ d) = n :/ (n + d) f
```

Now `abs . rep`

is the identity function, and `rep . abs`

reduces a fraction! We have identified an isomorphism between our type (a list of bits) and the rational numbers!

Well, between the positive rational numbers. Not to worry: we can add a sign before it. And, because our type doesn’t actually include 0, we don’t get the duplicate 0 problems we did with `Int`

.

```
data Q
= Neg Rational
| Zero
| Pos Rational
```

We can also define some operations on the type, by converting back and forth.

```
instance Num Rational where
fromInteger n = abs (n :/ 1)
+ ys = abs (rep xs + rep ys)
xs * ys = abs (rep xs * rep ys)
xs - ys = abs (rep xs - rep ys) xs
```

So we have a construction that has our desired property of canonicity. Even better, there’s a reasonably efficient algorithm to convert to and from it! Our next task will be examining the representation itself, and seeing what information we can get from it.

To do so we’ll turn to the subject of the title of this post: the Stern-Brocot tree.

This tree, pictured above, has some incredible properties:

- It contains every rational number (in reduced form) exactly once.
- It is a binary search tree.

Both of these properties make it an excellent candidate for basing a representation on. As it turns out, that’s what we already did! Our list of bits above is precisely a path into the Stern-Brocot tree, where every `O`

is a left turn and every `I`

right.

The most important fact we’ve gleaned so far from the Stern-Brocot tree is that our representation is lexicographically ordered. While that may not seem like much, it turns our list of bits into a progressively-narrowing interval, which generates more and more accurate estimates of the true value. When we see a `O`

at the head of the list, we know that the result must be smaller than `1`

; what follows will tell us on what side of $\frac{1}{2}$ the answer lies, and so on.

This turns out to be quite a useful property: we often don’t need *exact* precision for some calculation, but rather some approximate answer. It’s even rarer still that we know exactly how much precision we need for a given expression (which is what floating point demands). Usually, the precision we need changes quite dynamically. If a particular number plays a more influential role in some expression, for instance, its precision is more important than the others!

By producing a lazy list of bits, however, we can allow the *consumer* to specify the precision they need, by demanding those bits as they go along. (In the literature, this kind of thing is referred to as “lazy exact arithmetic”, and it’s quite fascinating. The representation presented here, however, is not very suitable for any real computation: it’s incredibly slow. There is a paper on the topic: Niqui (2007), which examines the Stern-Brocot numbers in Coq).

In proofs, the benefit is even more pronounced: finding out that a number is in a given range by just inspecting the first element of the list gives an excellent recursion strategy. We can do case analysis on: “what if it’s 1”, “what if it’s less than 1”, and “what if it’s greater than 1”, which is quite intuitive.

There’s one problem: our evaluation function is defined as a `foldr`

, and forces the accumulator at every step. We will need to figure out another evaluator which folds from the left.

So let’s look more at the “interval” interpretation of the Stern-Brocot tree. The first interval is $\left(\frac{0}{1},\frac{1}{0}\right)$: neither of these values are actually members of the type, which is why we’re not breaking any major rules with the $\frac{1}{0}$. To move left (to $\frac{1}{2}$ in the diagram), we need to use a peculiar operation called “child’s addition”, often denoted with a $\oplus$.

$\frac{a}{b} \oplus \frac{c}{d} = \frac{a+c}{b+d}$

The name comes from the fact that it’s a very common mistaken definition of addition on fractions.

Right, next steps: to move *left* in an interval, we do the following:

$\text{left} \left(\mathit{lb},\mathit{ub} \right) = \left( \mathit{lb}, \mathit{lb} \oplus \mathit{ub} \right)$

In other words, we narrow the right-hand-side of the interval. To move right is the opposite:

$\text{right} \left(\mathit{lb},\mathit{ub} \right) = \left( \mathit{lb} \oplus \mathit{ub} , \mathit{ub} \right)$

And finally, when we hit the end of the sequence, we take the *mediant* value.

$\text{mediant}\left(\mathit{lb} , \mathit{ub}\right) = \mathit{lb} \oplus \mathit{rb}$

From this, we get a straightforward left fold which can compute our fraction.

```
6 :-:
infix data Interval
= (:-:)
lb :: Frac
{ ub :: Frac
,
}
mediant :: Interval -> Frac
:/ d :-: a :/ c) = (a+b) :/ (c+d)
mediant (b
right :: Interval -> Interval
left,= lb x :-: mediant x
left x = mediant x :-: ub x
right x
rep' :: [Bit] -> Frac
= mediant . foldl f ((0 :/ 1) :-: (1 :/ 0))
rep' where
I = right a
f a O = left a f a
```

Before diving in and using this new evaluator to incrementalise our functions, let’s take a look at what’s going on behind the scenes of the “interval narrowing” idea.

It turns out that the “interval” is really a $2\times2$ square matrix in disguise (albeit a little reordered).

$\left( \frac{a}{b} , \frac{c}{d} \right) = \left( \begin{matrix} c & a \\ d & b \end{matrix} \right)$

Seen in this way, the beginning interval—$\left(\frac{0}{1} , \frac{1}{0}\right)$—is actually the identity matrix. Also, the two values in the second row of the tree correspond to special matrices which we will refer to as $L$ and $R$.

$L = \left( \begin{matrix} 1 & 0 \\ 1 & 1 \end{matrix} \right) \; R = \left( \begin{matrix} 1 & 1 \\ 0 & 1 \end{matrix} \right)$

It turns out that the left and right functions we defined earlier correspond to multiplication by these matrices.

$\text{left}(x) = xL$ $\text{right}(x) = xR$

Since matrix multiplication is associative, what we have here is a monoid. `mempty`

is the open interval at the beginning, and `mappend`

is matrix multiplication. This is the property that lets us incrementalise the whole thing, by the way: associativity allows us to decide when to start and stop the calculation.

We now have all the parts we need. First, we will write an evaluator that returns increasingly precise intervals. Our friend `scanl`

fits the requirement precisely.

```
approximate :: [Bit] -> [Interval]
= scanl f mempty
approximate where
I = right i
f i O = left i f i
```

Next, we will need to combine two of these lists with some operation on fractions.

```
interleave :: (Frac -> Frac -> Frac)
-> [Interval]
-> [Interval]
-> [Interval]
*) [xi] ys = map (\y -> x * lb y :-: x * ub y) ys
interleave (where x = mediant xi
*) (x:xs) ys@(y:_) =
interleave (*) `on` lb) x y :-: ((*) `on` ub) x y) : interleave (*) ys xs (((
```

The operation must respect orders in the proper way for this to be valid.

This pops one bit from each list in turn: one of the many possible optimisations would be to pull more information from the more informative value, in some clever way.

Finally, we have a function which incrementally runs some binary operator lazily on a list of bits.

```
quad :: (Frac -> Frac -> Frac)
-> [Bit]
-> [Bit]
-> [Bit]
*) xs ys = foldr f (unfoldr p) zs mempty
quad (where
= (interleave (*) `on` approximate) xs ys
zs
f x xs c| mediant c < lb x = I : f x xs (right c)
| mediant c > ub x = O : f x xs (left c)
| otherwise = xs c
= mediant (last zs)
t
= case compare (mediant c) t of
p c LT -> Just (I, right c)
GT -> Just (O, left c)
EQ -> Nothing
```

The function only ever inspects the next bit when it absolutely needs to.

The helper function `f`

here is the “incremental” version. `p`

takes over when the precision of the input is exhausted.

We can use this to write an addition function (with some added special cases to speed things up).

```
add :: [Bit] -> [Bit] -> [Bit]
= I : ys
add [] ys = I : xs
add xs [] I:xs) ys = I : add xs ys
add (I:ys) = I : add xs ys
add xs (= quad (+) xs ys add xs ys
```

We (could) also try and optimise the times we look for a new bit. Above we have noticed every case where one of the rationals is preceded by a whole part. After you encounter two `O`

s, in addition if the two strings are inverses of each other the result will be 1. i.e. `OOIOOI`

+ `OIOIIO`

= $\frac{1}{1}$. We could try and spot this, only testing with comparison of the mediant when the bits are the same. You’ve doubtless spotted some other possible optimisations: I have yet to look into them!

One of the other applications of lazy rationals is that they can begin to *look* like the real numbers. For instance, the `p`

helper function above is basically defined extensionally. Instead of stating the value of the number, we give a function which tells us when we’ve made something too big or too small (which sounds an awful lot like a Dedekind cut to my ears). Here’s a function which *inverts* a given function on fractions, for instance.

```
inv :: (Frac -> Frac) -> [Bit] -> [Bit]
= unfoldr f mempty
inv o n where
= fromQ n
t
= case compare (o (mediant c)) t of
f c LT -> Just (I, right c)
GT -> Just (O, left c)
EQ -> Nothing
```

Of course, the function has to satisfy all kinds of extra properties that I haven’t really thought a lot about yet, but no matter. We can use it to invert a squaring function:

```
sqrt :: [Bit] -> [Bit]
sqrt = inv (\x -> x * x)
```

And we can use *this* to get successive approximations to $\sqrt{2}$!

```
root2Approx= map (toDouble . mediant) (approximate (sqrt (abs (2 :/ 1))))
>>> mapM_ print root2Approx
1.0
2.0
1.5
1.3333333333333333
1.4
1.4285714285714286
1.4166666666666667
1.411764705882353
1.4137931034482758
1.4146341463414633
...
```

Using the Stern-Brocot tree to represent the rationals was formalised in Coq in Bertot (2003). The corresponding lazy operations are formalised in QArith. Its theory and implementation is described in Niqui (2007). Unfortunately, I found most of the algorithms impenetrably complex, so I can’t really judge how they compare to the ones I have here.

I mentioned that one of the reasons you might want lazy rational arithmetic is that it can help with certain proofs. While this is true, in general the two main reasons people reach for lazy arithmetic is efficiency and as a way to get to the real numbers.

From the perspective of efficiency, the Stern-Brocot tree is probably a bad idea. You may have noticed that the right branch of the tree contains all the whole numbers: this means that the whole part is encoded in unary. Beyond that, we generally have to convert to some fraction in order to do any calculation, which is massively expensive.

The problem is that bits in the same position in different numbers don’t necessarily correspond to the same quantities. In base 10, for instance, the numbers 561 and 1024 have values in the “ones” position of 1 and 4, respectively. We can work with those two values independent of the rest of the number, which can lead to quicker algorithms.

Looking at the Stern-Brocot encoding, the numbers $\frac{2}{3}$ and 3 are represented by `OI`

and `II`

, respectively. That second `I`

in each, despite being in the same position, corresponds to *different values*: $\frac{1}{3}$ in the first, and $\frac{3}{2}$ in the second.

Solutions to both of these problems necessitate losing the one-to-one property of the representation. We could improve the size of the representation of terms by having our $L$ and $R$ matrices be the following (Kůrka 2014):

$L = \left( \begin{matrix} 1 & 0 \\ 1 & 2 \end{matrix} \right) \; R = \left( \begin{matrix} 2 & 1 \\ 0 & 1 \end{matrix} \right)$

But now there will be gaps in the tree. This basically means we’ll have to use infinite repeating bits to represent terms like $\frac{1}{2}$.

We could solve the other problem by throwing out the Stern-Brocot tree entirely and using a more traditional positional number system. Again, this introduces redundancy: in order to represent some fraction which doesn’t divide properly into the base of the number system you have to use repeating decimals.

The second reason for lazy rational arithmetic is that it can be a crucial component in building a constructive interpretation of the real numbers. This in particular is an area of real excitement at the moment: HoTT has opened up some interesting avenues that weren’t possible before for constructing the reals (Bauer 2016).

In a future post, I might present a formalisation of these numbers in Agda. I also intend to look at the dyadic numbers.

Update 26/12/2019: thanks Anton Felix Lorenzen and Joseph C. Sible for spotting some mistakes in this post.

Bauer, Andrej. 2016. “The Real Numbers in Homotopy Type Theory.” Faro, Portugal. http://math.andrej.com/wp-content/uploads/2016/06/hott-reals-cca2016.pdf.

Bertot, Yves. 2003. “A Simple Canonical Representation of Rational Numbers.” *Electronic Notes in Theoretical Computer Science* 85 (7). Mathematics, Logic and Computation (Satellite Event of ICALP 2003) (September): 1–16. doi:10.1016/S1571-0661(04)80754-0. http://www.sciencedirect.com/science/article/pii/S1571066104807540.

Kůrka, Petr. 2014. “Exact Real Arithmetic for Interval Number Systems.” *Theoretical Computer Science* 542 (July): 32–43. doi:10.1016/j.tcs.2014.04.030. http://www.sciencedirect.com/science/article/pii/S0304397514003351.

Niqui, Milad. 2007. “Exact Arithmetic on the SternBrocot Tree.” *Journal of Discrete Algorithms* 5 (2). 2004 Symposium on String Processing and Information Retrieval (June): 356–379. doi:10.1016/j.jda.2005.03.007. http://www.sciencedirect.com/science/article/pii/S1570866706000311.

Tags: Agda

{-# OPTIONS --safe --without-K #-} module Post where open import Data.Fin using (Fin; suc; zero; _≟_) open import Data.Nat using (ℕ; suc; zero; _+_; compare; equal; greater; less) open import Data.Nat.Properties using (+-comm) open import Data.Bool using (not; T) open import Relation.Nullary using (yes; no; does; ¬_) open import Data.Product using (Σ; Σ-syntax; proj₁; proj₂; _,_) open import Data.Unit using (tt; ⊤) open import Function using (_∘_; id; _⟨_⟩_) open import Relation.Binary.PropositionalEquality using (subst; trans; cong; sym; _≡_; refl; _≢_) open import Data.Empty using (⊥-elim; ⊥) variable n m : ℕ

Here’s a puzzle: can you prove that `Fin`

is injective? That’s the type constructor, by the way, not the type itself. Here’s the type of the proof we want:

Goal : Set₁ Goal = ∀ {n m} → Fin n ≡ Fin m → n ≡ m

I’m going to present a proof of this lemma that has a couple interesting features. You should try it yourself before reading on, though: it’s difficult, but great practice for understanding Agda’s type system.

First off, I should say that it’s not really a “new” proof: it’s basically Andras Kovac’s proof, with one key change. That proof, as well as this one, goes `--without-K`

: because I actually use this proof in some work I’m doing in Cubical Agda at the moment, this was non optional. It does make things significantly harder, and disallows nice tricks like the ones used by effectfully.

The trick we’re going to use comes courtesy of James Wood. The central idea is the following type:

_≢ᶠ_ : Fin n → Fin n → Set x ≢ᶠ y = T (not (does (x ≟ y)))

This proof of inequality of `Fin`

s is different from the usual definition, which might be something like:

_≢ᶠ′_ : Fin n → Fin n → Set x ≢ᶠ′ y = x ≡ y → ⊥

Our definition is based on the decidable equality of two `Fin`

s. It also uses the standard library’s new `Dec`

type. Basically, we get better computation behaviour from our definition. It behaves as if it were defined like so:

_≢ᶠ″_ : Fin n → Fin n → Set zero ≢ᶠ″ zero = ⊥ zero ≢ᶠ″ suc y = ⊤ suc x ≢ᶠ″ zero = ⊤ suc x ≢ᶠ″ suc y = x ≢ᶠ″ y

The benefit of this, in contrast to `_≢ᶠ′_`

, is that each case becomes a definitional equality we don’t have to prove. Compare the two following proofs of congruence under `suc`

:

cong-suc″ : ∀ {x y : Fin n} → x ≢ᶠ″ y → suc x ≢ᶠ″ suc y cong-suc″ p = p cong-suc′ : ∀ {x y : Fin n} → x ≢ᶠ′ y → suc x ≢ᶠ′ suc y cong-suc′ {n = suc n} p q = p (cong fpred q) where fpred : Fin (suc (suc n)) → Fin (suc n) fpred (suc x) = x fpred zero = zero

First, we will describe an “injection” for functions from `Fin`

s to `Fin`

s.

_F↣_ : ℕ → ℕ → Set n F↣ m = Σ[ f ∈ (Fin n → Fin m) ] ∀ {x y} → x ≢ᶠ y → f x ≢ᶠ f y

We’re using the negated from of injectivity here, which is usually avoided in constructive settings. It actually works a little better for us here, though. Since we’re working in the domain of `Fin`

s, and since our proof is prop-valued, it’s almost like we’re working in classical logic.

Next, we have the workhorse of the proof, the `shrink`

lemma:

shift : (x y : Fin (suc n)) → x ≢ᶠ y → Fin n shift zero (suc y) x≢y = y shift {suc _} (suc x) zero x≢y = zero shift {suc _} (suc x) (suc y) x≢y = suc (shift x y x≢y) shift-inj : ∀ (x y z : Fin (suc n)) y≢x z≢x → y ≢ᶠ z → shift x y y≢x ≢ᶠ shift x z z≢x shift-inj zero (suc y) (suc z) y≢x z≢x neq = neq shift-inj {suc _} (suc x) zero (suc z) y≢x z≢x neq = tt shift-inj {suc _} (suc x) (suc y) zero y≢x z≢x neq = tt shift-inj {suc _} (suc x) (suc y) (suc z) y≢x z≢x neq = shift-inj x y z y≢x z≢x neq shrink : suc n F↣ suc m → n F↣ m shrink (f , inj) .proj₁ x = shift (f zero) (f (suc x)) (inj tt) shrink (f , inj) .proj₂ p = shift-inj (f zero) (f (suc _)) (f (suc _)) (inj tt) (inj tt) (inj p)

This will give us the inductive step for the overall proof. Notice the absence of any `cong`

s or the like: the computation behaviour of `≢ᶠ`

saves us on that particular front. Also we don’t have to use `⊥-elim`

at any point: again, because of the computation behaviour of `≢ᶠ`

, Agda knows that certain cases are unreachable, so we don’t even have to define them.

Next, we derive the proof that a `Fin`

cannot inject into a smaller `Fin`

.

¬plus-inj : ∀ n m → ¬ (suc (n + m) F↣ m) ¬plus-inj zero (suc m) inj = ¬plus-inj zero m (shrink inj) ¬plus-inj (suc n) m (f , inj) = ¬plus-inj n m (f ∘ suc , inj) ¬plus-inj zero zero (f , _) with f zero ... | ()

That’s actually the bulk of the proof done: the rest is Lego, joining up the pieces and types. First, we give the normal definition of injectivity:

Injective : ∀ {a b} {A : Set a} {B : Set b} → (A → B) → Set _ Injective f = ∀ {x y} → f x ≡ f y → x ≡ y _↣_ : ∀ {a b} → Set a → Set b → Set _ A ↣ B = Σ (A → B) Injective

Then we convert from one to the other:

toFin-inj : (Fin n ↣ Fin m) → n F↣ m toFin-inj f .proj₁ = f .proj₁ toFin-inj (f , inj) .proj₂ {x} {y} x≢ᶠy with x ≟ y | f x ≟ f y ... | no ¬p | yes p = ¬p (inj p) ... | no _ | no _ = tt

And finally we have our proof:

n≢sn+m : ∀ n m → Fin n ≢ Fin (suc (n + m)) n≢sn+m n m n≡m = ¬plus-inj m n (toFin-inj (subst (_↣ Fin n) (n≡m ⟨ trans ⟩ cong (Fin ∘ suc) (+-comm n m)) (id , id))) Fin-inj : Injective Fin Fin-inj {n} {m} n≡m with compare n m ... | equal _ = refl ... | less n k = ⊥-elim (n≢sn+m n k n≡m) ... | greater m k = ⊥-elim (n≢sn+m m k (sym n≡m)) _ : Goal _ = Fin-inj

All in all, the proof is about 36 lines, which is pretty short for what it does.

]]>
Part 1 of a 2-part series on Random Access Lists

Tags: Agda

“Heterogeneous Random-Access Lists” by Wouter Swierstra (2019) describes how to write a simple binary random-access list (Okasaki 1995) to use as a heterogeneous tuple. If you haven’t tried to implement the data structure described in the paper before, you might not realise the just how *elegant* the implementation is. The truth is that arriving at the definitions presented is difficult: behind every simple function is a litany of complex and ugly alternatives that had to be tried and discarded first before settling on the final answer.

In this post I want to go through a very similar structure, with special focus on the “wrong turns” in implementation which can lead to headache.

Here are a couple of important identities on ℕ:

+0 : ∀ n → n + zero ≡ n +0 zero = refl +0 (suc n) = cong suc (+0 n) +-suc : ∀ n m → n + suc m ≡ suc n + m +-suc zero m = refl +-suc (suc n) m = cong suc (+-suc n m)

These two show up all the time as proof obligations from the compiler (i.e. “couldn’t match type `n + suc m`

with `suc n + m`

”). The solution is obvious, right? `subst`

in one of the proofs above and you’re on your way. Wait! There might be a better way.

We’re going to look at reversing a vector as an example. We have a normal-looking length-indexed vector:

infixr 5 _∷_ data Vec (A : Set a) : ℕ → Set a where [] : Vec A zero _∷_ : A → Vec A n → Vec A (suc n)

Reversing a list is easy: we do it the standard way, in $\mathcal{O}(n)$ time, with an accumulator:

list-reverse : List A → List A list-reverse = go [] where go : List A → List A → List A go acc [] = acc go acc (x ∷ xs) = go (x ∷ acc) xs

Transferring over to a vector and we see our friends `+-suc`

and `+0`

.

vec-reverse₁ : Vec A n → Vec A n vec-reverse₁ xs = subst (Vec _) (+0 _) (go [] xs) where go : Vec A n → Vec A m → Vec A (m + n) go acc [] = acc go acc (x ∷ xs) = subst (Vec _) (+-suc _ _) (go (x ∷ acc) xs)

The solution, as with so many things, is to use a fold instead of explicit recursion. Folds on vectors are a little more aggressively typed than those on lists:

vec-foldr : (B : ℕ → Type b) → (∀ {n} → A → B n → B (suc n)) → B zero → Vec A n → B n vec-foldr B f b [] = b vec-foldr B f b (x ∷ xs) = f x (vec-foldr B f b xs)

We allow the output type to be indexed by the list of the vector. This is a good thing, bear in mind: we need that extra information to properly type `reverse`

.

For reverse, unfortunately, we need a *left*-leaning fold, which is a little trickier to implement than `vec-foldr`

.

vec-foldl : (B : ℕ → Set b) → (∀ {n} → B n → A → B (suc n)) → B zero → Vec A n → B n vec-foldl B f b [] = b vec-foldl B f b (x ∷ xs) = vec-foldl (B ∘ suc) f (f b x) xs

With this we can finally `reverse`

.

vec-reverse : Vec A n → Vec A n vec-reverse = vec-foldl (Vec _) (λ xs x → x ∷ xs) []

The real trick in this function is that the type of the return value changes as we fold. If you think about it, it’s the same optimisation that we make for the $\mathcal{O}(n)$ reverse on lists: the `B`

type above is the “difference list” in types, allowing us to append on to the end without $\mathcal{O}(n^2)$ proofs.

As an aside, this same trick can let us type the convolve-TABA (Danvy and Goldberg 2005; Foner 2016) function quite simply:

convolve : Vec A n → Vec B n → Vec (A × B) n convolve = vec-foldl (λ n → Vec _ n → Vec _ n) (λ { k x (y ∷ ys) → (x , y) ∷ k ys}) (λ _ → [])

Binary numbers come up a lot in dependently-typed programming languages: they offer an alternative representation of ℕ that’s tolerably efficient (well, depending on who’s doing the tolerating). In contrast to the Peano numbers, though, there are a huge number of ways to implement them.

I’m going to recommend one particular implementation over the others, but before I do I want to define a function on ℕ:

2* : ℕ → ℕ 2* zero = zero 2* (suc n) = suc (suc (2* n))

In all of the implementations of binary numbers we’ll need a function like this. It is absolutely crucial that it is defined in the way above: the other obvious definition (`2* n = n + n`

) is a nightmare for proofs.

Right, now on to some actual binary numbers. The obvious way (a list of bits) is insufficient, as it allows multiple representations of the same number (because of the trailing zeroes). Picking a more clever implementation is tricky, though. One way splits it into two types:

module OneTerminated where infixl 5 _0ᵇ _1ᵇ infixr 4 𝕓_ data 𝔹⁺ : Set where 1ᵇ : 𝔹⁺ _0ᵇ _1ᵇ : 𝔹⁺ → 𝔹⁺ data 𝔹 : Set where 𝕓0ᵇ : 𝔹 𝕓_ : 𝔹⁺ → 𝔹

𝔹⁺ is the strictly positive natural numbers (i.e. the naturals starting from 1). 𝔹 adds a zero to that set. This removes the possibility for trailing zeroes, thereby making this representation unique for every natural number.

⟦_⇓⟧⁺ : 𝔹⁺ → ℕ ⟦ 1ᵇ ⇓⟧⁺ = 1 ⟦ x 0ᵇ ⇓⟧⁺ = 2* ⟦ x ⇓⟧⁺ ⟦ x 1ᵇ ⇓⟧⁺ = suc (2* ⟦ x ⇓⟧⁺) ⟦_⇓⟧ : 𝔹 → ℕ ⟦ 𝕓0ᵇ ⇓⟧ = 0 ⟦ 𝕓 x ⇓⟧ = ⟦ x ⇓⟧⁺

The odd syntax lets us write binary numbers in the natural way:

_ : ⟦ 𝕓 1ᵇ 0ᵇ 1ᵇ ⇓⟧ ≡ 5 _ = refl _ : ⟦ 𝕓 1ᵇ 0ᵇ 0ᵇ 1ᵇ ⇓⟧ ≡ 9 _ = refl

I would actually recommend this representation for most use-cases, especially when you’re using binary numbers “as binary numbers”, rather than as an abstract type for faster computation.

Another clever representation is one I wrote about before: the “gapless” representation. This is far too much trouble for what it’s worth.

Finally, my favourite representation at the moment is *zeroless*. It has a unique representation for each number, just like the two above, but it is still a list of bits. The difference is that the bits here are 1 and 2, not 0 and 1. I like to reuse types in combination with pattern synonyms (rather than defining new types), as it can often make parallels between different functions clearer.

Bit : Set Bit = Bool pattern 1ᵇ = false pattern 2ᵇ = true 𝔹 : Set 𝔹 = List Bit

Functions like `inc`

are not difficult to implement:

inc : 𝔹 → 𝔹 inc [] = 1ᵇ ∷ [] inc (1ᵇ ∷ xs) = 2ᵇ ∷ xs inc (2ᵇ ∷ xs) = 1ᵇ ∷ inc xs

And evaluation:

_∷⇓_ : Bit → ℕ → ℕ 1ᵇ ∷⇓ xs = suc (2* xs) 2ᵇ ∷⇓ xs = suc (suc (2* xs)) ⟦_⇓⟧ : 𝔹 → ℕ ⟦_⇓⟧ = foldr _∷⇓_ zero

Since we’re working in Cubical Agda, we might as well go on and prove that 𝔹 is isomorphic to ℕ. I’ll include the proof here for completeness, but it’s not relevant to the rest of the post (although it is very short, as a consequence of the simple definitions).

⟦_⇑⟧ : ℕ → 𝔹 ⟦ zero ⇑⟧ = [] ⟦ suc n ⇑⟧ = inc ⟦ n ⇑⟧ 2*⇔1ᵇ∷ : ∀ n → inc ⟦ 2* n ⇑⟧ ≡ 1ᵇ ∷ ⟦ n ⇑⟧ 2*⇔1ᵇ∷ zero = refl 2*⇔1ᵇ∷ (suc n) = cong (inc ∘ inc) (2*⇔1ᵇ∷ n) 𝔹→ℕ→𝔹 : ∀ n → ⟦ ⟦ n ⇓⟧ ⇑⟧ ≡ n 𝔹→ℕ→𝔹 [] = refl 𝔹→ℕ→𝔹 (1ᵇ ∷ xs) = 2*⇔1ᵇ∷ ⟦ xs ⇓⟧ ; cong (1ᵇ ∷_) (𝔹→ℕ→𝔹 xs) 𝔹→ℕ→𝔹 (2ᵇ ∷ xs) = cong inc (2*⇔1ᵇ∷ ⟦ xs ⇓⟧) ; cong (2ᵇ ∷_) (𝔹→ℕ→𝔹 xs) inc⇔suc : ∀ n → ⟦ inc n ⇓⟧ ≡ suc ⟦ n ⇓⟧ inc⇔suc [] = refl inc⇔suc (1ᵇ ∷ xs) = refl inc⇔suc (2ᵇ ∷ xs) = cong (suc ∘ 2*) (inc⇔suc xs) ℕ→𝔹→ℕ : ∀ n → ⟦ ⟦ n ⇑⟧ ⇓⟧ ≡ n ℕ→𝔹→ℕ zero = refl ℕ→𝔹→ℕ (suc n) = inc⇔suc ⟦ n ⇑⟧ ; cong suc (ℕ→𝔹→ℕ n) 𝔹⇔ℕ : 𝔹 ⇔ ℕ 𝔹⇔ℕ = iso ⟦_⇓⟧ ⟦_⇑⟧ ℕ→𝔹→ℕ 𝔹→ℕ→𝔹

Now on to the data structure. Here’s its type.

infixr 5 _1∷_ _2∷_ data Array (T : ℕ → Type a) : 𝔹 → Type a where [] : Array T [] _∷_ : T (bool 0 1 d) → Array (T ∘ suc) ds → Array T (d ∷ ds) pattern _1∷_ x xs = _∷_ {d = 1ᵇ} x xs pattern _2∷_ x xs = _∷_ {d = 2ᵇ} x xs

So it is a list-like structure, which contains elements of type `T`

. `T`

is the type of trees in the array: making the array generic over the types of trees is a slight departure from the norm. Usually, we would just use a perfect tree or something:

module Prelim where Perfect : Set a → ℕ → Set a Perfect A zero = A Perfect A (suc n) = Perfect (A × A) n

By making the tree type a parameter, though, we actually *simplify* some of the code for manipulating the tree. It’s basically the same trick as the type-changing parameter in `vec-foldl`

.

As well as that, of course, we can use the array with more exotic tree types. With binomial trees, for example, we get a binomial heap:

mutual data BinomNode (A : Set a) : ℕ → Set a where binom-leaf : BinomNode A 0 binom-branch : Binomial A n → BinomNode A n → BinomNode A (suc n) Binomial : Set a → ℕ → Set a Binomial A n = A × BinomNode A n

But we’ll stick to the random-access lists for now.

The perfect trees above are actually a specific instance of a more general data type: exponentiations of functors.

_^_ : (Set a → Set a) → ℕ → Set a → Set a (F ^ zero ) A = A (F ^ suc n) A = (F ^ n) (F A) Nest : (Set a → Set a) → Set a → ℕ → Set a Nest F A n = (F ^ n) A Pair : Set a → Set a Pair A = A × A Perfect : Set a → ℕ → Set a Perfect = Nest Pair

It’s a nested datatype, built in a bottom-up way. This is in contrast to, say, the binomial trees above, which are top-down.

Our first function on the array is `cons`

, which inserts an element:

cons : (∀ n → T n → T n → T (suc n)) → T 0 → Array T ds → Array T (inc ds) cons branch x [] = x 1∷ [] cons branch x (y 1∷ ys) = branch 0 x y 2∷ ys cons branch x (y 2∷ ys) = x 1∷ cons (branch ∘ suc) y ys

Since we’re generic over the type of trees, we need to pass in the “branch” constructor (or function) for whatever tree type we end up using. Here’s how we’d implement such a branch function for perfect trees.

perf-branch : ∀ n → Perfect A n → Perfect A n → Perfect A (suc n) perf-branch zero = _,_ perf-branch (suc n) = perf-branch n

One issue here is that the `perf-branch`

function probably doesn’t optimise to the correct complexity, because the `n`

has to be scrutinised repeatedly. The alternative is to define a `cons`

for nested types, like so:

nest-cons : (∀ {A} → A → A → F A) → A → Array (Nest F A) ds → Array (Nest F A) (inc ds) nest-cons _∙_ x [] = x ∷ [] nest-cons _∙_ x (y 1∷ ys) = (x ∙ y) 2∷ ys nest-cons _∙_ x (y 2∷ ys) = x ∷ nest-cons _∙_ y ys perf-cons : A → Array (Perfect A) ds → Array (Perfect A) (inc ds) perf-cons = nest-cons _,_

Again, we’re going to keep things general, allowing multiple index types. For those index types we’ll need a type like `Fin`

but for binary numbers.

data Fin𝔹 (A : Set a) : 𝔹 → Type a where here₁ : Fin𝔹 A (1ᵇ ∷ ds) here₂ : (i : A) → Fin𝔹 A (2ᵇ ∷ ds) there : (i : A) → Fin𝔹 A ds → Fin𝔹 A (d ∷ ds) lookup : (∀ {n} → P → T (suc n) → T n) → Array T ds → Fin𝔹 P ds → T 0 lookup ind (x ∷ xs) here₁ = x lookup ind (x ∷ xs) (here₂ i) = ind i x lookup ind (x ∷ xs) (there i is) = ind i (lookup ind xs is) nest-lookup : (∀ {A} → P → F A → A) → Array (Nest F A) ds → Fin𝔹 P ds → A nest-lookup ind (x ∷ xs) here₁ = x nest-lookup ind (x ∷ xs) (here₂ i) = ind i x nest-lookup ind (x ∷ xs) (there i is) = ind i (nest-lookup ind xs is)

We’ll once more use perfect to show how these generic functions can be concretised. For the index types into a perfect tree, we will use a `Bool`

.

perf-lookup : Array (Perfect A) ds → Fin𝔹 Bool ds → A perf-lookup = nest-lookup (bool fst snd)

This next function is quite difficult to get right: a fold. We want to consume the binary array into a unary, cons-list type thing. Similarly to `foldl`

on vectors, we will need to change the return type as we fold, but we will *also* need to convert from binary to unary, *as we fold*. The key ingredient is the following function:

2^_*_ : ℕ → ℕ → ℕ 2^ zero * n = n 2^ suc m * n = 2* (2^ m * n)

It will let us do the type-change-as-you-go trick from `foldl`

, but in a binary setting. Here’s `foldr`

:

array-foldr : (B : ℕ → Type b) → (∀ n {m} → T n → B (2^ n * m) → B (2^ n * suc m)) → B 0 → Array T ds → B ⟦ ds ⇓⟧ array-foldr B c b [] = b array-foldr B c b (x 1∷ xs) = c 0 x (array-foldr (B ∘ 2*) (c ∘ suc) b xs) array-foldr B c b (x 2∷ xs) = c 1 x (array-foldr (B ∘ 2*) (c ∘ suc) b xs)

And, as you should expect, here’s how to use this in combination with the perfect trees. Here we’ll build a binary random access list from a vector, and convert back to a vector.

perf-foldr : (B : ℕ → Type b) → (∀ {n} → A → B n → B (suc n)) → ∀ n {m} → Perfect A n → B (2^ n * m) → B (2^ n * suc m) perf-foldr B f zero = f perf-foldr B f (suc n) = perf-foldr (B ∘ 2*) (λ { (x , y) zs → f x (f y zs) }) n toVec : Array (Perfect A) ds → Vec A ⟦ ds ⇓⟧ toVec = array-foldr (Vec _) (perf-foldr (Vec _) _∷_) [] fromVec : Vec A n → Array (Perfect A) ⟦ n ⇑⟧ fromVec = vec-foldr (Array (Perfect _) ∘ ⟦_⇑⟧) perf-cons []

That’s the end of the “simple” stuff! The binary random-access list I’ve presented above is about as simple as I can get it.

In this section, I want to look at some more complex (and more fun) things you can do with it. First: lenses.

Lenses aren’t super ergonomic in dependently-typed languages, but they do come with some advantages. The lens laws are quite strong, for instance, meaning that often by constructing programs using a lot of lenses gives us certain properties “for free”. Here, for instance, we can define the lenses for indexing.

open import Lenses

head : Lens (Array T (d ∷ ds)) (T (bool 0 1 d)) head .into (x ∷ _ ) .get = x head .into (_ ∷ xs) .set x = x ∷ xs head .get-set (_ ∷ _) _ = refl head .set-get (_ ∷ _) = refl head .set-set (_ ∷ _) _ _ = refl tail : Lens (Array T (d ∷ ds)) (Array (T ∘ suc) ds) tail .into (_ ∷ xs) .get = xs tail .into (x ∷ _ ) .set xs = x ∷ xs tail .get-set (_ ∷ _) _ = refl tail .set-get (_ ∷ _) = refl tail .set-set (_ ∷ _) _ _ = refl

nest-lens : (∀ {A} → P → Lens (F A) A) → Fin𝔹 P ds → Lens (Array (Nest F A) ds) A nest-lens ln here₁ = head nest-lens ln (here₂ i) = head ⋯ ln i nest-lens ln (there i is) = tail ⋯ nest-lens ln is ⋯ ln i

ind-lens : (∀ {n} → P → Lens (T (suc n)) (T n)) → Fin𝔹 P ds → Lens (Array T ds) (T 0) ind-lens ln here₁ = head ind-lens ln (here₂ i) = head ⋯ ln i ind-lens ln (there i is) = tail ⋯ ind-lens ln is ⋯ ln i

Finally, to demonstrate some of the versatility of this data structure, we’re going to implement a tree based on a *Fenwick* tree. This is a data structure for prefix sums: we can query the running total at any point, and *update* the value at a given point, in $\mathcal{O}(\log n)$ time. We’re going to make it generic over a monoid:

module _ {ℓ} (mon : Monoid ℓ) where open Monoid mon record Leaf : Set ℓ where constructor leaf field val : 𝑆 open Leaf mutual SumNode : ℕ → Set ℓ SumNode zero = Leaf SumNode (suc n) = Summary n × Summary n Summary : ℕ → Set ℓ Summary n = Σ 𝑆 (fiber (cmb n)) cmb : ∀ n → SumNode n → 𝑆 cmb zero = val cmb (suc _) (x , y) = fst x ∙ fst y Fenwick : 𝔹 → Set ℓ Fenwick = Array Summary

So it’s an array of perfect trees, with each branch in the tree containing a summary of its children. Constructing a tree is straightforward:

comb : ∀ n → Summary n → Summary n → Summary (suc n) comb n xs ys = _ , (xs , ys) , refl sing : 𝑆 → Summary 0 sing x = _ , leaf x , refl fFromVec : Vec 𝑆 n → Fenwick ⟦ n ⇑⟧ fFromVec = vec-foldr (Fenwick ∘ ⟦_⇑⟧) (cons comb ∘ sing) []

Updating a particular point involves a good bit of boilerplate, but isn’t too complex.

upd-lens : Bool → Lens (Summary (suc n)) (Summary n) upd-lens b .into (_ , xs , _) .get = ⦅pair⦆ b .into xs .get upd-lens b .into (_ , xs , _) .set x = _ , ⦅pair⦆ b .into xs .set x , refl upd-lens b .get-set _ = ⦅pair⦆ b .get-set _ upd-lens false .set-get (t , xs , p) i .fst = p i upd-lens false .set-get (t , xs , p) i .snd .fst = xs upd-lens false .set-get (t , xs , p) i .snd .snd j = p (i ∧ j) upd-lens true .set-get (t , xs , p) i .fst = p i upd-lens true .set-get (t , xs , p) i .snd .fst = xs upd-lens true .set-get (t , xs , p) i .snd .snd j = p (i ∧ j) upd-lens false .set-set _ _ _ = refl upd-lens true .set-set _ _ _ = refl top : Lens (Summary 0) 𝑆 top .into x .get = x .snd .fst .val top .into x .set y .fst = y top .into x .set y .snd .fst .val = y top .into x .set y .snd .snd = refl top .get-set _ _ = refl top .set-get (x , y , p) i .fst = p i top .set-get (x , y , p) i .snd .fst = y top .set-get (x , y , p) i .snd .snd j = p (i ∧ j) top .set-set _ _ _ = refl

update : Fin𝔹 Bool ds → Lens (Fenwick ds) 𝑆 update is = ind-lens upd-lens is ⋯ top

Finally, here’s how we get the summary up to a particular point in $\mathcal{O}(\log n)$ time:

running : (∀ n → Bool → T (suc n) → 𝑆 × T n) → (∀ n → T n → 𝑆) → Array T ds → Fin𝔹 Bool ds → 𝑆 × T 0 running l s (x ∷ xs) (there i is) = let y , ys = running (l ∘ suc) (s ∘ suc) xs is z , zs = l _ i ys in s _ x ∙ y ∙ z , zs running l s (x 1∷ xs) here₁ = ε , x running l s (x 2∷ xs) (here₂ i) = l _ i x prefix : Fenwick ds → Fin𝔹 Bool ds → 𝑆 prefix xs is = let ys , zs , _ = running ind (λ _ → fst) xs is in ys ∙ zs where ind : ∀ n → Bool → Summary (suc n) → 𝑆 × Summary n ind n false (_ , (xs , _) , _) = ε , xs ind n true (_ , ((x , _) , (y , ys)) , _) = x , (y , ys)

Danvy, Olivier, and Mayer Goldberg. 2005. “There and Back Again.” *BRICS Report Series* 12 (3). doi:10.7146/brics.v12i3.21869.

Foner, Kenneth. 2016. “’There and Back Again’ and What Happened After.” New York.

Okasaki, Chris. 1995. “Purely Functional Random-Access Lists.” In *Proceedings of the Seventh International Conference on Functional Programming Languages and Computer Architecture*, 86–95. FPCA ’95. New York, NY, USA: ACM. doi:10.1145/224164.224187.

Swierstra, Wouter. 2019. “Heterogeneous Random-Access Lists.”

Tags: Haskell

Update 5/10/2019: check the bottom of this post for some links to comments and discussion.

Beginners to Haskell are often confused as to what’s so great about the language. Much of the proselytizing online focuses on pretty abstract (and often poorly defined) concepts like “purity”, “strong types”, and (god forbid) “monads”. These things are difficult to understand, somewhat controversial, and not obviously beneficial (especially when you’ve only been using the language for a short amount of time).

The real tragedy is that Haskell (and other ML-family languages) are *packed* with simple, decades-old features like pattern matching and algebraic data types which have massive, clear benefits and few (if any) downsides. Some of these ideas are finally filtering in to mainstream languages (like Swift and Rust) where they’re used to great effect, but the vast majority of programmers out there haven’t yet been exposed to them.

This post aims to demonstrate some of these features in a simple (but hopefully not too simple) example. I’m going to write and package up a simple sorting algorithm in both Haskell and Python, and compare the code in each. I’m choosing Python because I like it and beginners like it, but also because it’s missing most of the features I’ll be demonstrating. It’s important to note I’m not comparing Haskell and Python as languages: the Python code is just there as a reference for people less familiar with Haskell. What’s more, the comparison is unfair, as the example deliberately plays to Haskell’s strengths (so I can show off the features I’m interested in): it wouldn’t be difficult to pick an example that makes Python look good and Haskell look poor.

This post is not meant to say “Haskell is great, and your language sucks”! It’s not even really about Haskell: much of what I’m talking about here applies equally well to Ocaml, Rust, etc. I’m really writing this as a response to the notion that functional features are somehow experimental, overly complex, or ultimately compromised. As a result of that idea, I feel like these features are left out of a lot of modern languages which would benefit from them. There exists a small set of simple, battle-tested PL ideas, which have been used for nearly forty years now: this post aims to demonstrate them, and argue for their inclusion in every general-purpose programming language that’s being designed today.

We’ll be using a skew heap to sort lists in both languages. The basic idea is to repeatedly insert stuff into the heap, and then repeatedly “pop” the smallest element from the heap until it’s empty. It’s not in-place, but it is $\mathcal{O}(n \log n)$, and actually performs pretty well in practice.

A Skew Heap is represented by a *binary tree*:

```
data Tree a
= Leaf
| Node a (Tree a) (Tree a)
```

```
class Tree:
def __init__(self, is_node, data, lchild, rchild):
self._is_node = is_node
self._data = data
self._lchild = lchild
self._rchild = rchild
def leaf():
return Tree(False, None, None, None)
def node(data, lchild, rchild):
return Tree(True, data, lchild, rchild)
```

I want to point out the precision of the Haskell definition: a tree is either a leaf (an empty tree), or a node, with a payload and two children. There are no special cases, and it took us one line to write (spread to 3 here for legibility on smaller screens).

In Python, we have to write a few more lines^{1}. This representation uses the `_is_node`

field is `False`

for an empty tree (a leaf). If it’s `True`

, the other fields are filled. We write some helper functions to give us constructors like the leaf and node ones for the Haskell example.

This isn’t the standard definition of a binary tree in Python, in fact it might looks a little weird to most Python people. Let’s run through some alternatives and their issues.

The standard definition:

`class Tree: def __init__(self, data, lchild, rchild): self._data = data self._lchild = lchild self._rchild = rchild`

Instead of having a separate field for “is this a leaf or a node”, the empty tree is simply

`None`

:`def leaf(): return None`

With this approach, if we define any

*methods*on a tree, they won’t work on the empty tree!`>>> leaf().size() AttributeError: 'NoneType' object has no attribute 'size'`

We’ll do inheritance! Python even has a handy

`abc`

library to help us with some of this:`from abc import ABC, abstractmethod class Tree(ABC): @abstractmethod def size(self): raise NotImplemented class Leaf(Tree): def __init__(self): pass def size(self): return 0 class Node(Tree): def __init__(self, data, lchild, rchild): self._data = data self._lchild = lchild self._rchild = rchild def size(self): return 1 + self._lchild.size() + self._rchild.size()`

Methods will now work on an empty tree, but we’re faced with 2 problems: first, this is very verbose, and pretty complex. Secondly, we can’t write a mutating method which changes a tree from a leaf to a node. In other words, we can’t write an

`insert`

method!We won’t represent a leaf as the whole

*tree*being`None`

, just the data!`def leaf(): return Tree(None, None, None)`

This (surprisingly) pops up in a few places. While it solves the problem of methods, and the mutation problem, it has a serious bug. We can’t have

`None`

as an element in the tree! In other words, if we ask our eventual algorithm to sort a list which contains`None`

, it will silently discard some of the list, returning the wrong answer.

There are yet more options (using a wrapper class), none of them ideal. Another thing to point out is that, even with our definition with a tag, we can only represent types with 2 possible states. If there was another type of node in the tree, we couldn’t simply use a boolean tag: we’d have to switch to integers (and remember the meaning of each integer), or strings! Yuck!

What Python is fundamentally missing here is *algebraic data types*. This is a way of building up all of your types out of products (“my type has this *and* this”) and sums (“my type is this *or* this”). Python can do products perfectly well: that’s what classes are. The tree class itself is the product of `Bool`

, `data`

, `Tree`

, and `Tree`

. However it’s missing an *entire half of the equation*! This is why you just can’t express binary trees as cleanly as you can in Swift, Haskell, OCaml, etc. Python, as well as a host of other languages like Go, Java, etc, will let you express *one* kind of “sum” type: “or nothing” (the null pointer). However, it’s clunky and poorly handled in all of those languages (the method problems above demonstrate the issues in Python), and doesn’t work for anything other than that one special case.

Again, there’s nothing about algebraic data types that makes them ill-suited to mainstream or imperative languages. Swift uses them, and people love them!

The core operation on skew heaps is the *skew merge*.

```
merge :: Ord a => Tree a -> Tree a -> Tree a
Leaf ys = ys
merge Leaf = xs
merge xs @(Node x xl xr) ys@(Node y yl yr)
merge xs| x <= y = Node x (merge ys xr) xl
| otherwise = Node y (merge xs yr) yl
```

```
def merge(lhs, rhs):
if not lhs._is_node:
return rhs
if not rhs._is_node:
return lhs
if lhs._data <= rhs._data:
return Tree(lhs._data,
merge(rhs, lhs._rchild),
lhs._lchild)else:
return Tree(rhs._data,
merge(lhs, rhs._rchild), rhs._lchild)
```

The standout feature here is pattern matching. In Haskell, we’re able to write the function as we might describe it: “in this case, I’ll do this, in this other case, I’ll do this, etc.”. In Python, we are forced to think of the truth tables and sequential testing. What do I mean by truth tables? Consider the following version of the Python function above:

```
def merge(lhs, rhs):
if lhs._is_node:
if rhs._is_node:
if lhs._data <= rhs._data:
return Tree(lhs._data,
merge(rhs, lhs._rchild),
lhs._lchild)else:
return Tree(rhs._data,
merge(lhs, rhs._rchild),
rhs._lchild)else:
return lhs
else:
return rhs
```

You may even write this version first: it initially seems more natural (because `_is_node`

is used in the positive). Here’s the question, though: does it do the same thing as the previous version? Are you *sure*? Which else is connected to which if? Does every if have an else? (some linters will suggest you *remove* some of the elses above, since the if-clause has a `return`

statement in it!)

The fact of the matter is that we are forced to do truth tables of every condition in our minds, rather than *saying what we mean* (as we do in the Haskell version).

The other thing we’re saved from in the Haskell version is accessing undefined fields. In the Python function, we know accessing `lhs._data`

is correct since we verified that `lhs`

is a node. But the logic to do this verification is complex: we checked if it *wasn’t* a node, and returned if that was true… so if it *is true* that `lhs`

*isn’t* a node, we would have returned, but we didn’t, so…

Bear in mind all of these logic checks happened four lines before the actual access: this can get much uglier in practice! Compare this to the Haskell version: *we only get to bind variables if we’re sure they exist*. The syntax itself prevents us from accessing fields which aren’t defined, in a simple way.

Pattern matching has existed for years in many different forms: even C has switch statements. The added feature of destructuring is available in languages like Swift, Rust, and the whole ML family. Ask for it in your language today!

Now that we have that function, we get to define others in terms of it:

```
insert :: Ord a => a -> Tree a -> Tree a
= merge (Node x Leaf Leaf) insert x
```

```
def insert(element, tree):
= merge(
tree.__dict__
node(element, leaf(), leaf()),
tree ).__dict__.copy()
```

I haven’t mentioned Haskell’s type system so far, as it’s been quite unobtrusive in the examples. And that’s kind of the point: despite more complex examples you’ll see online demonstrating the power of type classes and higher-kinded types, Haskell’s type system *excels* in these simpler cases.

`merge :: Ord a => Tree a -> Tree a -> Tree a`

Without much ceremony, this signature tells us:

- The function takes two trees, and returns a third.
- Both trees have to be filled with the same types of elements.
- Those elements must have an order defined on them.

I feel a lot of people miss the point of this particular feature. Technically speaking, this feature allows us to write fewer type signatures, as Haskell will be able to guess most of them. Coming from something like Java, you might think that that’s an opportunity to shorten up some verbose code. It’s not! You’ll rarely find a Haskell program these days missing top-level type signatures: it’s easier to read a program with explicit type signatures, so people are advised to put them as much as possible.

(Amusingly, I often find older Haskell code snippets which are entirely devoid of type signatures. It seems that programmers were so excited about Hindley-Milner type inference that they would put it to the test as often as they could.)

Type inference in Haskell is actually useful in a different way. First, if I write the *implementation* of the `merge`

function, the compiler will tell *me* the signature, which is extremely helpful for more complex examples. Take the following, for instance:

`= ((x * 2) ^ 3) / 4 f x `

Remembering precisely which numeric type `x`

needs to be is a little difficult (`Floating`

? `Real`

? `Fractional`

?), but if I just ask the compiler it will tell me without difficulty.

The second use is kind of the opposite: if I have a hole in my program where I need to fill in some code, Haskell can help me along by telling me the *type* of that hole automatically. This is often enough information to figure out the entire implementation! In fact, there are some programs which will use this capability of the type checker to fill in the hole with valid programs, synthesising your code for you.

So often strong type systems can make you feel like you’re fighting more and more against the compiler. I hope these couple examples show that it doesn’t have to be that way.

The next function is “pop-min”:

```
popMin :: Ord a => Tree a -> Maybe (a, Tree a)
Leaf = Nothing
popMin Node x xl xr) = Just (x, merge xl xr) popMin (
```

```
def popMin(tree):
if tree._is_node:
= tree._data
res = merge(
tree.__dict__
tree._lchild,
tree._rchild
).__dict__.copy()return res
else:
raise IndexError
```

At first glance, this function should be right at home in Python. It *mutates* its input, and it has an error case. The code we’ve written here for Python is pretty idiomatic, also: other than the ugly deep copy, we’re basically just mutating the object, and using an exception for the exceptional state (when the tree is empty). Even the exception we use is the same exception as when you try and `pop()`

from an empty list.

The Haskell code here mainly demonstrates a difference in API style you’ll see between the two languages. If something isn’t found, we just use `Maybe`

. And instead of mutating the original variable, we return the new state in the second part of a tuple. What’s nice about this is that we’re only using simple core features like algebraic data types to emulate pretty complex features like exceptions in Python.

You may have heard that “Haskell uses monads to do mutation and exceptions”. This is not true. Yes, state and exceptions have patterns which technically speaking are “monadic”. But make no mistake: when we want to model “exceptions” in Haskell, we really just return a maybe (or an either). And when we want to do “mutation”, we return a tuple, where the second element is the updated state. You don’t have to understand monads to use them, and you certainly don’t “need” monads to do them. To drive the point home, the above code could actually equivalently have a type which mentions “the state monad” and “the maybe monad”:

`popMin :: Ord a => StateT (Tree a) Maybe a`

But there’s no need to!

The main part of our task is now done: all that is left is to glue the various bits and pieces together. Remember, the overall algorithm builds up the heap from a list, and then tears it down using `popMin`

. First, then, to build up the heap.

```
listToHeap :: Ord a => [a] -> Tree a
= foldr insert Leaf listToHeap
```

```
def listToHeap(elements):
= leaf()
res for el in elements:
insert(el, res)return res
```

To my eye, the Haskell code here is significantly more “readable” than the Python. I know that’s a very subjective judgement, but `foldr`

is a function so often used that it’s immediately clear what’s happening in this example.

Why didn’t we use a similar function in Python, then? We actually could have: python does have an equivalent to `foldr`

, called `reduce`

(it’s been relegated to functools since Python 3 (also technically it’s equivalent to `foldl`

, not `foldr`

)). We’re encouraged *not* to use it, though: the more pythonic code uses a for loop. Also, it wouldn’t work for our use case: the `insert`

function we wrote is *mutating*, which doesn’t gel well with `reduce`

.

I think this demonstrates another benefit of simple, functional APIs. If you keep things simple, and build things out of functions, they’ll tend to glue together *well*, without having to write any glue code yourself. The for loop, in my opinion, is “glue code”. The next function, `heapToList`

, illustrates this even more so:

```
heapToList :: Ord a => Tree a -> [a]
= unfoldr popMin heapToList
```

```
def heapToList(tree):
= []
res try:
while True:
res.append(popMin(tree))except IndexError:
return res
```

Again, things are kept simple in the Haskell example. We’ve stuck to data types and functions, and these data types and functions mesh well with each other. You might be aware that there’s some deep and interesting mathematics behind the `foldr`

and `unfoldr`

functions going on, and how they relate. We don’t need to know any of that here, though: they just work together well.

Again, Python does have a function which is equivalent to `unfoldr`

: `iter`

has an overload which will repeatedly call a function until it hits a sentinel value. But this doesn’t fit with the rest of the iterator model! Most iterators are terminated with the `StopIteration`

exception; ours (like the `pop`

function on lists) is terminated by the `IndexError`

exception; and this function excepts a third version, terminated by a sentinel!

Finally, let’s write `sort`

:

```
sort :: Ord a => [a] -> [a]
sort = heapToList . listToHeap
```

```
def sort(elements):
return heapToList(listToHeap(elements))
```

This is just driving home the point: programs work *well* when they’re built out of functions, and you *want* your language to encourage you to build things out of functions. In this case, the `sort`

function is built out of two smaller ones: it’s the *essence* of function composition.

So I fully admit that laziness is one of the features of Haskell that does have downsides. I don’t think every language should be lazy, but I did want to say a little about it in regards to the sorting example here.

I tend to think that people overstate how hard it makes reasoning about space: it actually follows pretty straightforward rules, which you can generally step through in yourself (compared to, for instance, rewrite rules, which are often black magic!)

In modern programming, people will tend to use laziness it anyway. Python is a great example: the itertools library is almost entirely lazy. Actually making use of the laziness, though, is difficult and error-prone. Above, for instance, the `heapToList`

function is lazy in Haskell, but strict in Python. Converting it to a lazy version is not the most difficult thing in the world:

```
def heapToList(tree):
try:
while True:
yield popMin(tree)
except IndexError:
pass
```

But now, suddenly, the entire list API won’t work. What’s more, if we try and access the *first* element of the returned value, we mutate the whole thing: anyone else looking at the output of the generator will have it mutated out from under them!

Laziness fundamentally makes this more reusable. Take our `popMin`

function: if we just want to view the smallest element, without reconstructing the rest of the tree, we can actually use `popMin`

as-is. If we don’t use the second element of the tuple we don’t pay for it. In Python, we need to write a second function.

Testing the `sort`

function in Haskell is ridiculously easy. Say we have an example sorting function that we trust, maybe a slow but obvious insertion sort, and we want to make sure that our fast heap sort here does the same thing. This is the test:

`-> sort (xs :: [Int]) === insertionSort xs) quickCheck (\xs `

In that single line, the QuickCheck library will automatically generate random input, run each sort function on it, and compare the two outputs, giving a rich diff if they don’t match.

This post was meant to show a few features like pattern-matching, algebraic data types, and function-based APIs in a good light. These ideas aren’t revolutionary any more, and plenty of languages have them, but unfortunately several languages don’t. Hopefully the example here illustrates a little why these features are good, and pushes back against the idea that algebraic data types are too complex for mainstream languages.

This got posted to /r/haskell and hackernews. You can find me arguing in the comments there a little bit: I’m `oisdk`

on hackernews and u/foBrowsing on reddit.

There are two topics that came up a bunch that I’d like to add to this post. First I’ll just quote one of the comments from Beltiras:

Friend of mine is always trying to convert me. Asked me to read this yesterday evening. This is my take on the article:

Most of my daily job goes into gluing services (API endpoints to databases or other services, some business logic in the middle). I don’t need to see yet another exposition of how to do algorithmic tasks. Haven’t seen one of those since doing my BSc. Show me the tools available to write a daemon, an http server, API endpoints, ORM-type things and you will have provided me with tools to tackle what I do. I’ll never write a binary tree or search or a linked list at work.

If you want to convince me, show me what I need to know to do what I do.

and my response:

I wasn’t really trying to convince anyone to use Haskell at their day job: I am just a college student, after all, so I would have no idea what I was talking about!

I wrote the article a while ago after being frustrated using a bunch of Go and Python at an internship. Often I really wanted simple algebraic data types and pattern-matching, but when I looked up why Go didn’t have them I saw a lot of justifications that amounted to “functional features are too complex and we’re making a simple language. Haskell is notoriously complex”. In my opinion, the

`res, err := fun(); if err != nil`

(for example) pattern was much more complex than the alternative with pattern-matching. So I wanted to write an article demonstrating that, while Haskell has a lot of out-there stuff in it, there’s a bunch of simple ideas which really shouldn’t be missing from any modern general-purpose language.

As to why I used a binary tree as the example, I thought it was pretty self-contained, and I find skew heaps quite interesting.

The second topic was basically people having a go at my ugly Python; to which I say: fair enough! It is not my best. I wasn’t trying necessarily to write the best Python I could here, though, rather I was trying to write the “normal” implementation of a binary tree. If I was to implement a binary tree of some sort myself, though, I would certainly write it in an immutable style rather than the style here. Bear in mind as well that much of what I’m arguing for is stylistic: I think (for instance) that it would be better to use `reduce`

in Python more, and I think the move away from it is a bad thing. So of course I’m not going to use reduce when I’m showing the Python version: I’m doing a compare and contrast!

Yes, I know about the new dataclasses feature. However, it’s wrapped up with the (also new) type hints module, and as such is much more complicated to use. As the purpose of the Python code here is to provide something of a lingua franca for non-Haskellers, I decided against using it. That said, the problems outlined are

*not*solved by dataclasses.↩︎

Tags: Agda

I recently finished my undergrad degree in UCC. I’m putting my final-year project up here for reference purposes.

Here is the pdf.

And here’s a bibtext entry:

```
@thesis{kidney_automatically_2019,
address = {Cork, Ireland},
type = {Bachelor thesis},
title = {Automatically and {Efficiently} {Illustrating} {Polynomial} {Equalities} in {Agda}},
url = {https://doisinkidney.com/pdfs/bsc-thesis.pdf},
abstract = {We present a new library which automates the construction of equivalence proofs between polynomials over commutative rings and semirings in the programming language Agda [20]. It is signi cantly faster than Agda’s existing solver. We use re ection to provide a sim- ple interface to the solver, and demonstrate how to use the constructed proofs to provide step-by-step solutions.},
language = {en},
school = {University College Cork},
author = {Kidney, Donnacha Oisín},
month = apr,
year = {2019}
}
```

Tags: Python

This post is a write-up of a solution to part of a programming puzzle I did yesterday. It’s a little different than the usual “solution + theory” approach, though: I’m going to talk about the actual steps you’d need to take to get to the solution (i.e. what to google, what intermediate code looks like, etc.). Often write ups like this are presented as finished artefacts, with little info on the tricks or techniques the author used to jog their intuition into figuring out the puzzle (or where some intermediate step requires a leap of insight). In actual fact, this particular puzzle requires almost no insight *at all*: I’m going to show how to get to a working solution without understanding any of the theory behind it!

Spoilers ahead for the google foobar problem “Distract the Guards”.

We’re interested in a particular type of sequences of pairs of numbers. These sequences are generated from a starting pair $n$ and $m$ like so:

If $n$ and $m$ are equal, the sequence stops.

Otherwise, the smaller number is subtracted from the larger, and then the smaller is doubled, and the sequence continues with these two numbers.

Here’s an example starting with 3 and 5:

```
3, 5
6, 2
4, 4
---- done ----
```

Once it hits `4, 4`

, the first condition is met, and the sequence stops. Not all of these sequences stop, however:

```
1, 4
2, 3
1, 4
---- done ----
```

As you can see, in this case we loop back around to `1, 4`

: our task is to figure out, given a pair of numbers, whether the sequence generated by them loops forever, or stops at some point.

This step is crucial: before trying to figure out any of the deep mathematics behind the problem, write the dumbest thing that could work. You’re going to need it, anyway, to test your faster versions against, and besides, it might be good enough as-is!

```
def sequence(n,m):
while n != m:
yield (n,m)
if n < m:
-= n
m *= 2
n else:
-= m
n *= 2
m
def loops(xs):
= set()
seen for x in xs:
if x in seen:
return True
else:
seen.add(x)return False
def solution(n,m):
return loops(sequence(n,m))
```

The first function actually generates the sequence we’re interested in: it uses python’s generators to do so. The second function is just a generic function that checks a sequence for duplicates. Finally, the last function answers the question we’re interested in.

Next, we want to try and spot some patterns in the answers the function generates. Remember, we’re not really interested in figuring out the theory at this point: if we find out that a loop only happens when both numbers are even (for instance), that’s good enough for us and we can stop there!

We humans are pattern-matching machines: to leverage our abilities, though, we will need to visualise the data somehow. In this case, I’m going to plot a simple scatter graph to the terminal, using the following code (I apologise for my terrible indentation style):

```
print(
'\n'.join(
''.join(
'*' if solution(x,y) else ' '
for x in range(1,81)
)for y in range(100,0,-1)
) )
```

And we get the following output:

```
*************************** ******************************* ********************
**************************** *** *********** ******************************* ***
************* *************** **************************************************
****************************** *************************************************
******************************* ************************************************
******************************** *********************** ******* ***************
********************************* **********************************************
** *************************** *** *********************************************
*********************************** ********************************************
************ ******* *************** *******************************************
***** *********************** ******* *************** *************** **********
************************************** *****************************************
*************************************** ****************************************
******** ******************* *********** ***************************************
***************************************** **************************************
****************************************** ******* *********************** *****
*********** *************** *************** ************************************
******************************************** ***********************************
********************************************* **********************************
************** *********** ******************* *************** *****************
*********************************************** *******************************
************************************************ ***************************** *
***************** ******* *********************** *************************** **
********** *********************** *************** ************************* ***
*************************************************** *********************** ****
**** *************** *** ******************* ******* ********************* *****
***************************************************** ******************* ******
****************************************************** ***************** *******
*********************** ******************************* *************** ********
******************************************************** ************* *********
********* ******************************* *************** *********** **********
********************** *** ******************************* ********* ***********
*********************************************************** ******* ************
************************************************************ ***** *************
********************* ******* ******************************* *** **************
************** *********************** *********************** * ***************
*************************************************************** ****************
******* *********** *********** *************** ************* * ***************
* *********************************************************** *** **************
** ********************************************************* ***** *************
*** *************** *************** *********************** ******* ************
**** ***************************************************** ********* ***********
***** *************************************************** *********** **********
****** *********** ******************* ***************** ************* *********
******* *********************************************** *************** ********
******** *************** ******* ********************* ***************** *******
********* ******* *********************** *********** ******************* ******
********** ***************************************** ********************* *****
*********** *************************************** *********************** ****
************ *** *************************** ***** ************************* ***
************* *************** ******************* *************************** **
****** ******* ********************************* ************* *************** *
*************** ******************************* *******************************
**************** ***************************** *********************************
***************** *************************** **********************************
** *********** *** ******* ******* ********* ***** *********************** *****
******************* *********************** ************************************
******************** ********************* *************************************
***** ******* ******* ******************* *********** *************** **********
********************** ***************** ***************************************
*********************** *************** ****************************************
******** *** *********** ************* ***************** ******* ***************
************************* *********** ******************************************
************************** ********* *******************************************
*********** *************** ******* *********************** ********************
**** *************** ******* ***** ********* ******************************* ***
***************************** *** **********************************************
********** *** *************** * ********************* ******* *****************
******************************* ************************************************
***************************** * ***********************************************
* ******* ******* *********** *** *************** *************** **************
** ************************* ***** *********************************************
*** *********************** ******* ********************************************
**** *** *********** ***** ********* ******* *********************** ***********
***** ******************* *********** *************************************** **
****** ******* ********* ************* *************** ******************* *****
******* *************** *************** ******************************* ********
******** ************* ***************** *************************** ***********
********* *********** ******************* *********************** **************
** *** *** ********* ***** ******* ******* ******************* *********** *****
*********** ******* *********************** *************** ********************
************ ***** ************************* *********** ***********************
***** ******* *** *********** *************** ******* *********************** **
************** * ***************************** *** *****************************
*************** ******************************* ********************************
*** *** ***** * ******* ******* *********** *** *************** ***************
* *********** *** *********************** ******* ******************************
** ********* ***** ******************* *********** *****************************
*** ******* ******* *************** *************** ****************************
**** ***** ********* *********** ******************* *********************** ***
***** *** *********** ******* *********************** *************** **********
** *** * ***** ******* *** *********** *************** ******* *****************
******* *************** ******************************* ************************
***** * *********** *** *********************** ******* ***********************
* *** *** ******* ******* *************** *************** **********************
** * ***** *** *********** ******* *********************** *************** *****
*** ******* *************** ******************************* ********************
* * *** *** ******* ******* *************** *************** *******************
* *** ******* *************** ******************************* ******************
* *** ******* *************** ******************************* *****************
```

There’s a clear pattern there, but it might be easier to see if we inverted it, plotting those things which *don’t* loop:

```
* *
* * * *
* *
*
*
* * *
*
* * *
*
* * *
* * * * *
*
*
* * *
*
* * *
* * *
*
*
* * * *
* *
* *
* * * *
* * * *
* *
* * * * * *
* *
* *
* * *
* *
* * * *
* * * *
* *
* *
* * * *
* * * *
*
* * * * * * *
* * *
* * *
* * * * *
* * *
* * *
* * * * *
* * *
* * * * *
* * * * *
* * *
* * *
* * * * *
* * * *
* * * * *
* * *
* *
* *
* * * * * * * *
* *
* *
* * * * * *
* *
* *
* * * * * *
* *
* *
* * * *
* * * * * *
* *
* * * * * *
*
* * *
* * * * * * *
* * *
* * *
* * * * * * *
* * * *
* * * * * *
* * * *
* * * *
* * * *
* * * * * * * * *
* * * *
* * * *
* * * * * * *
* * * *
* *
* * * * * * * * * *
* * * * *
* * * * *
* * * * *
* * * * * *
* * * * * *
* * * * * * * * *
* * *
* * * * * * *
* * * * * * *
* * * * * * * *
* * * *
* * * * * * * * *
* * * * *
* * * * * *
```

For this kind of thing it’s also worth getting familiar with gnuplot.

The clearest pattern in the graph above is the straight lines coming from the origin. This tells me, straight away, that we have an opportunity for optimisation if we wanted to memoize. We can’t yet be sure, but it *looks* like every point belongs to one of these straight lines. That means that once we find a non-looping pair like `3, 5`

, we can extend that line out to `6, 10`

and `9, 15`

, etc.

We can also see that the graph has a symmetry through the line `x = y`

. This means that if `3, 5`

doesn’t loop, neither does `5, 3`

.

Both of these techniques allow us to reduce the arguments to a canonical form, making the memoization table smaller, also. In code:

```
from fractions import Fraction
def canonical(n,m):
= Fraction(n,m) if n <= m else Fraction(m,n)
f return (f.numerator, f.denominator)
= {}
memo_dict
def solution(n,m):
= canonical(n, m)
c try:
return memo_dict[c]
except KeyError:
= loops(sequence(*c))
r = r
memo_dict[c] return r
```

Now that we have our faster version, we want to be able to quickly check that it’s equivalent to the slow. While Python is usually great for programming puzzles, this step in particular is crying out for something like QuickCheck: without it, we’ll have to roll our own.

```
from random import randrange
for _ in range(1000):
= randrange(1,10000), randrange(1,10000)
x, y if solution_new(x,y) != solution_old(x,y):
print(x,y)
```

We’re not looking for certainty here, just something that will quickly spot an error if one exists.

Now that we’ve made some of the more obvious optimisations, it’s time to move on to finding another pattern in the output. To do this, we’ll use oeis.org. We want to find if the pairs which pass our test follow some sequence which has a simple generating function which we can adapt into a test.

Since the things we’re testing are pairs, rather than individual numbers, we’ll have to fix one of them and see if there’s a pattern in the other.

`print([x for x in range(1,101) if not solution(1, x)])`

This prints the following sequence:

`[1, 3, 7, 15, 31, 63]`

And when we search for it on oeis, we get this as the top result:

```
0, 1, 3, 7, 15, 31, 63, 127, 255, 511, 1023, 2047, 4095, 8191, 16383, 32767,
65535, 131071, 262143, 524287, 1048575, 2097151, 4194303, 8388607, 16777215,
33554431, 67108863...
```

And looking at the comments under the sequence, we see the following:

Numbers n for which the expression 2^n/(n+1) is an integer. - Paolo P. Lava, May 12 2006

A test for members of the sequence, all packaged up for us!

But how do we generalise to pairs other than 1? Well, as a total guess, we can see that 1 appears in one place in the formula: why not replace that with the other member of the pair?

After that, we get the following function to test:

```
def solution(n,m):
= canonical(n,m)
nc, mc return bool((2 ** mc) % (nc + mc))
```

And it works!

This last step is pretty straightforward: see if there’s an algorithm already out there that solves your problem. In our case, taking the modulus is still pretty slow, but it turns out that modular exponentiation (i.e. computing expressions of the form `x^y mod z`

) can be done faster than the naive way. In fact, python provides this algorithm as a function in the standard library, making our last version of the function the following:

```
def solution(n,m):
= canonical(n,m)
nc, mc return bool(pow(2, mc, nc + mc))
```

I’m not sure if this function is fully correct, but it was accepted as a solution to the puzzle.

Anyway, in conclusion: you can get quite far through a programming puzzle by applying some educated guesses and googling!

]]>
Part 7 of a 10-part series on Breadth-First Traversals

Tags: Haskell

Currently, we have several different ways to enumerate a tree in breadth-first order. The typical solution (which is the usual recommended approach in imperative programming as well) uses a *queue*, as described by Okasaki (2000). If we take the simplest possible queue (a list), we get a quadratic-time algorithm, with an albeit simple implementation. The next simplest version is to use a banker’s queue (which is just a pair of lists). From this version, if we inline and apply identities like the following:

`foldr f b . reverse = foldl (flip f) b`

We’ll get to the following definition:

```
bfe :: Forest a -> [a]
= foldr f b ts []
bfe ts where
Node x xs) fw bw = x : fw (xs : bw)
f (
= []
b [] = foldl (foldr f) b qs [] b qs
```

We can get from this function to others (like one which uses a corecursive queue, and so on) through a similar derivation. I might some day write a post on each derivation, starting from the simple version and demonstrating how to get to the more efficient at each step.

For today, though, I’m interested in the *traversal* of a rose tree. Traversal, here, of course, is in the applicative sense.

Thus far, I’ve managed to write linear-time traversals, but they’ve been unsatisfying. They work by enumerating the tree, traversing the effectful function over the list, and then rebuilding the tree. Since each of those steps only takes linear time, the whole thing is indeed a linear-time traversal, but I hadn’t been able to fuse away the intermediate step.

The template for the algorithm I want comes from the `Phases`

applicative (Easterly 2019):

```
data Phases f a where
Lift :: f a -> Phases f a
(:<*>) :: f (a -> b) -> Phases f a -> Phases f b
```

We can use it to write a breadth-first traversal like so:

```
bft :: Applicative f => (a -> f b) -> Tree a -> f (Tree b)
= runPhases . go
bft f where
Node x xs) = liftA2 Node (Lift (f x)) (later (traverse go xs)) go (
```

The key component that makes this work is that it combines applicative effects in parallel:

```
instance Functor f => Functor (Phases f) where
fmap f (Lift x) = Lift (fmap f x)
fmap f (fs :<*> xs) = fmap (f.) fs :<*> xs
instance Applicative f => Applicative (Phases f) where
pure = Lift . pure
Lift fs <*> Lift xs = Lift (fs <*> xs)
:<*> gs) <*> Lift xs = liftA2 flip fs xs :<*> gs
(fs Lift fs <*> (xs :<*> ys) = liftA2 (.) fs xs :<*> ys
:<*> gs) <*> (xs :<*> ys) = liftA2 c fs xs :<*> liftA2 (,) gs ys
(fs where
~(x,y) = f x (g y) c f g
```

We’re also using the following helper functions:

```
runPhases :: Applicative f => Phases f a -> f a
Lift x) = x
runPhases (:<*> xs) = fs <*> runPhases xs
runPhases (fs
later :: Applicative f => Phases f a -> Phases f a
= (:<*>) (pure id) later
```

The problem is that it’s quadratic: the `traverse`

in:

`Node x xs) = liftA2 Node (Lift (f x)) (later (traverse go xs)) go (`

Hides some expensive calls to `<*>`

.

The problem with the `Phases`

traversal is actually analogous to another function for enumeration: `levels`

from Gibbons (2015).

```
levels :: Tree a -> [[a]]
Node x xs) = [x] : foldr lzw [] (map levels xs)
levels (where
= ys
lzw [] ys = xs
lzw xs [] :xs) (y:ys) = (x ++ y) : lzw xs ys lzw (x
```

`lzw`

takes the place of `<*>`

here, but the overall issue is the same: we’re zipping at every point, making the whole thing quadratic.

However, from the above function we *can* derive a linear time enumeration. It looks like this:

```
levels :: Tree a -> [[a]]
= f ts []
levels ts where
Node x xs) (q:qs) = (x:q) : foldr f qs xs
f (Node x xs) [] = [x] : foldr f [] xs f (
```

Our objective is clear, then: try to derive the linear-time implementation of `bft`

from the quadratic, in a way analogous to the above two functions. This is actually relatively straightforward once the target is clear: the rest of this post is devoted to the derivation.

First, we start off with the original `bft`

.

```
bft :: Applicative f => (a -> f b) -> Tree a -> f (Tree b)
= runPhases . go
bft f where
Node x xs) = liftA2 Node (Lift (f x)) (later (traverse go xs)) go (
```

`traverse`

. ```
bft :: Applicative f => (a -> f b) -> Tree a -> f (Tree b)
= runPhases . go
bft f where
Node x xs) = liftA2 Node (Lift (f x)) (later (go' xs))
go (= foldr (liftA2 (:) . go) (pure []) go'
```

`go''`

. ```
bft :: Applicative f => (a -> f b) -> Tree a -> f (Tree b)
= runPhases . go
bft f where
Node x xs) = liftA2 Node (Lift (f x)) (later (go' xs))
go (= foldr go'' (pure [])
go' Node x xs) ys = liftA2 (:) (liftA2 Node (Lift (f x)) (later (go' xs))) ys go'' (
```

`go'`

(and rename `go''`

to `go'`

)
```
bft :: Applicative f => (a -> f b) -> Tree a -> f (Tree b)
= runPhases . go
bft f where
Node x xs) = liftA2 Node (Lift (f x)) (later (foldr go' (pure []) xs))
go (Node x xs) ys = liftA2 (:) (liftA2 Node (Lift (f x)) (later (foldr go' (pure []) xs))) ys go' (
```

`liftA2`

```
bft :: Applicative f => (a -> f b) -> Tree a -> f (Tree b)
= runPhases . go
bft f where
Node x xs) = liftA2 Node (Lift (f x)) (later (foldr go' (pure []) xs))
go (Node x xs) ys = liftA2 (:) (fmap Node (f x) :<*> (foldr go' (pure []) xs)) ys go' (
```

`liftA2`

(pattern-matching on `ys`

)
```
bft :: Applicative f => (a -> f b) -> Tree a -> f (Tree b)
= runPhases . go
bft f where
Node x xs) = liftA2 Node (Lift (f x)) (later (foldr go' (pure []) xs))
go (Node x xs) (Lift ys) = fmap (((:).) . Node) (f x) :<*> (foldr go' (pure []) xs) <*> Lift ys
go' (Node x xs) (ys :<*> zs) = fmap (((:).) . Node) (f x) :<*> (foldr go' (pure []) xs) <*> ys :<*> zs go' (
```

`<*>`

. ```
bft :: Applicative f => (a -> f b) -> Tree a -> f (Tree b)
= runPhases . go
bft f where
Node x xs) = liftA2 Node (Lift (f x)) (later (foldr go' (pure []) xs))
go (Node x xs) (Lift ys) = liftA2 flip (fmap (((:).) . Node) (f x)) ys :<*> foldr go' (pure []) xs
go' (Node x xs) (ys :<*> zs) = liftA2 c (fmap (((:).) . Node) (f x)) ys :<*> liftA2 (,) (foldr go' (pure []) xs) zs
go' (where
~(x,y) = f x (g y) c f g
```

`liftA2`

with `fmap`

```
bft :: Applicative f => (a -> f b) -> Tree a -> f (Tree b)
= runPhases . go
bft f where
Node x xs) = liftA2 Node (Lift (f x)) (later (foldr go' (pure []) xs))
go (Node x xs) (Lift ys) = liftA2 (flip . (((:).) . Node)) (f x) ys :<*> foldr go' (pure []) xs
go' (Node x xs) (ys :<*> zs) = liftA2 (c . (((:).) . Node)) (f x) ys :<*> liftA2 (,) (foldr go' (pure []) xs) zs
go' (where
~(x,y) = f x (g y) c f g
```

```
bft :: Applicative f => (a -> f b) -> Tree a -> f (Tree b)
= go
bft f where
Node x xs) = liftA2 Node (f x) (runPhases (foldr go' (pure []) xs))
go (
Node x xs) (Lift ys) = liftA2 (\y zs ys -> Node y ys : zs) (f x) ys :<*> foldr go' (pure []) xs
go' (Node x xs) (ys :<*> zs) = liftA2 c (f x) ys :<*> liftA2 (,) (foldr go' (pure []) xs) zs
go' (where
~(ys,z) = Node y ys : g z c y g
```

At this point, we actually hit a wall: the expression

`foldr go' (pure []) xs) zs liftA2 (,) (`

Is what makes the whole thing quadratic. We need to find a way to thread that `liftA2`

along with the fold to get it to linear. This is the only real trick in the derivation: I’ll use polymorphic recursion to avoid the extra zip.

```
bft :: forall f a b. Applicative f => (a -> f b) -> Tree a -> f (Tree b)
= go
bft f where
Node x xs) = liftA2 (\y (ys,_) -> Node y ys) (f x) (runPhases (foldr go' (pure ([],())) xs))
go (
go' :: forall c. Tree a -> Phases f ([Tree b], c) -> Phases f ([Tree b], c)
Node x xs) ys@(Lift _) = fmap (\y -> first (pure . Node y)) (f x) :<*> foldr go' ys xs
go' (Node x xs) (ys :<*> zs) = liftA2 c (f x) ys :<*> foldr go' (fmap ((,) []) zs) xs
go' (where
~(ys,z) = first (Node y ys:) (g z) c y g
```

And that’s it!

We can finally write a slightly different version that avoids some unnecessary `fmap`

s by basing `Phases`

on `liftA2`

rather than `<*>`

.

```
data Levels f a where
Now :: a -> Levels f a
Later :: (a -> b -> c) -> f a -> Levels f b -> Levels f c
instance Functor f => Functor (Levels f) where
fmap f (Now x) = Now (f x)
fmap f (Later c xs ys) = Later ((f.) . c) xs ys
runLevels :: Applicative f => Levels f a -> f a
Now x) = pure x
runLevels (Later f xs ys) = liftA2 f xs (runLevels ys)
runLevels (
bft :: forall f a b. Applicative f => (a -> f b) -> Tree a -> f (Tree b)
= go
bft f where
Node x xs) = liftA2 (\y (ys,_) -> Node y ys) (f x) (runLevels (foldr go' (Now ([],())) xs))
go (
go' :: forall c. Tree a -> Levels f ([Tree b], c) -> Levels f ([Tree b], c)
Node x xs) ys@(Now _) = Later (\y -> first (pure . Node y)) (f x) (foldr go' ys xs)
go' (Node x xs) (Later k ys zs) = Later id (liftA2 c (f x) ys) (foldr go' (fmap ((,) []) zs) xs)
go' (where
~(ys,z) = first (Node y ys:) (k g z) c y g
```

*Patterns in Functional Programming*. https://patternsinfp.wordpress.com/2015/03/05/breadth-first-traversal/.

Okasaki, Chris. 2000. “Breadth-First Numbering: Lessons from a Small Exercise in Algorithm Design.” In *Proceedings of the Fifth ACM SIGPLAN International Conference on Functional Programming*, 131–136. ICFP ’00. New York, NY, USA: ACM. doi:10.1145/351240.351253. https://www.cs.tufts.edu/~nr/cs257/archive/chris-okasaki/breadth-first.pdf.

Part 6 of a 10-part series on Breadth-First Traversals

Tags: Haskell

I was looking again at one of my implementations of breadth-first traversals:

```
bfe :: Tree a -> [a]
= f r b []
bfe r where
Node x xs) fw bw = x : fw (xs : bw)
f (
= []
b [] = foldl (foldr f) b qs [] b qs
```

And I was wondering if I could *fuse* away the intermediate list. On the following line:

`Node x xs) fw bw = x : fw (xs : bw) f (`

The `xs : bw`

is a little annoying, because we *know* it’s going to be consumed eventually by a fold. When that happens, it’s often a good idea to remove the list, and just inline the fold. In other words, if you see the following:

`foldr f b (x : y : [])`

You should replace it with this:

` f x (f y b)`

If you try and do that with the above definition, you get something like the following:

```
bfenum :: Tree a -> [a]
= f t b b
bfenum t where
Node x xs) fw bw = x : fw (bw . flip (foldr f) xs)
f (= x b b x
```

The trouble is that the above comes with type errors:

`Cannot construct the infinite type: b ~ (b -> c) -> [a]`

This error shows up occasionally when you try and do heavy church-encoding in Haskell. You get a similar error when trying to encode the Y combinator:

`= \f -> (\x -> f (x x)) (\x -> f (x x)) y `

`• Occurs check: cannot construct the infinite type: t0 ~ t0 -> t`

The solution for the y combinator is to use a newtype, where we can catch the recursion at a certain point to help the typechecker.

```
newtype Mu a = Mu (Mu a -> a)
= (\h -> h $ Mu h) (\x -> f . (\(Mu g) -> g) x $ x) y f
```

The trick for our queue is similar:

```
newtype Q a = Q { q :: (Q a -> [a]) -> [a] }
bfenum :: Tree a -> [a]
= q (f t b) e
bfenum t where
Node x xs) fw = Q (\bw -> x : q fw (bw . flip (foldr f) xs))
f (= fix (Q . flip id)
b = fix (flip q) e
```

This is actually equivalent to the continuation monad:

```
newtype Fix f = Fix { unFix :: f (Fix f) }
type Q a = Fix (ContT a [])
= runContT . unFix
q
bfenum :: Tree a -> [a]
= q (f t b) e
bfenum t where
Node x xs) fw = Fix (mapContT (x:) (flip (foldr f) xs <$> unFix fw))
f (= fix (Fix . pure)
b = fix (flip q) e
```

There’s a problem though: this algorithm never checks for an end. That’s ok if there isn’t one, mind you. For instance, with the following “unfold” function:

```
infixr 9 #.
(#.) :: Coercible b c => (b -> c) -> (a -> b) -> a -> c
#.) _ = coerce
({-# INLINE (#.) #-}
bfUnfold :: (a -> (b,[a])) -> a -> [b]
= g t (fix (Q #. flip id)) (fix (flip q))
bfUnfold f t where
= x : q fw (bw . flip (foldr ((Q .) #. g)) xs)
g b fw bw where
= f b (x,xs)
```

We can write a decent enumeration of the rationals.

```
-- Stern-Brocot
rats1 :: [Rational]
= bfUnfold step ((0,1),(1,0))
rats1 where
= (n % d,[(lb , m),(m , rb)])
step (lb,rb) where
@(n,d) = adj lb rb
m= (w+y,x+z)
adj (w,x) (y,z)
-- Calkin-Wilf
rats2 :: [Rational]
= bfUnfold step (1,1)
rats2 where
= (m % n,[(m,m+n),(n+m,n)]) step (m,n)
```

However, if we *do* want to stop at some point, we need a slight change to the queue type.

```
newtype Q a = Q { q :: Maybe (Q a -> [a]) -> [a] }
bfenum :: Tree a -> [a]
= q (f t b) e
bfenum t where
Node x xs) fw = Q (\bw -> x : q fw (Just (m bw . flip (foldr f) xs)))
f (= fix (Q . maybe [] . flip ($))
b = Nothing
e = fromMaybe (flip q e) m
```

We can actually add in a monad to the above unfold without much difficulty.

```
newtype Q m a = Q { q :: Maybe (Q m a -> m [a]) -> m [a] }
bfUnfold :: Monad m => (a -> m (b,[a])) -> a -> m [b]
= g t b e
bfUnfold f t where
= f s >>=
g s fw bw ~(x,xs) -> (x :) <$> q fw (Just (m bw . flip (foldr ((Q .) #. g)) xs))
\
= fix (Q #. maybe (pure []) . flip ($))
b = Nothing
e = fromMaybe (flip q e) m
```

And it passes the torture tests for a linear-time breadth-first unfold from Feuer (2015). It breaks when you try and use it to build a tree, though.

Finally, we can try and make the above code a little more modular, by actually packaging up the queue type as a queue.

```
newtype Q a = Q { q :: Maybe (Q a -> [a]) -> [a] }
newtype Queue a = Queue { runQueue :: Q a -> Q a }
now :: a -> Queue a
= Queue (\fw -> Q (\bw -> x : q fw bw))
now x
delay :: Queue a -> Queue a
= Queue (\fw -> Q (\bw -> q fw (Just (m bw . runQueue xs))))
delay xs where
= fromMaybe (flip q Nothing)
m
instance Monoid (Queue a) where
mempty = Queue id
mappend (Queue xs) (Queue ys) = Queue (xs . ys)
run :: Queue a -> [a]
Queue xs) = q (xs b) Nothing
run (where
= fix (Q . maybe [] . flip ($))
b
bfenum :: Tree a -> [a]
= run (f t)
bfenum t where
Node x xs) = now x <> delay (foldMap f xs) f (
```

At this point, our type is starting to look a lot like the `Phases`

type from Noah Easterly’s tree-traversals package. This is exciting: the `Phases`

type has the ideal interface for level-wise traversals. Unfortunately, it has the wrong time complexity for `<*>`

and so on: my suspicion is that the queue type above here is to `Phases`

as the continuation monad is to the free monad. In other words, we’ll get efficient construction at the expense of no inspection. Unfortunately, I can’t figure out how to turn the above type into an applicative. Maybe in a future post!

Finally, a lot of this is working towards finally understanding Smith (2009) and Allison (2006).

*Software: Practice and Experience* 19 (2) (October): 99–109. doi:10.1002/spe.4380190202. http://users.monash.edu/~lloyd/tildeFP/1989SPE/.

Feuer, David. 2015. “Is a Lazy, Breadth-First Monadic Rose Tree Unfold Possible?” Question. *Stack Overflow*. https://stackoverflow.com/q/27748526.

*The Monad.Reader* 14 (14) (July): 28. https://meldingmonads.files.wordpress.com/2009/06/corecqueues.pdf.

Tags: Concatenative, Haskell

This post demonstrates a simple encoding of a (typed) concatenative language in Haskell.

Point-free style is one of the distinctive markers of functional programming languages. Want to sum a list? That’s as easy as:

`sum = foldr (+) 0`

Now I want to sum every number after adding one to it.

`= foldr (+) 0 . map ((+) 1) sumSuccs `

One more step to make this function truly abstract™ and general™: we’ll allow the user to supply their own number to add

`= foldr (+) 0 . map . (+) sumAdded `

And here the trouble begins. The above expression won’t actually type check. In fact, it’ll give a pretty terrible error message:

```
• Non type-variable argument in the constraint: Num [a]
(Use FlexibleContexts to permit this)
• When checking the inferred type
sumThoseThat :: forall a.
(Num [a], Foldable ((->) [a])) =>
a -> [a]
```

I remember as a beginner being confused by similar messages. What’s `FlexibleContexts`

? I had thought that the “point-free style” just meant removing the last variable from an expression if it’s also the last argument:

```
sum xs = foldr (+) 0 xs
sum = foldr (+) 0
```

Why doesn’t it work here?

Well, it doesn’t work because the types don’t line up, but I’m going to try and explain a slightly different perspective on the problem, which is *associativity*.

To make it a little clearer, let’s see what happens when we point-fill the expression:

```
= (foldr(+) 0 . (map . (+))) n xs
sumAdded n xs => foldr(+) 0 ((map . (+)) n) xs
=> foldr(+) 0 (map ((+) n)) xs
```

Indeed, the problem is the placement of the parentheses. What we want at the end is:

`=> foldr(+) 0 (map ((+) n) xs) `

But, no matter. We have to jiggle the arguments around, or we could use something terrible like this:

```
infixr 9 .:
.:) = (.).(.)
(
= foldr (+) 0 .: map . (+) sumAdded
```

Is there something, though, that could do this automatically?

We run into a similar problem in Agda. We’re forever having to prove statements like:

```
(x + y) + z ≡ x + (y + z)
0 x ≡ x +
```

There are a couple of ways to get around the issue, and for monoids there’s a rich theory of techniques. I’ll just show one for now, which relies on the *endomorphism* monoid. This monoid is created by partially applying the monoid’s binary operator:

```
: Set
Endo = ℕ → ℕ
Endo
_⇑⟧ : ℕ → Endo
⟦= n + m ⟦ n ⇑⟧ m
```

And you can get back to the underlying monoid by applying it to the neutral element:

```
_⇓⟧ : Endo → ℕ
⟦= n 0 ⟦ n ⇓⟧
```

Here’s the important parts: first, we can lift the underlying operation into the endomorphism:

```
_⊕_ : Endo → Endo → Endo
= λ x → xs (ys x)
xs ⊕ ys
: ∀ n m → ⟦ ⟦ n ⇑⟧ ⊕ ⟦ m ⇑⟧ ⇓⟧ ≡ n + m
⊕-homo = cong (n +_) (+-identityʳ m) ⊕-homo n m
```

And second, it’s *definitionally* associative.

```
: ∀ x y z → (x ⊕ y) ⊕ z ≡ x ⊕ (y ⊕ z)
⊕-assoc _ _ _ = refl ⊕-assoc
```

These are all clues as to how to solve the composition problem in the Haskell code above. We need definitional associativity, somehow. Maybe we can get it from the endomorphism monoid?

You’re probably familiar with Haskell’s state monad:

`newtype State s a = State { runState :: s -> (a, s) }`

It can help a lot when you’re threading around fiddly accumulators and so on.

```
nub :: Ord a => [a] -> [a]
= go Set.empty
nub where
= []
go seen [] :xs)
go seen (x| x `Set.member` seen = go seen xs
| otherwise = x : go (Set.insert x seen) xs
```

```
nub :: Ord a => [a] -> [a]
= flip evalState Set.empty . go
nub where
= pure []
go [] :xs) = do
go (x<- gets (Set.member x)
seen if seen
then go xs
else do
modify (Set.insert x):) <$> go xs (x
```

Of course, these days state is a transformer:

`newtype StateT s m a = StateT { runStateT :: s -> m (a, s) }`

This lets us stack multiple effects on top of each other: error handling, IO, randomness, even another state monad. In fact, if you *do* stack another state monad on top, you might be surprised by the efficiency of the code it generates:

```
type DoubleState s1 s2 a = StateT s1 (State s2) a
=> s1 -> State s2 (a, s1)
=> s1 -> s2 -> ((a, s1), s2)
```

It’s nothing earth shattering, but it inlines and optimises well. That output is effectively a left-nested list, also.

If we can do one, and we can do two, why not more? Can we generalise the state pattern to an arbitrary number of variables? First we’ll need a generic tuple:

```
infixr 5 :-
data Stack (xs :: [Type]) :: Type where
Nil :: Stack '[]
(:-) :: x -> Stack xs -> Stack (x : xs)
```

Then, the state type.

`newtype State xs a = State { runState :: Stack xs -> (a, Stack xs) }`

We can actually clean the definition up a little: instead of a tuple at the other end, why not push it onto the stack.

`newtype State xs a = State { runState :: Stack xs -> Stack (a : xs) }`

In fact, let’s make this as polymorphic as possible. We should be able to change the state is we so desire.

```
infixr 0 :->
type (:->) xs ys = Stack xs -> Stack ys
```

And suddenly, our endomorphism type from above shows up again.

We can, of course, get back our original types.

`newtype State xs a = State { runState :: xs :-> a : xs }`

And it comes with all of the instances you might expect:

```
instance Functor (State xs) where
fmap f xs = State (\s -> case runState xs s of
:- ys) -> f x :- ys)
(x
instance Applicative (State xs) where
pure x = State (x :-)
<*> xs = State (\s -> case runState fs s of
fs :- s') -> case runState xs s' of
(f :- s'') -> f x :- s'')
(x
instance Monad (State xs) where
>>= f = State (\s -> case runState xs s of
xs :- ys -> runState (f y) ys) y
```

But what’s the point? So far we’ve basically just encoded an unnecessarily complicated state transformer. Think back to the stacking of states. Written in the mtl style, the main advantage of stacking monads like that is you can write code like the following:

```
pop :: (MonadState [a] m, MonadError String m) => m a
= get >>= \case
pop -> throwError "pop: empty list"
[] :xs -> do
x
put xs pure x
```

In other words, we don’t care about the rest of `m`

, we just care that it has, somewhere, state for an `[a]`

.

This logic should apply to our stack transformer, as well. If it only cares about the top two variables, it shouldn’t care what the rest of the list is. In types:

```
infixr 0 :->
type (:->) xs ys = forall zs. Stack (xs ++ zs) -> Stack (ys ++ zs)
```

And straight away we can write some of the standard combinators:

```
dup :: '[a] :-> '[a,a]
:- xs) = (x :- x :- xs)
dup (x
swap :: '[x,y] :-> '[y,x]
:- y :- xs) = y :- x :- xs
swap (x
drop :: '[x,y] :-> '[y]
drop (_ :- xs) = xs
infixl 9 !
! g) x = g (f x) (f
```

You’ll immediately run into trouble if you try to work with some of the more involved combinators, though. Quote should have the following type, for instance:

`quote :: (xs :-> ys) -> '[] :-> '[ xs :-> ys ]`

But GHC complains again:

```
• Illegal polymorphic type: xs :-> ys
GHC doesn't yet support impredicative polymorphism
• In the type signature:
quote :: (xs :-> ys) -> '[] :-> '[xs :-> ys]
```

I won’t go into the detail of this particular error: if you’ve been around the block with Haskell you know that it means “wrap it in a newtype”. If we do *that*, though, we get yet more errors:

`newtype (:~>) xs ys = Q { d :: xs :-> ys }`

```
• Couldn't match type ‘ys ++ zs0’ with ‘ys ++ zs’
Expected type: Stack (xs ++ zs) -> Stack (ys ++ zs)
Actual type: Stack (xs ++ zs0) -> Stack (ys ++ zs0)
NB: ‘++’ is a type function, and may not be injective
```

This injectivity error comes up often. It means that GHC needs to prove that the input to two functions is equal, but it only knows that their outputs are. This is a doubly serious problem for us, as we can’t do type family injectivity on two type variables (in current Haskell). To solve the problem, we need to rely on a weird mishmash of type families and functional dependencies:

```
type family (++) xs ys where
++ ys = ys
'[] : xs) ++ ys = x : (xs ++ ys)
(x
class (xs ++ ys ~ zs) => Conc xs ys zs | xs zs -> ys where
conc :: Stack xs -> Stack ys -> Stack zs
instance Conc '[] ys ys where
= ys
conc _ ys
instance Conc xs ys zs => Conc (x : xs) ys (x : zs) where
:- xs) ys = x :- conc xs ys
conc (x
infixr 0 :->
type (:->) xs ys = forall zs yszs. Conc ys zs yszs => Stack (xs ++ zs) -> Stack yszs
```

And it does indeed work:

```
pure :: a -> '[] :-> '[a]
pure = (:-)
newtype (:~>) xs ys = Q { d :: xs :-> ys }
quote :: (xs :-> ys) -> '[] :-> '[ xs :~> ys ]
= pure (Q x)
quote x
dot :: forall xs ys. ((xs :~> ys) : xs) :-> ys
:- xs) = d x xs
dot (x
true :: (xs :~> ys) : (xs :~> ys) : xs :-> ys
= swap ! drop ! dot
true
false :: (xs :~> ys) : (xs :~> ys) : xs :-> ys
= drop ! dot
false
test :: '[] :-> '[ '[a] :~> '[a,a] ]
= quote dup test
```

Interestingly, these combinators represent the monadic operations on state (`dot`

= `join`

, `pure`

= `pure`

, etc.)

And can we get the nicer composition of the function from the intro? Kind of:

`= quote add ! curry ! dot ! map ! sum sumAdded `

Here are some references for concatenative languages: Okasaki (2002), Purdy (2012), Kerby (2007), Okasaki (2003).

Kerby, Brent. 2007. “The Theory of Concatenative Combinators.” http://tunes.org/\%7Eiepos/joy.html.

Okasaki, Chris. 2002. “Techniques for Embedding Postfix Languages in Haskell.” In *Proceedings of the ACM SIGPLAN Workshop on Haskell - Haskell ’02*, 105–113. Pittsburgh, Pennsylvania: ACM Press. doi:10.1145/581690.581699. http://portal.acm.org/citation.cfm?doid=581690.581699.

———. 2003. “THEORETICAL PEARLS: Flattening Combinators: Surviving without Parentheses.” *Journal of Functional Programming* 13 (4) (July): 815–822. doi:10.1017/S0956796802004483. https://www.cambridge.org/core/journals/journal-of-functional-programming/article/theoretical-pearls/3E99993FE5464986AD94D292FF5EA275.

Purdy, Jon. 2012. “The Big Mud Puddle: Why Concatenative Programming Matters.” *The Big Mud Puddle*. https://evincarofautumn.blogspot.com/2012/02/why-concatenative-programming-matters.html.

Tags: Haskell

This post is a collection of some of the tricks I’ve learned for manipulating lists in Haskell. Each one starts with a puzzle: you should try the puzzle yourself before seeing the solution!

How can you split a list in half, in one pass, without taking its length?

This first one is a relatively well-known trick, but it occasionally comes in handy, so I thought I’d mention it. The naive way is as follows:

`= splitAt (length xs `div` 2) xs splitHalf xs `

But it’s unsatisfying: we have to traverse the list twice, and we’re taking its length (which is almost always a bad idea). Instead, we use the following function:

```
splitHalf :: [a] -> ([a],[a])
= go xs xs
splitHalf xs where
:ys) (_:_:zs) = first (y:) (go ys zs)
go (y= ([],ys) go ys _
```

The “tortoise and the hare” is the two arguments to `go`

: it traverses the second one twice as fast, so when it hits the end, we know that the first list must be halfway done.

Given two lists,

`xs`

and`ys`

, write a function which zips`xs`

with thereverseof`ys`

(in one pass).

There’s a lovely paper (Danvy and Goldberg 2005) which goes though a number of tricks for how to do certain list manipulations “in reverse”. Their technique is known as “there and back again”. However, I’d like to describe a different way to get to the same technique, using folds.

Whenever I need to do some list manipulation in reverse (i.e., I need the input list to be reversed), I first see if I can rewrite the function as a fold, and then just switch out `foldr`

for `foldl`

.

For our puzzle here, we need to first write `zip`

as a fold:

```
zip :: [a] -> [b] -> [(a,b)]
zip = foldr f b
where
:ys) = (x,y) : k ys
f x k (y= []
f x k [] = [] b _
```

If that looks complex, or difficult to write, don’t worry! There’s a systematic way to get to the above definition from the normal version of `zip`

. First, let’s start with a normal `zip`

:

```
zip :: [a] -> [b] -> [(a,b)]
zip [] ys = []
zip xs [] = []
zip (x:xs) (y:ys) = (x,y) : zip xs ys
```

Then, we need to turn it into a case-tree, where the first branch is on the list we want to fold over. In other words, we want the function to look like this:

```
zip xs = case xs of
???
```

To figure out the cases, we factor out the cases in the original function. Since the second clause (`zip xs [] = []`

) is only reachable when `xs /= []`

, it’s effectively a case for the `x:xs`

branch.

```
zip :: [a] -> [b] -> [(a,b)]
zip xs = case xs of
-> \_ -> []
[] :xs -> \case
x-> []
[] :ys -> (x,y) : zip xs ys y
```

Now, we rewrite the different cases to be auxiliary functions:

```
zip :: [a] -> [b] -> [(a,b)]
zip xs = case xs of
-> b
[] :xs -> f x xs
xwhere
= \_ -> []
b = \x xs -> \case
f -> []
[] :ys -> (x,y) : zip xs ys y
```

And finally, we *refactor* the recursive call to the first case expression.

```
zip :: [a] -> [b] -> [(a,b)]
zip xs = case xs of
-> b
[] :xs -> f x (zip xs)
xwhere
= \_ -> []
b = \x xs -> \case
f -> []
[] :ys -> (x,y) : xs ys y
```

Then those two auxiliary functions are what you pass to `foldr`

!

So, to reverse it, we simply take wherever we wrote `foldr f b`

, and replace it with `foldl (flip f) b`

:

```
zipRev :: [a] -> [b] -> [(a,b)]
= foldl (flip f) b
zipRev where
:ys) = (x,y) : k ys
f x k (y= []
f x k [] = [] b _
```

Of course, we’re reversing the wrong list here. Fixing that is simple:

```
zipRev :: [a] -> [b] -> [(a,b)]
= flip (foldl (flip f) b)
zipRev where
:xs) = (x,y) : k xs
f y k (x= []
f y k [] = [] b _
```

Rewrite the above function without using continuations.

`zipRev`

, as written above, actually uses *continuation-passing style*. In most languages (including standard ML, which was the one used in Danvy and Goldberg (2005)), this is pretty much equivalent to a direct-style implementation (modulo some performance weirdness). In a lazy language like Haskell, though, continuation-passing style often makes things unnecessarily strict.

Consider the church-encoded pairs:

```
newtype Pair a b
= Pair
runPair :: forall c. (a -> b -> c) -> c
{
}
firstC :: (a -> a') -> Pair a b -> Pair a' b
= Pair (\k -> runPair p (k . f))
firstC f p
firstD :: (a -> a') -> (a, b) -> (a', b)
~(x,y) = (f x, y)
firstD f
fstD :: (a, b) -> a
~(x,y) = x
fstD
fstC :: Pair a b -> a
= runPair p const
fstC p
>>> fstC (firstC (const ()) undefined)
undefined
>>> fstD (firstD (const ()) undefined)
()
```

So it’s sometimes worth trying to avoid continuations if there is a fast direct-style solution. (alternatively, continuations can give you extra strictness when you *do* want it)

First, I’m going to write a different version of `zipRev`

, which folds on the first list, not the second.

```
= foldl f (\_ r -> r) xs ys []
zipRev xs ys where
:ys) r = k ys ((x,y):r) f k x (y
```

Then, we inline the definition of `foldl`

:

```
= foldr f id xs (\_ r -> r) ys []
zipRev xs ys where
= k (\(y:ys) r -> c ys ((x,y):r)) f x k c
```

Then, as a hint, we tuple up the two accumulating parameters:

```
= foldr f id xs snd (ys,[])
zipRev xs ys where
= k (\((y:ys),r) -> c (ys,(x,y):r)) f x k c
```

What we can see here is that we have two continuations stacked on top of each other. When this happens, they can often “cancel out”, like so:

```
= snd (foldr f (ys,[]) xs)
zipRev xs ys where
:ys,r) = (ys,(x,y):r) f x (y
```

And we have our direct-style implementation!

Note 14/05/2019: the “cancel-out” explanation there is a little handwavy, as I’m sure you’ll notice. However, there are a number of excellent explanations on this stackoverflow thread which explain it much better than I ever could. Thanks to Anders Kaseorg, Will Ness, user11228628, and to Joseph Sible (2019) for asking the question.

Detect that a list is a palindrome, in one pass.

We now know a good way to split a list in two, and a good way to zip a list with its reverse. We can *combine* the two to get a program that checks if a list is a palindrome. Here’s a first attempt:

`= all (uncurry (==)) (uncurry zipRev (splitHalf xs)) isPal xs `

But this is doing *three* passes!

To get around it, we can manually do some fusion. Fusion is a technique where we can spot scenarios like the following:

`foldr f b (x : y : [])`

And translate them into a version without a list:

``f` (y `f` b) x `

The trick is making sure that the consumer is written as a fold, and then we just put its `f`

and `b`

in place of the `:`

and `[]`

in the producer.

So, when we inline the definition of `splitHalf`

into `zipRev`

, we get the following:

```
zipRevHalf :: [a] -> [(a,a)]
= snd (go xs xs)
zipRevHalf xs where
:ys) (_:_:zs) = f y (go ys zs)
go (y:ys) [_] = (ys,[])
go (_= (ys,[])
go ys []
:ys,r) = (ys,(x,y):r)
f x (y
= all (uncurry (==)) (zipRevHalf xs) isPal xs
```

(adding a special case for odd-length lists)

Finally, the `all (uncurry (==))`

is implemented as a fold also. So we can fuse it with the rest of the definitions:

```
isPal :: Eq a => [a] -> Bool
= snd (go xs xs)
isPal xs where
:ys) (_:_:zs) = f y (go ys zs)
go (y:ys) [_] = (ys,True)
go (_= (ys,True)
go ys []
:ys,r) = (ys,(x == y) && r) f x (y
```

You may have spotted the writer monad over `All`

there. Indeed, we can rewrite it to use the monadic bind:

```
isPal :: Eq a => [a] -> Bool
= getAll (fst (go xs xs)) where
isPal xs :ys) (_:_:zs) = f y =<< go ys zs
go (y:ys) [_] = pure ys
go (_= pure ys
go ys []
:zs) = (All (y == z), zs) f y (z
```

Construct a Braun tree from a list in linear time.

This is also a very well-known trick (Bird 1984), but today I’m going to use it to write a function for constructing Braun trees.

A Braun tree is a peculiar structure. It’s a binary tree, where adjacent branches can differ in size by only 1. When used as an array, it has $\mathcal{O}(\log n)$ lookup times. It’s enumerated like so:

```
┌─7
┌3┤
│ └11
┌1┤
│ │ ┌─9
│ └5┤
│ └13
0┤
│ ┌─8
│ ┌4┤
│ │ └12
└2┤
│ ┌10
└6┤
└14
```

The objective is to construct a tree from a list in linear time, in the order defined above. Okasaki (1997) observed that, from the list:

`0..14] [`

Each level in the tree is constructed from chucks of powers of two. In other words:

`0],[1,2],[3,4,5,6],[7,8,9,10,11,12,13,14]] [[`

From this, we can write the following function:

```
= []
rows k [] = (k , take k xs) : rows (2*k) (drop k xs)
rows k xs
= zipWith3 Node xs ts1 ts2
build (k,xs) ts where
= splitAt k (ts ++ repeat Leaf)
(ts1,ts2)
= head . foldr build [Leaf] . rows 1 fromList
```

The first place we’ll look to eliminate a pass is the `build`

function. It combines two rows by splitting the second in half, and zipping it with the first.

```
>>> build (3, [x1,x2,x3]) [y1,y2,y3,y4,y5,y6]
[(x1,y1,y4),(x2,y2,y5),(x3,y3,y6)]
```

We don’t need to store the length of the first list, though, as we are only using it to split the second, and we can do *that* at the same time as the zipping.

```
zipUntil :: (a -> b -> c) -> [a] -> [b] -> ([c],[b])
= ([],ys)
zipUntil _ [] ys :xs) (y:ys) = first (f x y:) (zipUntil f xs ys)
zipUntil f (x
>>> zipUntil (,) [1,2] "abc"
1,'a'),(2,'b')],"c") ([(
```

Using this function in `build`

looks like the following:

```
= zipWith ($) ys ts2
build (k,xs) ts where
= zipUntil Node xs (ts ++ repeat Leaf) (ys,ts2)
```

That top-level `zipWith`

is *also* unnecessary, though. If we make the program circular, we can produce `ts2`

as we consume it, making the whole thing single-pass.

```
= ys
build xs ts where
= zip3Node xs (ts ++ repeat Leaf) ts2
(ys,ts2) :xs) (y:ys) ~(z:zs) = first (Node x y z:) (zip3Node xs ys zs)
zip3Node (x= ([], ys) zip3Node [] ys _
```

That `zip3Node`

is a good candidate for rewriting as a fold, also, making the whole thing look like this:

```
= []
rows k [] = take k xs : rows (2*k) (drop k xs)
rows k xs
= ys
build xs ts where
= foldr f b xs ts zs
(ys,zs) :ys) ~(z:zs) = first (Node x y z:) (xs ys zs)
f x xs (y= ([],ys)
b ys _
= head . foldr build (repeat Leaf) . rows 1 fromList
```

To fuse all of those definitions, we first will need to rewrite `rows`

as a fold:

```
= uncurry (:) (foldr f b xs 1 2)
rows xs where
= ([],[])
b _ _ 0 j = ([], uncurry (:) (f x k j (j*2)))
f x k = first (x:) (k (i-1) j) f x k i j
```

Once we have everything as a fold, the rest of the transformation is pretty mechanical. At the end of it all, we get the following linear-time function for constructing a Braun tree from a list:

```
fromList :: [a] -> Tree a
= head (l (foldr f b xs 1 2))
fromList xs where
= (repeat Leaf, (repeat Leaf, ys))
b _ _ ys zs
= let (xs, ys) = uncurry k ys in xs
l k
0 j ys zs = ([], (l (f x k j (j*2)), ys))
f x k ~(y:ys) ~(z:zs) = first (Node x y z:) (k (i-1) j ys zs) f x k i j
```

Bird, R. S. 1984. “Using Circular Programs to Eliminate Multiple Traversals of Data.” *Acta Inf.* 21 (3) (October): 239–250. doi:10.1007/BF00264249. http://dx.doi.org/10.1007/BF00264249.

*BRICS Report Series* 12 (3). doi:10.7146/brics.v12i3.21869. https://tidsskrift.dk/brics/article/view/21869.

*Journal of Functional Programming* 7 (6) (November): 661–666. doi:10.1017/S0956796897002876. https://www.eecs.northwestern.edu/~robby/courses/395-495-2013-fall/three-algorithms-on-braun-trees.pdf.

Sible, Joseph. 2019. “How Can Two Continuations Cancel Each Other Out?” *Stack Overflow*. https://stackoverflow.com/questions/56122022/how-can-two-continuations-cancel-each-other-out.

Tags: Agda

Just some silly examples of how to get a nice list syntax with mixfix operators in Agda.

{-# OPTIONS --without-K --safe #-} module ListSyntax where open import Data.List as List using (List; _∷_; []) open import Data.Product open import Level using (_⊔_; Level) open import Data.Nat using (ℕ; _+_; suc; zero) open import Function variable a b : Level A : Set a B : Set b

With instance search.

module Instance where record ListSyntax {a b} (A : Set a) (B : Set b) : Set (a ⊔ b) where field [_] : B → List A open ListSyntax ⦃ ... ⦄ public instance cons : ∀ {a b} {A : Set a} {B : Set b} ⦃ _ : ListSyntax A B ⦄ → ListSyntax A (A × B) [_] ⦃ cons ⦄ (x , xs) = x ∷ [ xs ] instance sing : ∀ {a} {A : Set a} → ListSyntax A A [_] ⦃ sing ⦄ = _∷ []

--_ : List ℕ ← not needed _ = [ 1 ]It can even work as a standalone function:

_ : List ℕ → _ _ = List.map [_]It uses a closed operator, so we don’t need parentheses to parse it:

_ = List.foldr _+_ 0 [ 1 , 2 , 3 ]And it doesn’t clash with product:

_ : ℕ × ℕ _ = 1 , 2It allows nesting:

_ : List (List ℕ) _ = [ [ 1 ] , [ 2 , 3 ] ]

_ : List ℕ _ = [ 1 , 2 , 3 ]

module DataType where infixr 5 _] data ListBuilder {a} (A : Set a) : Set a where _,_ : A → ListBuilder A → ListBuilder A _] : A → ListBuilder A infixr 4 [_ [_ : ListBuilder A → List A [ x , xs = x ∷ ([ xs) [ x ] = x ∷ []

--_ : List ℕ ← not needed _ = [ 1 , 2 , 3 ] --_ : List ℕ ← not needed _ = [ 1 ]

Doesn’t clash with product:

_ : ℕ × ℕ _ = 1 , 2Can choose different “list-like” type based on first bracket:

open import Data.Vec as Vec using (_∷_; []; Vec) len-1 : ListBuilder A → ℕ len-1 (x , xs) = suc (len-1 xs) len-1 (x ]) = 0 infixr 4 v[_ v[_ : (xs : ListBuilder A) → Vec A (suc (len-1 xs)) v[ (x , xs) = x ∷ (v[ xs) v[ x ] = x ∷ [] _ : Vec ℕ 3 _ = v[ 1 , 2 , 3 ]

Not a closed operator, so need parens:

_ = List.foldr _+_ 0 ([ 1 , 2 , 3 ])Singleton isn’t a function

--_ = [_]Doesn’t nest

--_ = [ [ 1 ] , [ 2 , 3 ] ]]]>

Tags: Agda, Probability

Cubical Agda has just come out, and I’ve been playing around with it for a bit. There’s a bunch of info out there on the theory of cubical types, and Homotopy Type Theory more generally (cubical type theory is kind of like an “implementation” of Homotopy type theory), but I wanted to make a post demonstrating cubical Agda in practice, and one of its cool uses from a programming perspective.

I don’t really know! Cubical type theory is quite complex (even for a type theory), and I’m not nearly qualified to properly explain it. In lieu of a proper first-principles explanation, then, I’ll try and give a few examples of how it differs from normal Agda, before moving on to the main example of this post.

{-# OPTIONS --cubical #-} open import ProbabilityModule.Semirings module ProbabilityModule.Monad {s} (rng : Semiring s) where open import Cubical.Core.Everything open import Cubical.Relation.Everything open import Cubical.Foundations.Prelude hiding (_≡⟨_⟩_) renaming (_∙_ to _;_) open import Cubical.HITs.SetTruncation open import ProbabilityModule.Utils

- Extensionality
- One of the big annoyances in standard Agda is that we can’t prove the following:
extensionality : ∀ {f g : A → B} → (∀ x → f x ≡ g x) → f ≡ g

It’s emblematic of a wider problem in Agda: we can’t say “two things are equal if they always behave the same”. Infinite types, for instance (like streams) are often only equal via bisimulation: we can’t translate this into normal equality in standard Agda. Cubical type theory, though, has a different notion of “equality”, which allow a wide variety of things (including bisimulations and extensional proofs) to be translated into a proper equalityextensionality = funExt

- Isomorphisms
- One of these such things we can promote to a “proper equality” is an isomorphism. In the cubical repo this is used to prove things about binary numbers: by proving that there’s an isomorphism between the Peano numbers and binary numbers, they can lift any properties on the Peano numbers to the binary numbers.

So those are two useful examples, but the *most* interesting use I’ve seen so far is the following:

module NormalList where data List {a} (A : Set a) : Set a where [] : List A _∷_ : A → List A → List A

They allow us to add new equations to a type, as well as constructors. To demonstrate what this means, as well as why you’d want it, I’m going to talk about free objects.

Very informally, a free object on some algebra is the *minimal* type which satisfies the laws of the algebra. Lists, for instance, are the free monoid. They satisfy all of the monoid laws ($\bullet$ is `++`

and $\epsilon$ is `[]`

):

$(x \bullet y) \bullet z = x \bullet (y \bullet z)$ $x \bullet \epsilon = x$ $\epsilon \bullet x = x$

But *nothing else*. That means they don’t satisfy any extra laws (like, for example, commutativity), and they don’t have any extra structure they don’t need.

How did we get to the definition of lists from the monoid laws, though? It doesn’t look anything like them. It would be nice if there was some systematic way to construct the corresponding free object given the laws of an algebra. Unfortunately, in normal Agda, this isn’t possible. Consider, for instance, if we added the commutativity law to the algebra: $x \bullet y = y \bullet x$ Not only is it not obvious how we’d write the corresponding free object, it’s actually *not possible* in normal Agda!

This kind of problem comes up a lot: we have a type, and we want it to obey just *one more* equation, but there is no inductive type which does so. Higher Inductive Types solve the problem in quite a straightforward way. So we want lists to satisfy another equation? Well, just add it to the definition!

module OddList where mutual data List {a} (A : Set a) : Set a where [] : List A _∷_ : A → List A → List A comm : ∀ xs ys → xs ++ ys ≡ ys ++ xs postulate _++_ : List A → List A → List ANow, when we write a function that processes lists, Agda will check that the function behaves the same on

`xs ++ ys`

and `ys ++ xs`

. As an example, here’s how you might define the free monoid as a HIT:
data FreeMonoid {a} (A : Set a) : Set a where [_] : A → FreeMonoid A _∙_ : FreeMonoid A → FreeMonoid A → FreeMonoid A ε : FreeMonoid A ∙ε : ∀ x → x ∙ ε ≡ x ε∙ : ∀ x → ε ∙ x ≡ x assoc : ∀ x y z → (x ∙ y) ∙ z ≡ x ∙ (y ∙ z)

It’s quite a satisfying definition, and very easy to see how we got to it from the monoid laws.

Now, when we write functions, we have to prove that those functions themselves also obey the monoid laws. For instance, here’s how we would take the length:module Length where open import ProbabilityModule.Semirings.Nat open Semiring +-*-𝕊 length : FreeMonoid A → ℕ length [ x ] = 1 length (xs ∙ ys) = length xs + length ys length ε = 0 length (∙ε xs i) = +0 (length xs) i length (ε∙ xs i) = 0+ (length xs) i length (assoc xs ys zs i) = +-assoc (length xs) (length ys) (length zs) i

The first three clauses are the actual function: they deal with the three normal constructors of the type. The next three clauses prove that those previous clauses obey the equalities defined on the type.

With the preliminary stuff out of the way, let’s get on to the type I wanted to talk about:

First things first, let’s remember the classic definition of the probability monad:

`newtype Prob a = Prob { runProb :: [(a, Rational)] }`

Definitionally speaking, this doesn’t really represent what we’re talking about. For instance, the following two things express the same distribution, but have different representations:

```
Prob [(True, 1 / 4), (True, 1 / 4), (False, 1 / 2)]
Prob [(True , 1 / 2), (False, 1 / 2)]
```

So it’s the perfect candidate for an extra equality clause like we had above.

Second, in an effort to generalise, we won’t deal specifically with `Rational`

, and instead we’ll use any semiring. After all of that, we get the following definition:

open Semiring rng module Initial where infixr 5 _&_∷_ data 𝒫 (A : Set a) : Set (a ⊔ s) where [] : 𝒫 A _&_∷_ : (p : R) → (x : A) → 𝒫 A → 𝒫 A dup : ∀ p q x xs → p & x ∷ q & x ∷ xs ≡ p + q & x ∷ xs com : ∀ p x q y xs → p & x ∷ q & y ∷ xs ≡ q & y ∷ p & x ∷ xs del : ∀ x xs → 0# & x ∷ xs ≡ xs

The three extra conditions are pretty sensible: the first removes duplicates, the second makes things commutative, and the third removes impossible events.

Let’s get to writing some functions, then:

∫ : (A → R) → 𝒫 A → R ∫ f [] = 0# ∫ f (p & x ∷ xs) = p * f x + ∫ f xs ∫ f (dup p q x xs i) = begin[ i ] p * f x + (q * f x + ∫ f xs) ≡˘⟨ +-assoc (p * f x) (q * f x) (∫ f xs) ⟩ (p * f x + q * f x) + ∫ f xs ≡˘⟨ cong (_+ ∫ f xs) (⟨+⟩* p q (f x)) ⟩ (p + q) * f x + ∫ f xs ∎ ∫ f (com p x q y xs i) = begin[ i ] p * f x + (q * f y + ∫ f xs) ≡˘⟨ +-assoc (p * f x) (q * f y) (∫ f xs) ⟩ p * f x + q * f y + ∫ f xs ≡⟨ cong (_+ ∫ f xs) (+-comm (p * f x) (q * f y)) ⟩ q * f y + p * f x + ∫ f xs ≡⟨ +-assoc (q * f y) (p * f x) (∫ f xs) ⟩ q * f y + (p * f x + ∫ f xs) ∎ ∫ f (del x xs i) = begin[ i ] 0# * f x + ∫ f xs ≡⟨ cong (_+ ∫ f xs) (0* (f x)) ⟩ 0# + ∫ f xs ≡⟨ 0+ (∫ f xs) ⟩ ∫ f xs ∎

This is much more involved than the free monoid function, but the principle is the same: we first write the actual function (on the first three lines), and then we show that the function doesn’t care about the “rewrite rules” we have in the next three clauses.

Before going any further, we will have to amend the definition a little. The problem is that if we tried to prove something about any function on our `𝒫`

type, we’d have to prove equalities *between equalities* as well. I’m sure that this is possible, but it’s very annoying, so I’m going to use a technique I saw in this repository. We add another rule to our type, stating that all equalities on the type are themselves equal. The new definition looks like this:

infixr 5 _&_∷_ data 𝒫 (A : Set a) : Set (a ⊔ s) where [] : 𝒫 A _&_∷_ : (p : R) → (x : A) → 𝒫 A → 𝒫 A dup : ∀ p q x xs → p & x ∷ q & x ∷ xs ≡ p + q & x ∷ xs com : ∀ p x q y xs → p & x ∷ q & y ∷ xs ≡ q & y ∷ p & x ∷ xs del : ∀ x xs → 0# & x ∷ xs ≡ xs trunc : isSet (𝒫 A)

Unfortunately, after adding that case we have to deal with it explicitly in every pattern-match on `𝒫`

. We can get around it by writing an eliminator for the type which deals with it itself. Eliminators are often irritating to work with, though: we give up the nice pattern-matching syntax we get when we program directly. It’s a bit like having to rely on church encoding everywhere.

However, we can get back some pattern-like syntax if we use *copatterns*. Here’s an example of what I mean, for folds on lists:

module ListElim where open NormalList open import ProbabilityModule.Semirings.Nat open Semiring +-*-𝕊 renaming (_+_ to _ℕ+_) record [_↦_] (A : Set a) (B : Set b) : Set (a ⊔ b) where field [_][] : B [_]_∷_ : A → B → B [_]↓ : List A → B [ [] ]↓ = [_][] [ x ∷ xs ]↓ = [_]_∷_ x [ xs ]↓ open [_↦_] sum-alg : [ ℕ ↦ ℕ ] [ sum-alg ][] = 0 [ sum-alg ] x ∷ xs = x ℕ+ xs sum : List ℕ → ℕ sum = [ sum-alg ]↓

For the probability monad, there’s an eliminator for the whole thing, and eliminator for propositional proofs, and a normal eliminator for folding. Their definitions are quite long, but mechanical.

record ⟅_↝_⟆ {a ℓ} (A : Set a) (P : 𝒫 A → Set ℓ) : Set (a ⊔ ℓ ⊔ s) where constructor elim field ⟅_⟆-set : ∀ {xs} → isSet (P xs) ⟅_⟆[] : P [] ⟅_⟆_&_∷_ : ∀ p x xs → P xs → P (p & x ∷ xs) private z = ⟅_⟆[]; f = ⟅_⟆_&_∷_ field ⟅_⟆-dup : (∀ p q x xs pxs → PathP (λ i → P (dup p q x xs i)) (f p x (q & x ∷ xs) (f q x xs pxs)) (f (p + q) x xs pxs)) ⟅_⟆-com : (∀ p x q y xs pxs → PathP (λ i → P (com p x q y xs i)) (f p x (q & y ∷ xs) (f q y xs pxs)) (f q y (p & x ∷ xs) (f p x xs pxs))) ⟅_⟆-del : (∀ x xs pxs → PathP (λ i → P (del x xs i)) (f 0# x xs pxs) pxs) ⟅_⟆⇓ : (xs : 𝒫 A) → P xs ⟅ [] ⟆⇓ = z ⟅ p & x ∷ xs ⟆⇓ = f p x xs ⟅ xs ⟆⇓ ⟅ dup p q x xs i ⟆⇓ = ⟅_⟆-dup p q x xs ⟅ xs ⟆⇓ i ⟅ com p x q y xs i ⟆⇓ = ⟅_⟆-com p x q y xs ⟅ xs ⟆⇓ i ⟅ del x xs i ⟆⇓ = ⟅_⟆-del x xs ⟅ xs ⟆⇓ i ⟅ trunc xs ys p q i j ⟆⇓ = elimSquash₀ (λ xs → ⟅_⟆-set {xs}) (trunc xs ys p q) ⟅ xs ⟆⇓ ⟅ ys ⟆⇓ (cong ⟅_⟆⇓ p) (cong ⟅_⟆⇓ q) i j open ⟅_↝_⟆ public elim-syntax : ∀ {a ℓ} → (A : Set a) → (𝒫 A → Set ℓ) → Set (a ⊔ ℓ ⊔ s) elim-syntax = ⟅_↝_⟆ syntax elim-syntax A (λ xs → Pxs) = [ xs ∈𝒫 A ↝ Pxs ] record ⟦_⇒_⟧ {a ℓ} (A : Set a) (P : 𝒫 A → Set ℓ) : Set (a ⊔ ℓ ⊔ s) where constructor elim-prop field ⟦_⟧-prop : ∀ {xs} → isProp (P xs) ⟦_⟧[] : P [] ⟦_⟧_&_∷_⟨_⟩ : ∀ p x xs → P xs → P (p & x ∷ xs) private z = ⟦_⟧[]; f = ⟦_⟧_&_∷_⟨_⟩ ⟦_⟧⇑ = elim (isProp→isSet ⟦_⟧-prop) z f (λ p q x xs pxs → toPathP (⟦_⟧-prop (transp (λ i → P (dup p q x xs i)) i0 (f p x (q & x ∷ xs) (f q x xs pxs))) (f (p + q) x xs pxs) )) (λ p x q y xs pxs → toPathP (⟦_⟧-prop (transp (λ i → P (com p x q y xs i)) i0 (f p x (q & y ∷ xs) (f q y xs pxs))) (f q y (p & x ∷ xs) (f p x xs pxs)))) λ x xs pxs → toPathP (⟦_⟧-prop (transp (λ i → P (del x xs i)) i0 ((f 0# x xs pxs))) pxs) ⟦_⟧⇓ = ⟅ ⟦_⟧⇑ ⟆⇓ open ⟦_⇒_⟧ public elim-prop-syntax : ∀ {a ℓ} → (A : Set a) → (𝒫 A → Set ℓ) → Set (a ⊔ ℓ ⊔ s) elim-prop-syntax = ⟦_⇒_⟧ syntax elim-prop-syntax A (λ xs → Pxs) = ⟦ xs ∈𝒫 A ⇒ Pxs ⟧ record [_↦_] {a b} (A : Set a) (B : Set b) : Set (a ⊔ b ⊔ s) where constructor rec field [_]-set : isSet B [_]_&_∷_ : R → A → B → B [_][] : B private f = [_]_&_∷_; z = [_][] field [_]-dup : ∀ p q x xs → f p x (f q x xs) ≡ f (p + q) x xs [_]-com : ∀ p x q y xs → f p x (f q y xs) ≡ f q y (f p x xs) [_]-del : ∀ x xs → f 0# x xs ≡ xs [_]⇑ = elim [_]-set z (λ p x _ xs → f p x xs) (λ p q x xs → [_]-dup p q x) (λ p x q y xs → [_]-com p x q y) (λ x xs → [_]-del x) [_]↓ = ⟅ [_]⇑ ⟆⇓ open [_↦_] public

Here’s one in action, to define `map`

:

map : (A → B) → 𝒫 A → 𝒫 B map = λ f → [ map′ f ]↓ module Map where map′ : (A → B) → [ A ↦ 𝒫 B ] [ map′ f ] p & x ∷ xs = p & f x ∷ xs [ map′ f ][] = [] [ map′ f ]-set = trunc [ map′ f ]-dup p q x xs = dup p q (f x) xs [ map′ f ]-com p x q y xs = com p (f x) q (f y) xs [ map′ f ]-del x xs = del (f x) xs

And here’s how we’d define union, and then prove that it’s associative:

infixr 5 _∪_ _∪_ : 𝒫 A → 𝒫 A → 𝒫 A _∪_ = λ xs ys → [ union ys ]↓ xs module Union where union : 𝒫 A → [ A ↦ 𝒫 A ] [ union ys ]-set =