I’ve been writing a lot of Agda recently, and had the occasion to write a Fenwick tree that did some rebalancing. I went with AVL-style rebalancing (rather than red-black or trees of bounded balance). I’d written pretty full implementations of the other two before, and the Agda standard library (Danielsson 2018) has an implementation already that I was able to use as a starting point. Also, apparently, AVL trees seem to perform better than red-black trees in practice (Pfaff 2004).

This post will be similar in style to Stephanie Weirich’s talk (2014), which compares an Agda implementation of verified red-black trees to a Haskell one. When there are two columns of code side-by-side, the left-hand side is Haskell, the right Agda.

The method of constructing the ordering proof is taken from “How to Keep Your Neighbours in Order” (2014) by Conor McBride; the structural proofs are somewhat inspired by the implementation in the Agda standard library, but are mainly my own.

AVL trees are more strictly balanced than red-black trees: the height of neighboring subtrees can differ by at most one. To store the height, we will start as every dependently-typed program does: with Peano numbers.
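A minimal sketch of those Peano numbers in Haskell (promoted to the type level with `DataKinds`; the value-level `toInt` is just for illustration and is not part of the post's code):

```haskell
{-# LANGUAGE DataKinds #-}

-- Peano naturals. With DataKinds these are promoted to the type
-- level, so Z and S n can later index the height of a tree.
data Nat = Z | S Nat

-- A value-level view, just to make the encoding concrete.
toInt :: Nat -> Int
toInt Z     = 0
toInt (S n) = 1 + toInt n
```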


The trees will be balanced one of three possible ways: left-heavy, right-heavy, or even. We can represent these three cases in a GADT in the case of Haskell, or an indexed datatype in the case of Agda:
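The definitions themselves are elided in this rendering. A plausible Haskell sketch of the GADT, assuming only that the constructor names `L`, `O`, and `R` match the ones used by `insertWith` later (everything else here is an assumption, not the post's actual code):

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

import Data.Kind (Type)

data Nat = Z | S Nat

-- Balance l r: evidence that subtree heights l and r differ by at
-- most one, remembering which side (if either) is the taller.
data Balance :: Nat -> Nat -> Type where
  L :: Balance (S n) n  -- left-heavy
  O :: Balance n     n  -- even
  R :: Balance n (S n)  -- right-heavy

-- The balance factor, recovered at the value level.
factor :: Balance l r -> Int
factor L = 1
factor O = 0
factor R = -1
```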

Those unfamiliar with Agda might be a little intimidated by the mixfix operator in the balance definition: we’re using it here because the type can be seen as a proof that:

$max(x,y) = z$

Or, using the $\sqcup$ operator:

$(x \sqcup y) = z$

We’ll use this proof in the tree itself, as we’ll need to know the maximum of the height of a node’s two subtrees to find the height of the node. Before we do that, we’ll need a couple helper functions for manipulating the balance:

Along with the verification of the structure of the tree, we will also want to verify that its contents are ordered correctly. Unfortunately, this property is a little out of reach for Haskell, but it’s 100% doable in Agda. First, we’ll need a way to describe orders on a data type. In Haskell, we might write:

That `Bool` throws away any information gained in the comparison, though: we want to supply a proof with the result of the comparison. First, equality:

This is one of the many ways to describe equality in Agda. It’s a type with only one constructor, and it can only be constructed when its two arguments are the same. When we pattern match on the constructor, then, we’re given a proof that whatever things those arguments refer to must be the same.
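For comparison, the Haskell analogue of this equality type (base ships it as `(:~:)` in `Data.Type.Equality`; it's redefined here only to keep the sketch self-contained):

```haskell
{-# LANGUAGE GADTs, TypeOperators #-}

-- The single constructor Refl can only be built when both indices
-- coincide, so pattern matching on it teaches the checker a ~ b.
data a :~: b where
  Refl :: a :~: a

-- Symmetry, provable by pattern matching: inside the match,
-- a and b are known to be the same type.
sym :: a :~: b -> b :~: a
sym Refl = Refl
```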

Next, we need to describe an order. For this, we’ll need two types: the empty type, and the unit type.

These are kind of like type-level Bools, with one extra, powerful addition: they keep their proof after construction. Because `⊥` has no constructors, if someone tells you they’re going to give you one, you can be pretty sure they’re lying. How do we use this? Well, first, on the numbers:
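The elided Agda definitions translate to Haskell roughly as follows; this is a sketch, with a value-level mirror of the same recursion added purely for illustration:

```haskell
{-# LANGUAGE DataKinds, KindSignatures, TypeFamilies #-}

import Data.Kind (Type)

data Nat = Z | S Nat

-- Stand-ins for Agda's ⊥ (no constructors) and ⊤ (one constructor).
data Empty
data Unit = Unit

-- Lt x y reduces to Unit exactly when x < y, and to the
-- uninhabited Empty otherwise, mirroring the Agda ℕ< definition.
type family Lt (x :: Nat) (y :: Nat) :: Type where
  Lt x     Z     = Empty
  Lt Z     (S y) = Unit
  Lt (S x) (S y) = Lt x y

-- The same recursion at the value level.
ltNat :: Nat -> Nat -> Bool
ltNat _     Z     = False
ltNat Z     (S _) = True
ltNat (S x) (S y) = ltNat x y
```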

Therefore, if we ask for something of type `x ℕ< y` (for some `x` and `y`), we know that it only exists when `x` really is less than `y` (according to the definition above).

For our actual code, we’ll parameterize the whole thing over some abstract key type. We’ll do this using a module (a feature recently added to Haskell, as it happens). That might look something like this:

(the `k` and `r` here, as well as the `Lift`ing noise below, are to do with Agda’s universe system, which I’ll try to explain in a bit)

Now, the trick for the ordering is to keep a proof that two neighboring values are ordered correctly in the tree at each leaf (as there’s a leaf between every pair of nodes, this is exactly the place you *should* store such a proof). A problem arises with the extremal leaves in the tree (leftmost and rightmost): each leaf is missing one neighboring value, so how can it store a proof of order? The solution is to affix two elements to our key type which we define as the greatest and least elements of the set.
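A minimal Haskell sketch of that extension (the names `Bot`, `Val`, and `Top` are my own, not the post's): deriving `Ord` with the constructors in this order gives exactly the intended ordering, with `Bot` below every key and `Top` above.

```haskell
-- Extend the key type with a least and a greatest element.
-- Derived Ord orders by constructor first: Bot < Val _ < Top,
-- and Val x against Val y falls back on the key's own Ord.
data Bound k = Bot | Val k | Top
  deriving (Eq, Ord, Show)
```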

After all that, we can bring Haskell back into the story, and define our tree types:

The two definitions are similar, but have a few obvious differences. The Agda version stores the ordering proof at the leaves, as well as the bounds as indices. Its *universe* is also different: briefly, universes are one of the ways to avoid Russell’s paradox when you’re dealing with dependent types.

In normal, standard Haskell, we think of types as things that describe values (how quaint!). When you’ve got a list, everything in the list has the same type, and that is good and right.

These days, though, we’re not so constrained:

This can quite happily store elements of different types:

And look at that bizarre-looking list on the wrong side of “`::`”! Types aren’t just describing values, they’re acting like values themselves. What type does `[Bool, String, Integer]` even have, anyway? Why, `[Type]`, of course!

So we see that types can be put in lists, and types have types: the natural question then is, what type does `Type` itself have?

And this is where Haskell and Agda diverge: in Haskell, we say `Type :: Type` (as the old extension `TypeInType` implied), and that’s that. From a certain point of view, we’ve opened the door to Russell’s paradox (we’ve allowed a set to be a member of itself). This isn’t an issue in Haskell, though, as the type-level language was already inconsistent.

Agda goes another way, saying that `Set` (Agda’s equivalent of `Type`) has the type `Set₁`, and `Set₁` has the type `Set₂`, and so on^{1}. These different sets are called “universes” and their numbers “levels”. When we write `k ⊔ v ⊔ r`, we’re saying we want to take the greatest universe level from those three possible levels: the level of the key, the value, and the relation, respectively.

AVL trees maintain their invariants through relatively simple rotations. We’ll start with the right rotation, which fixes an imbalance of two on the left. Because the size of the tree returned might change, we’ll need to wrap it in a datatype:

We could actually have the Agda definition be the same as Haskell’s; it doesn’t make much difference. I’m mainly using it here to demonstrate dependent pairs in Agda. The first member of the pair is just a boolean (increased in height/not increased in height). The second member is a tree whose height *depends* on the actual value of the boolean. The `∃` business is just fancy syntax; it also waggles its eyebrows at the way a (dependent) pair of type `(x , y)` means “there exists an x such that y”.
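A rough Haskell counterpart to that wrapper, using the `Stay`/`Incr` constructor names that appear in `insertWith` below; the height-indexed `Tree` here is a deliberately simplified stand-in for the real one (it carries no balance or ordering information):

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

import Data.Kind (Type)

data Nat = Z | S Nat

-- A toy height-indexed tree (the real one also tracks balance).
data Tree (h :: Nat) a where
  Leaf :: Tree Z a
  Node :: a -> Tree h a -> Tree h a -> Tree (S h) a

-- Rotation/insertion either leaves the height alone or grows it by
-- one; the constructor records which, so callers can rebalance.
data Inserted (h :: Nat) a where
  Stay :: Tree h a     -> Inserted h a
  Incr :: Tree (S h) a -> Inserted h a

size :: Tree h a -> Int
size Leaf         = 0
size (Node _ l r) = 1 + size l + size r
```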

Using this, we can write the type for right-rotation:

There are two possible cases, single rotation:

And double:

I won’t bore you with left-rotation: suffice to say, it’s the opposite of right-rotation.

Finally, the main event: insertion. Once the above functions have all been defined, it’s not very difficult, as it happens: by and large, the types guide you to the right answer. Of course, that’s only true after we decided to use the pivotal approach to ordering and this representation of balance.

```
insertWith
    :: Ord k
    => (v -> v -> v)
    -> k
    -> v
    -> Tree h k v
    -> Tree k v ++? h
insertWith _ v vc Leaf =
    Incr (Node v vc O Leaf Leaf)
insertWith f v vc (Node k kc bl tl tr) =
  case compare v k of
    LT ->
      case insertWith f v vc tl of
        Stay tl' ->
          Stay (Node k kc bl tl' tr)
        Incr tl' -> case bl of
          L -> rotr k kc tl' tr
          O -> Incr (Node k kc L tl' tr)
          R -> Stay (Node k kc O tl' tr)
    EQ ->
      Stay (Node v (f vc kc) bl tl tr)
    GT ->
      case insertWith f v vc tr of
        Stay tr' ->
          Stay (Node k kc bl tl tr')
        Incr tr' -> case bl of
          L -> Stay (Node k kc O tl tr')
          O -> Incr (Node k kc R tl tr')
          R -> rotl k kc tl tr'
```

```
insert : ∀ {l u h v}
         {V : Key → Set v}
         (k : Key)
       → V k
       → (V k → V k → V k)
       → Tree V l u h
       → l < k < u
       → Tree V l u 1?+⟨ h ⟩
insert v vc f (leaf l<u) (l , u) =
  1+ (node v vc ▽ (leaf l) (leaf u))
insert v vc f (node k kc bl tl tr) prf
  with compare v k
insert v vc f (node k kc bl tl tr) (l , _)
  | tri< a _ _ with insert v vc f tl (l , a)
... | 0+ tl′ = 0+ (node k kc bl tl′ tr)
... | 1+ tl′ with bl
... | ◿ = rotʳ k kc tl′ tr
... | ▽ = 1+ (node k kc ◿ tl′ tr)
... | ◺ = 0+ (node k kc ▽ tl′ tr)
insert v vc f (node k kc bl tl tr) _
  | tri≈ _ refl _ =
    0+ (node k (f vc kc) bl tl tr)
insert v vc f (node k kc bl tl tr) (_ , u)
  | tri> _ _ c with insert v vc f tr (c , u)
... | 0+ tr′ = 0+ (node k kc bl tl tr′)
... | 1+ tr′ with bl
... | ◿ = 0+ (node k kc ▽ tl tr′)
... | ▽ = 1+ (node k kc ◺ tl tr′)
... | ◺ = rotˡ k kc tl tr′
```

Overall, I’ve been enjoying programming in Agda. The things I liked and didn’t like surprised me:

- Editor support is excellent. I use spacemacs, and the whole thing worked pretty seamlessly. Proof search and auto were maybe not as powerful as Idris’s, although that might be down to lack of experience (note: as I write this, I see you can enable case-splitting in proof search, so it looks like I was right about my lack of experience). In many ways, it was much better than Haskell’s editor support: personally, I have never managed to get case-splitting to work in my Haskell setup, never mind some of the fancier features that you get in Agda.

It’s worth noting that my experience with Idris is similar: maybe it’s something about dependent types?

Of course, I missed lots of extra tools, like linters, code formatters, etc., but the tight integration with the compiler was so useful it more than made up for it.

Also, I’d implore anyone who’s had trouble with emacs before to give spacemacs a go. It works well out-of-the-box, and has a system for keybinding discovery that *actually works*.

- Documentation is pretty good, considering. There are some missing parts (rewriting and telescopes are both stubs on the documentation site), but there seemed to be plenty of fully worked-out examples available online for the different concepts when I needed to figure them out.

Now, the thing about a lot of these complaints/commendations (*especially* with regards to tooling and personal setups) is that people tend to be pretty bad about evaluating how difficult finicky tasks like editor setups are. Once you’ve gotten the hang of some of this stuff, you forget that you ever didn’t. Agda is the second dependently-typed language I’ve really gone for a deepish dive on, and I’ve been using spacemacs for a while, so YMMV.

One area of the language itself that I would have liked to see more on was irrelevance. Looking back at the definition of the tree type, in the Haskell version there’s no singleton storing the height (the balance type stores all the information we need), which means that it definitely doesn’t exist at runtime. As I understand it, that implies that the height index should be irrelevant in the equivalent Agda. However, when I actually mark it as irrelevant, everything works fine except that missing-case warnings start showing up. I couldn’t figure out why: Haskell was able to infer full case coverage without the index, after all. Equality-proof erasure raises similar questions: is it safe? Is it consistent?

All in all, I’d encourage more Haskellers to give Agda a try. It’s fun, interesting, and $\mathcal{Unicode}$!

No “deletion is left as an exercise to the reader” here, no sir! Fuller implementations of both the Haskell and Agda versions of the code here are available: first, a pdf of the Agda code with lovely colours is here. The accompanying repository is here, and the equivalent for the Haskell code is here. Of course, if you would rather read something by someone who knows what they’re talking about, please see the references below.

Danielsson, Nils Anders. 2018. “The Agda standard library.” https://agda.github.io/agda-stdlib/README.html.

McBride, Conor Thomas. 2014. “How to Keep Your Neighbours in Order.” In *Proceedings of the 19th ACM SIGPLAN International Conference on Functional Programming*, 297–309. ICFP ’14. New York, NY, USA: ACM. doi:10.1145/2628136.2628163. https://personal.cis.strath.ac.uk/conor.mcbride/pub/Pivotal.pdf.

Pfaff, Ben. 2004. “Performance Analysis of BSTs in System Software.” In *Proceedings of the Joint International Conference on Measurement and Modeling of Computer Systems*, 410–411. SIGMETRICS ’04/performance ’04. New York, NY, USA: ACM. doi:10.1145/1005686.1005742. https://benpfaff.org/papers/libavl.pdf.

Weirich, Stephanie. 2014. “Depending on Types.” In *Proceedings of the 19th ACM SIGPLAN International Conference on Functional Programming*, 241–241. ICFP ’14. New York, NY, USA: ACM. doi:10.1145/2628136.2631168. https://www.cis.upenn.edu/~sweirich/talks/icfp14.pdf.

My phrasing is maybe a little confusing here. When `Set` “has the type” `Set₁`, it means that `Set` is *in* `Set₁`, not the other way around.↩

Tags: Haskell, Probability

Here are the slides for a short talk I gave to a reading group I’m in at Harvard today. The speaker notes are included in the pdf; the code and the tex are available in the repository.

Tags: Probability, Haskell

Ever since the famous pearl by Erwig and Kollmansberger (2006), probabilistic programming with monads has been an interesting and diverse area in functional programming, with many different approaches.

I’m going to present five here, some of which I have not seen before.

As presented in the paper, a simple and elegant formulation of probability distributions looks like this:

It’s a list of possible events, each tagged with their probability of happening. Here’s the probability distribution representing a die roll, for instance:
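The elided definitions, following Erwig and Kollmansberger, presumably resemble this sketch:

```haskell
import Data.Ratio ((%))

-- A distribution: a list of outcomes paired with probabilities.
newtype Prob a = Prob { runProb :: [(a, Rational)] }

-- A fair six-sided die.
die :: Prob Int
die = Prob [ (n, 1 % 6) | n <- [1 .. 6] ]
```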

The semantics can afford to be a little fuzzy: it doesn’t hugely matter if the probabilities don’t add up to 1 (you can still extract meaningful answers when they don’t). However, I can’t see a way in which either negative probabilities or an empty list would make sense. It would be nice if those states were unrepresentable.

Its monadic structure multiplies conditional events:

```
instance Functor Prob where
  fmap f xs = Prob [ (f x, p) | (x,p) <- runProb xs ]

instance Applicative Prob where
  pure x = Prob [(x,1)]
  fs <*> xs =
    Prob
      [ (f x, fp*xp)
      | (f,fp) <- runProb fs
      , (x,xp) <- runProb xs ]

instance Monad Prob where
  xs >>= f =
    Prob
      [ (y, xp*yp)
      | (x,xp) <- runProb xs
      , (y,yp) <- runProb (f x) ]
```

In most of the examples, we’ll need a few extra functions in order for the types to be useful. First is support:
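The `support` code is elided; for the list representation it is presumably just the list of outcomes (a sketch):

```haskell
newtype Prob a = Prob { runProb :: [(a, Rational)] }

-- support: the outcomes that can occur at all.
support :: Prob a -> [a]
support = map fst . runProb
```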

And second is expectation:

```
expect :: (a -> Rational) -> Prob a -> Rational
expect p xs = sum [ p x * xp | (x,xp) <- runProb xs ]

probOf :: (a -> Bool) -> Prob a -> Rational
probOf p = expect (bool 0 1 . p)
```

It’s useful to be able to construct uniform distributions:

```
uniform xs = Prob [ (x,n) | x <- xs ]
  where
    n = 1 % toEnum (length xs)

die = uniform [1..6]

>>> probOf (7==) $ do
      x <- die
      y <- die
      pure (x+y)
1 % 6
```

As elegant as the above approach is, it leaves something to be desired when it comes to efficiency. In particular, you’ll see a combinatorial explosion at every step. To demonstrate, let’s take the example above, using three-sided dice instead so it doesn’t take up too much space.

The probability table looks like this:

```
2 1/9
3 2/9
4 1/3
5 2/9
6 1/9
```

But the internal representation looks like this:

```
2 1/9
3 1/9
4 1/9
3 1/9
4 1/9
5 1/9
4 1/9
5 1/9
6 1/9
```

States are duplicated, because the implementation has no way of knowing that two outcomes are the same. We could collapse equivalent outcomes if we used a `Map`, but then we can’t implement `Functor`, `Applicative`, or `Monad`. The types:

```
class Functor f where
  fmap :: (a -> b) -> f a -> f b

class Functor f => Applicative f where
  pure  :: a -> f a
  (<*>) :: f (a -> b) -> f a -> f b

class Applicative f => Monad f where
  (>>=) :: f a -> (a -> f b) -> f b
```

don’t allow an `Ord` constraint, which is what we’d need to remove duplicates. We can instead make our own classes which *do* allow constraints:

```
{-# LANGUAGE RebindableSyntax #-}
{-# LANGUAGE TypeFamilies     #-}

import Prelude hiding (Functor (..), Applicative (..), Monad (..))
import Data.Kind

class Functor f where
  type Domain f a :: Constraint
  type Domain f a = ()
  fmap :: Domain f b => (a -> b) -> f a -> f b

class Functor f => Applicative f where
  {-# MINIMAL pure, liftA2 #-}
  pure   :: Domain f a => a -> f a
  liftA2 :: Domain f c => (a -> b -> c) -> f a -> f b -> f c
  (<*>)  :: Domain f b => f (a -> b) -> f a -> f b
  (<*>) = liftA2 ($)

class Applicative f => Monad f where
  (>>=) :: Domain f b => f a -> (a -> f b) -> f b

fail :: String -> a
fail = error

return :: (Applicative f, Domain f a) => a -> f a
return = pure
```

This setup gets over a couple of common annoyances in Haskell, like making `Data.Set` a Monad:

```
instance Functor Set where
  type Domain Set a = Ord a
  fmap = Set.map

instance Applicative Set where
  pure = Set.singleton
  liftA2 f xs ys = do
    x <- xs
    y <- ys
    pure (f x y)

instance Monad Set where
  (>>=) = flip foldMap
```

And, of course, the probability monad:

```
newtype Prob a = Prob
  { runProb :: Map a Rational
  }

instance Functor Prob where
  type Domain Prob a = Ord a
  fmap f = Prob . Map.mapKeysWith (+) f . runProb

instance Applicative Prob where
  pure x = Prob (Map.singleton x 1)
  liftA2 f xs ys = do
    x <- xs
    y <- ys
    pure (f x y)

instance Ord a => Monoid (Prob a) where
  mempty = Prob Map.empty
  mappend (Prob xs) (Prob ys) = Prob (Map.unionWith (+) xs ys)

instance Monad Prob where
  Prob xs >>= f =
    Map.foldMapWithKey ((Prob .) . flip (Map.map . (*)) . runProb . f) xs

support = Map.keys . runProb

expect p = getSum . Map.foldMapWithKey (\k v -> Sum (p k * v)) . runProb

probOf p = expect (bool 0 1 . p)

uniform xs = Prob (Map.fromList [ (x,n) | x <- xs ])
  where
    n = 1 % toEnum (length xs)

ifThenElse True  t _ = t
ifThenElse False _ f = f

die = uniform [1..6]

>>> probOf (7==) $ do
      x <- die
      y <- die
      pure (x + y)
1 % 6
```

Coming up with the right implementation all at once is quite difficult: luckily, there are more general techniques for designing DSLs that break the problem into smaller parts, which also give us some insight into the underlying composition of the probability monad.

The technique relies on an algebraic concept called “free objects”. A free object for some class is a minimal implementation of that class. The classic example is lists: they’re the free monoid. A monoid requires an additive operation, an empty element, and that the additive operation be associative. Lists have all of these things: what makes them *free*, though, is that they have nothing else. For instance, the additive operation on lists (concatenation) isn’t commutative: if it were, they wouldn’t be the free monoid any more, because they’d satisfy an extra law that isn’t part of the monoid interface.

For our case, we can use the free monad: this takes a functor and gives it a monad instance, in a way we know will satisfy all the laws. This encoding is used in several papers (Ścibior, Ghahramani, and Gordon 2015; Larsen 2011).

The idea is to first figure out what primitive operation you need. We’ll use weighted choice:

Then you encode it as a functor:

We’ll say the left-hand-choice has chance $p$, and the right-hand $1-p$. Then, you just wrap it in the free monad:
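The elided pieces look roughly like this sketch; to keep it self-contained I've inlined a minimal `Free` instead of importing the free package the post uses, and written `expect` by direct recursion rather than with `iter`:

```haskell
{-# LANGUAGE DeriveFunctor #-}

import Data.Ratio ((%))

-- A minimal free monad (Control.Monad.Free in the free package).
data Free f a = Pure a | Free (f (Free f a))

instance Functor f => Functor (Free f) where
  fmap f (Pure x)  = Pure (f x)
  fmap f (Free xs) = Free (fmap (fmap f) xs)

instance Functor f => Applicative (Free f) where
  pure = Pure
  Pure f  <*> xs = fmap f xs
  Free fs <*> xs = Free (fmap (<*> xs) fs)

instance Functor f => Monad (Free f) where
  Pure x  >>= f = f x
  Free xs >>= f = Free (fmap (>>= f) xs)

-- The primitive operation: pick the left branch with probability c,
-- the right branch with probability 1 - c.
data Choose a = Choose Rational a a
  deriving Functor

type Prob = Free Choose

-- Fold the tree, weighting each branch by its probability.
expect :: (a -> Rational) -> Prob a -> Rational
expect p (Pure x)               = p x
expect p (Free (Choose c l r))  = expect p l * c + expect p r * (1 - c)
```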

And you already have a monad instance. Support comes from the `Foldable` instance:

Expectation is an “interpreter” for the DSL:

```
expect :: (a -> Rational) -> Prob a -> Rational
expect p = iter f . fmap p
  where
    f (Choose c l r) = l * c + r * (1 - c)
```

For building up the tree, we can use Huffman’s algorithm:

```
fromList :: (a -> Rational) -> [a] -> Prob a
fromList p = go . foldMap (\x -> singleton (p x) (Pure x))
  where
    go xs = case minView xs of
      Nothing -> error "empty list"
      Just ((xp,x),ys) -> case minView ys of
        Nothing -> x
        Just ((yp,y),zs) ->
          go (insertHeap (xp+yp) (Free (Choose (xp/(xp+yp)) x y)) zs)
```

And finally, it gets the same notation as before:

```
uniform = fromList (const 1)

die = uniform [1..6]

probOf p = expect (bool 0 1 . p)

>>> probOf (7==) $ do
      x <- die
      y <- die
      pure (x + y)
1 % 6
```

One of the advantages of the free approach is that it’s easy to define multiple interpreters. We could, for instance, write an interpreter that constructs a diagram:

```
>>> drawTree ((,) <$> uniform "abc" <*> uniform "de")
┌('c','d')
┌1 % 2┤
│ └('c','e')
1 % 3┤
│ ┌('a','d')
│ ┌1 % 2┤
│ │ └('a','e')
└1 % 2┤
│ ┌('b','d')
└1 % 2┤
└('b','e')
```

There’s a lot to be said about free objects in category theory, also. Specifically, they’re related to initial and terminal (also called final) objects. The encoding above is initial; the final encoding is simply `Cont`:
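A sketch of that final encoding: `Cont` with `Rational` as the answer type, its newtype inlined here; the `uniform` helper is my own illustration, not the post's code:

```haskell
import Data.Ratio ((%))

-- A distribution is anything you can take expectations over:
-- the continuation monad with Rational as the answer type.
newtype Prob a = Prob { runProb :: (a -> Rational) -> Rational }

instance Functor Prob where
  fmap f (Prob k) = Prob (\p -> k (p . f))

instance Applicative Prob where
  pure x = Prob ($ x)
  Prob kf <*> Prob kx = Prob (\p -> kf (\f -> kx (p . f)))

instance Monad Prob where
  Prob k >>= f = Prob (\p -> k (\x -> runProb (f x) p))

-- expect really is free here.
expect :: (a -> Rational) -> Prob a -> Rational
expect p (Prob k) = k p

-- Equal weight on each element of a finite list.
uniform :: [a] -> Prob a
uniform xs = Prob (\p -> sum (map p xs) / fromIntegral (length xs))
```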

Here, also, we get the monad instance for free. In contrast to the initial encoding, it’s `expect` that now comes for free:

Support, though, isn’t possible.

This version is also called the Giry monad: there’s a deep and fascinating theory behind it, which I probably won’t be able to do justice to here. Check out Jared Tobin’s post (2017) for a good deep dive on it.

The branching structure of the tree captures the semantics of the probability monad well, but it doesn’t give us much insight into the original implementation. The question is, how can we deconstruct this:

Eric Kidd (2007) pointed out that the monad is the composition of the writer and list monads:
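Spelled out with `WriterT` from transformers (a sketch of Kidd's observation; the `Product` monoid multiplies the probabilities along each computation path):

```haskell
import Control.Monad.Trans.Writer (WriterT (..))
import Data.Monoid (Product (..))
import Data.Ratio ((%))

-- [(a, Rational)] is exactly WriterT (Product Rational) [] a:
-- the list monad supplies the branching, the writer monoid the
-- running product of probabilities.
type Prob = WriterT (Product Rational) []

die :: Prob Int
die = WriterT [ (n, Product (1 % 6)) | n <- [1 .. 6] ]

probOf :: (a -> Bool) -> Prob a -> Rational
probOf p =
  sum . map (getProduct . snd) . filter (p . fst) . runWriterT
```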

but that seems unsatisfying: in contrast to the tree-based version, we don’t encode any branching structure, we’re able to have empty distributions, and it has the combinatorial explosion problem.

Adding a weighting to nondeterminism is encapsulated more concretely by the `ListT` transformer. It looks like this:

It’s a cons-list, with an effect before every layer^{1}.

While this can be used to give us the monad we need, I’ve found that something more like this fits the abstraction better:

It’s a nonempty list, with the first element exposed. Turns out this is very similar to the cofree comonad:

Just like the initial free encoding, we can start with a primitive operation:
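The elided types, reconstructed as a sketch: a "ListT done right" (see footnote 1), the cofree comonad, and the `Perhaps` primitive whose constructor names (`Impossible`, `WithChance`) match the code below; `extractHead` is my own illustration:

```haskell
{-# LANGUAGE DeriveFunctor #-}

-- "ListT done right": an effect in front of every cons cell.
newtype ListT m a = ListT { runListT :: m (Maybe (a, ListT m a)) }

-- A nonempty variant with the first element exposed:
-- the cofree comonad over f.
data Cofree f a = a :< f (Cofree f a)

-- The primitive operation: either the distribution stops here,
-- or it continues with some odds attached.
data Perhaps a
  = Impossible
  | WithChance Rational a
  deriving Functor

-- The head of a Cofree is always available.
extractHead :: Cofree f a -> a
extractHead (x :< _) = x
```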

And we get all of our instances as well:

```
newtype Prob a = Prob
  { runProb :: Cofree Perhaps a
  } deriving (Functor, Foldable)

instance Comonad Prob where
  extract (Prob xs) = extract xs
  duplicate (Prob xs) = Prob (fmap Prob (duplicate xs))

foldProb :: (a -> Rational -> b -> b) -> (a -> b) -> Prob a -> b
foldProb f b = r . runProb
  where
    r (x :< Impossible)      = b x
    r (x :< WithChance p xs) = f x p (r xs)

uniform :: [a] -> Prob a
uniform (x:xs) = Prob (coiterW f (EnvT (length xs) (x :| xs)))
  where
    f (EnvT 0 (_ :| []))     = Impossible
    f (EnvT n (_ :| (y:ys))) =
      WithChance (1 % fromIntegral n) (EnvT (n - 1) (y :| ys))

expect :: (a -> Rational) -> Prob a -> Rational
expect p = foldProb f p
  where
    f x n xs = (p x * n + xs) / (n + 1)

probOf :: (a -> Bool) -> Prob a -> Rational
probOf p = expect (\x -> if p x then 1 else 0)

instance Applicative Prob where
  pure x = Prob (x :< Impossible)
  (<*>) = ap

append :: Prob a -> Rational -> Prob a -> Prob a
append = foldProb f (\x y -> Prob . (x :<) . WithChance y . runProb)
  where
    f e r a p = Prob . (e :<) . WithChance ip . runProb . a op
      where
        ip = p * r / (p + r + 1)
        op = p / (r + 1)

instance Monad Prob where
  xs >>= f = foldProb (append . f) f xs
```

We see here that we’re talking about gambling-style odds, rather than probability. I wonder if the two representations are dual somehow?

The application of comonads to streams (`ListT`) has been explored before (Uustalu and Vene 2005); I wonder if there are any insights to be gleaned from this particular probability comonad.

Erwig, Martin, and Steve Kollmansberger. 2006. “Functional pearls: Probabilistic functional programming in Haskell.” *Journal of Functional Programming* 16 (1): 21–34. doi:10.1017/S0956796805005721.

Kidd, Eric. 2007. “Build your own probability monads.”

Larsen, Ken Friis. 2011. “Memory Efficient Implementation of Probability Monads.”

Ścibior, Adam, Zoubin Ghahramani, and Andrew D. Gordon. 2015. “Practical Probabilistic Programming with Monads.” In *Proceedings of the 2015 ACM SIGPLAN Symposium on Haskell*, 50:165–176. Haskell ’15. New York, NY, USA: ACM. doi:10.1145/2804302.2804317.

Tobin, Jared. 2017. “Implementing the Giry Monad.” *jtobin.io*.

Uustalu, Tarmo, and Varmo Vene. 2005. “The Essence of Dataflow Programming.” In *Proceedings of the Third Asian Conference on Programming Languages and Systems*, 2–18. APLAS’05. Berlin, Heidelberg: Springer-Verlag. doi:10.1007/11575467_2.

Note this is *not* the same as the `ListT` in transformers; instead it’s a “ListT done right”.↩

Part 4 of a 4-part series on Breadth-First Traversals

Tags: Haskell

After the last post, Noah Easterly pointed me to their tree-traversals library, and in particular the `Phases` applicative transformer. It allows you to batch applicative effects to be run together: for the breadth-first traversal, we can batch the effects from each level together, giving us a lovely short solution to the problem.

```
breadthFirst c = runPhasesForwards . go
  where
    go (x :< xs) = liftA2 (:<) (now (c x)) (delay (traverse go xs))
```

In my efforts to speed this implementation up, I came across a wide and interesting literature on scheduling effects, which I’ll go through a little here.

The first thing that jumps to mind, for me, when I think of “scheduling” is coroutines. These are constructs that let you finely control the order of execution of effects. They’re well explored in Haskell by now, and most libraries will let you do something like the following:
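Something like the following, sketched here with a minimal inlined `IterT` (the post uses the one from the free package) and a writer-style monad `((,) [String])` so the order of effects can be observed rather than printed:

```haskell
import Control.Monad (ap)

-- A minimal iterative monad transformer: each step is either a
-- final answer or one more delayed layer, behind an effect in m.
newtype IterT m a = IterT { runIterT :: m (Either a (IterT m a)) }

instance Monad m => Functor (IterT m) where
  fmap f (IterT m) =
    IterT (fmap (either (Left . f) (Right . fmap f)) m)

instance Monad m => Applicative (IterT m) where
  pure  = IterT . pure . Left
  (<*>) = ap

instance Monad m => Monad (IterT m) where
  IterT m >>= f =
    IterT (m >>= either (runIterT . f) (pure . Right . (>>= f)))

liftIter :: Monad m => m a -> IterT m a
liftIter = IterT . fmap Left

-- Postpone a computation by one layer.
delay :: Monad m => IterT m a -> IterT m a
delay = IterT . pure . Right

-- Run all the layers in order.
retract :: Monad m => IterT m a -> m a
retract (IterT m) = m >>= either pure retract

-- The example: "print" 1, then, after a delay, "print" 2.
example :: IterT ((,) [String]) ()
example = liftIter (["1"], ()) >> delay (liftIter (["2"], ()))
```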

We first print `1`, then, after a delay, we print `2`. The `delay` doesn’t make a difference if we just run the whole thing:

But you can see its effect when we use the `interleave` combinator:

Hopefully you can see how useful this might be, and the similarity to the `Phases` construction.

The genealogy of most coroutine libraries in Haskell seems to trace back to Blažević (2011) or Kiselyov (2012): the implementation I have been using in these past few examples (`IterT`) comes from a slightly different place. Let’s take a quick detour to explore it a little.

In functional programming, there are several constructions for modeling error-like states: `Maybe` for your nulls, `Either` for your exceptions. What separates these approaches from the “unsafe” variants (null pointers, unchecked exceptions) is that we can *prove*, in the type system, that the error case is handled correctly.

Conspicuously absent from the usual toolbox for modeling partiality is a way to model *nontermination*. At first glance, it may seem strange to attempt to do so in Haskell. After all, if I have a function of type:

I can prove that I won’t throw any errors (with `Either`, that is), because the type `Int` doesn’t contain `Left _`. I’ve also proved, miraculously, that I won’t make any null dereferences, because `Int` also doesn’t contain `Nothing`. I *haven’t* proved, however, that I won’t loop infinitely, because (in Haskell) `Int` absolutely *does* contain $\bot$.

So we’re somewhat scuppered. On the other hand, while we can’t *prove* termination in Haskell, we can:

- Model it.
- Prove it in something else.

Which is exactly what Venanzio Capretta did in the fascinating (and quite accessible) talk “Partiality is an effect” (Capretta, Altenkirch, and Uustalu 2004)^{1}.

The monad in question looks like this:

We’re writing in Idris for the time being, so that we can prove termination and so on. The “recursive call” to `Iter` is guarded by the `Inf` type: this turns on a different kind of totality checking in the compiler. Usually, Idris will prevent you from constructing infinite values, but that’s exactly what we want to do here. Take the little-known function `until`:

It’s clearly not necessarily total, and the totality checker will complain as such when we try and implement it directly:

But we can use `Iter` to model that possible nontermination:

```
until : (a -> Bool) -> (a -> a) -> a -> Iter a
until p f x = if p x then Now x else Later (until p f (f x))
```

Of course, nothing’s for free: when we get the ability to construct infinite values, we lose the ability to consume them.

We get an error on the `run` function. However, as you would expect, we can run *guarded* iteration: iteration up until some finite point.

```
runUntil : Nat -> Iter a -> Maybe a
runUntil Z _ = Nothing
runUntil (S n) (Now x) = Just x
runUntil (S n) (Later x) = runUntil n x
```

Making our way back to Haskell, we must first—as is the law—add a type parameter, and upgrade our humble monad to a monad transformer:

The semantic meaning of the extra `m` here is interesting: each layer adds not just a recursive step, or a single iteration, but a single effect. Interpreting things in this way gets us back to the original goal:

The `Later` constructor above can be translated to a `delay` function on the transformer:

And using this again, we can write the following incredibly short definition for `unfoldTreeM_BF`:

```
unfoldTreeM_BF :: Monad m => (b -> m (a, [b])) -> b -> m (Tree a)
unfoldTreeM_BF f = retract . go
  where
    go b = do
      (x,xs) <- lift (f b)
      fmap (Node x) (interleave (map (delay . go) xs))
```

It would be nice to bring this back to traversals, but alas, `IterT` is pretty monad-centric. What’s more, if it’s analogous to `Phases`, it certainly doesn’t look like it:

However, in the documentation for `IterT`, there’s the following little note:

Where `FreeT` is the free monad transformer. This seems to strongly hint that we could get the same thing for applicatives with `ApT`. Let’s try it:

The `Applicative` instance is a little hairy, but it *seems* correct:

```
instance Applicative f =>
         Applicative (Phases f) where
  pure = Phases . pure
  liftA2 f' (Phases (ApT xs')) (Phases (ApT ys')) =
      Phases (ApT (liftA2 (go f') xs' ys'))
    where
      go :: ∀ a b c.
            (a -> b -> c)
         -> ApF Identity f a
         -> ApF Identity f b
         -> ApF Identity f c
      go f (Pure x) ys = fmap (f x) ys
      go f xs (Pure y) = fmap (`f` y) xs
      go f (Ap x (ApT xs)) (Ap y (ApT ys)) =
        Ap
          (liftA2 (,) x y)
          (ApT (liftA2 (go (\xx yy -> uncurry f . (xx *** yy))) xs ys))
```

(on a side note: thank *goodness* for `liftA2` finally getting into `Applicative`)

And we get all the normal combinators:

```
delay :: Applicative f => Phases f a -> Phases f a
delay = Phases . ApT . pure . Ap (pure ()) . fmap const . runPhases

lift :: Functor f => f a -> Phases f a
lift = Phases . liftApO
```

The issue comes with running the thing at the end: `Monad` creeps back in.

```
retract :: Monad f => Phases f a -> f a
retract = fmap (runIdentity . retractAp) . joinApT . runPhases
```

Because the effects are all layered on top of each other, you need to flatten them out at the end, which requires `join`. Mind you, it does work: it’s just not as general as it could be.

All’s not lost, though. Turns out, we never needed the transformer in the first place: we could just define the different applicative instance straight off.

```
newtype Phases f a = Phases
  { runPhases :: Ap f a
  } deriving Functor

instance Applicative f =>
         Applicative (Phases f) where
  pure = Phases . Pure
  liftA2 f' (Phases xs') (Phases ys') = Phases (go f' xs' ys')
    where
      go :: ∀ a b c.
            (a -> b -> c)
         -> Ap f a
         -> Ap f b
         -> Ap f c
      go f (Pure x) ys = fmap (f x) ys
      go f xs (Pure y) = fmap (`f` y) xs
      go f (Ap x xs) (Ap y ys) =
        Ap
          (liftA2 (,) x y)
          (go (\xx yy -> uncurry f . (xx *** yy)) xs ys)

delay :: Applicative f => Phases f a -> Phases f a
delay = Phases . Ap (pure ()) . fmap const . runPhases

retract :: Applicative f => Phases f a -> f a
retract = retractAp . runPhases

lift :: f a -> Phases f a
lift = Phases . liftAp
```

In the wonderful article Coroutine Pipelines (Blažević 2011), several different threads on coroutine-like constructions are unified. What I’ve demonstrated above isn’t yet as powerful as what you might see in a full coroutine library: ideally, you’d want generators and sinks. As it turns out, when we look back at the note from `IterT`:

We can get both of those other constructs by swapping out `Identity`^{2}:

(`Sink` is usually called an `Iteratee`.)

This is the fundamental abstraction that underlies things like the pipes library (Gonzalez 2018).

The only missing part from the first coroutine example by now is `interleave`. In the free library, it has the following signature:

But we should be able to spot that, really, it’s a traversal. And, as a traversal, it should rely on some underlying `Applicative` instance. Let’s try and come up with one:

```
newtype Parallel m f a = Parallel
  { runParallel :: FreeT m f a
  }

instance (Functor f, Functor m) =>
         Functor (Parallel m f) where
  fmap f = Parallel . FreeT . fmap go . runFreeT . runParallel
    where
      go = bimap f (FreeT . fmap go . runFreeT)

instance (Applicative f, Applicative m) =>
         Applicative (Parallel m f) where
  pure = Parallel . FreeT . pure . Pure
  Parallel fs' <*> Parallel xs' = Parallel (unw fs' xs')
    where
      unw (FreeT fs) (FreeT xs) = FreeT (liftA2 go fs xs)
      go (Pure f) = bimap f (runParallel . fmap f . Parallel)
      go (Free fs) = Free . \case
        Pure x  -> fmap (runParallel . fmap ($x) . Parallel) fs
        Free xs -> liftA2 unw fs xs
```

Now, interleave is just `sequenceA`!

So we can see that there's a "parallel" applicative for both the free monad and the free applicative. To try and understand this type a little better, we can leverage our intuition about a much simpler, more familiar setting: lists. There's an interesting similarity between lists and the free monad: `FreeT ((,) a)` looks a lot like "`ListT` done right" (so much so, in fact, that most coroutine libraries provide their own version of it). More concretely, list also has a famous "parallel" applicative: `ZipList`!

```
newtype ZipList a = ZipList
  { getZipList :: [a]
  } deriving Functor

instance Applicative ZipList where
  pure = ZipList . repeat
  liftA2 f (ZipList xs) (ZipList ys) = ZipList (zipWith f xs ys)
```
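A quick refresher on the zippy behaviour, using the `ZipList` that already ships in Control.Applicative: application is pointwise, truncating to the shorter argument, and `pure` produces an infinite supply:

```haskell
import Control.Applicative (ZipList (..), liftA2)

-- pointwise pairing, truncated to the shorter argument
zipped :: [(Int, Char)]
zipped = getZipList (liftA2 (,) (ZipList [1, 2, 3]) (ZipList "ab"))
-- zipped == [(1,'a'),(2,'b')]

-- pure repeats its argument indefinitely
constant :: [Int]
constant = take 3 (getZipList (pure 7))
-- constant == [7,7,7]
```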

We’ll use some of our knowledge about `ZipList`

to help us in the next section.

We’ve seen that efforts to model both coroutines and partiality end up in the same neighborhood: there’s yet another way to get there, which seems (at first) almost the opposite of the second. It starts with a blog post from Conor McBride (2009) called “Time flies like an applicative functor”. Curiously, here too breadth-first labeling is the focus. Remember first the lovely circular solution from (**???**):

```
data Tree a = Leaf | Node a (Tree a) (Tree a)

relabel :: Tree x -> [[a]] -> (Tree a, [[a]])
relabel Leaf xss = (Leaf, xss)
relabel (Node _ l r) ((x:xs):xss0) =
  let (l', xss1) = relabel l xss0
      (r', xss2) = relabel r xss1
  in (Node x l' r', xs:xss2)

bflabel :: Tree x -> [a] -> Tree a
bflabel tr xs = u
  where
    (u, xss) = relabel tr (xs:xss)
```
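To convince yourself the knot really ties, here's a small runnable check (definitions repeated so the snippet stands alone; the three-node example tree is mine, not from the post):

```haskell
data Tree a = Leaf | Node a (Tree a) (Tree a)
  deriving (Eq, Show)

relabel :: Tree x -> [[a]] -> (Tree a, [[a]])
relabel Leaf xss = (Leaf, xss)
relabel (Node _ l r) ((x:xs):xss0) =
  let (l', xss1) = relabel l xss0
      (r', xss2) = relabel r xss1
  in (Node x l' r', xs:xss2)

bflabel :: Tree x -> [a] -> Tree a
bflabel tr xs = u
  where
    (u, xss) = relabel tr (xs:xss)

-- label a three-node tree breadth-first: root gets 'a', children 'b' and 'c'
demo :: Tree Char
demo = bflabel (Node () (Node () Leaf Leaf) (Node () Leaf Leaf)) "abc"
```

Laziness is doing all the work: `relabel` consumes the very stream of leftovers it is in the middle of producing.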

As lovely as it is, spare a thought for the poor totality checker: it's hard to imagine how it would even *start* to show that something so lazy and circular would terminate. `IterT` won't help us here, either: it can help us express programs that *might* diverge, not weird-looking ones that definitely won't.

The solution presented is a type (`De`) which has a limited set of combinators: a fixpoint (`fix :: (De x -> x) -> x`), and an applicative instance. As long as all problematic recursive calls are instead expressed using those combinators, the termination checker should be satisfied.

`De` can be thought of as a "delay" wrapper. Values of type `De a` are one step in the future, `De (De a)` are two, and so on. This idea was later expanded upon in Atkey (2011) and Atkey and McBride (2013) to *clock variables*. Instead of types with a delay, types are tagged with how much more time they have (something like "fuel" in the Idris sense, maybe). So a value of type $a^\mathsf{K}$ is tagged with time $\mathsf{K}$, effectively meaning "I have $\mathsf{K}$ productive steps left before I diverge". "Productive steps" will mean something different for every data type: for lists, it could mean that it can produce up until the $\mathsf{K}$th cons-cell. In the paper (Atkey and McBride 2013) this is fleshed out a little more, with fixpoint combinators and so on. As a concrete example, take the type of the cons operator on streams:

It increments the clock on the type, saying that it has one more productive step than it did before. This is kind of the opposite of a “delay”: previously, the scheduling types have meant “this is available $\mathsf{K}$ number of steps in the future” rather than “this is available for another $\mathsf{K}$ steps”. We can still describe delays in this system, though, using the $\rhd^\mathsf{K}$ notation:

$\begin{equation} \text{Cons} : \text{a} \rightarrow \rhd^\mathsf{K}\text{Stream a} \rightarrow \text{Stream a} \end{equation}$

Let's first try to express some of this in the free monad:

```
data K = Z | S K

data Delay :: K -> (Type -> Type) -> (Type -> Type) -> Type -> Type where
  Now   :: a -> Delay n f m a
  Later :: f (DelayT n f m a) -> Delay (S n) f m a

instance (Functor f, Functor m) => Functor (Delay n f m) where
  fmap f (Now x)    = Now (f x)
  fmap f (Later xs) = Later (fmap (fmap f) xs)

newtype DelayT n f m a = DelayT { runDelayT :: m (Delay n f m a) }

instance (Functor f, Functor m) =>
         Functor (DelayT n f m) where
  fmap f = DelayT . fmap (fmap f) . runDelayT
```

We can straight away express one of the combinators from the paper, `force`:

```
force :: Functor m => (∀ k. DelayT k f m a) -> m a
force (DelayT xs) = fmap f xs
  where
    f :: Delay Z f m a -> a
    f (Now x) = x
```

Similar trick to `runST` here: if the type is delayed however long we want it to be, then it mustn't really be delayed at all.

Next, remember that we have types for streams (generators) from the `IterT` monad:

And cons does indeed have the right type:

```
cons :: Applicative m => a -> Stream n a m b -> Stream (S n) a m b
cons x xs = DelayT (pure (Later (x,xs)))
```

We also get an applicative:

```
instance (Applicative f, Applicative m) =>
         Applicative (DelayT n f m) where
  pure = DelayT . pure . Now
  DelayT fs' <*> DelayT xs' = DelayT (liftA2 go fs' xs')
    where
      go :: ∀ k a b. Delay k f m (a -> b) -> Delay k f m a -> Delay k f m b
      go (Now f) = fmap f
      go (Later fs) = Later . \case
        Now x    -> fmap (fmap ($x)) fs
        Later xs -> liftA2 (<*>) fs xs
```

Now, I’m not sure how much this stuff actually corresponds to the paper, but what caught my eye is the statement that `De`

is a classic “applicative-not-monad”: just like `ZipList`

. However, under the analogy that the free monad is listy, and the parallel construction is ziplist-y, what we have in the `DelayT`

is the equivalent of a length-indexed list. These have an applicative instance similar to ziplists: but they also have a monad. Can we apply the same trick here?

There’s a lot of fascinating stuff out there—about clock variables, especially—that I hope to get a chance to learn about once I get a chance. What I’m particularly interested to follow up on includes:

- Comonads and their relationship to these constructions. Streams are naturally expressed as comonads, could they be used as a basis on which to build a similar “delay” mechanism?
- I’d love to explore more efficient implementations like the ones in Spivey (2017).
- I’m interested to see the relationship between these types, power series, and algebras for combinatorial search (Spivey 2009).

Atkey, Robert. 2011. “How to be a Productive Programmer - by putting things off until tomorrow.” Heriot-Watt University. https://bentnib.org/posts/2011-11-14-productive-programmer.html.

Atkey, Robert, and Conor McBride. 2013. “Productive coprogramming with guarded recursion.” In, 197. ACM Press. doi:10.1145/2500365.2500597. https://bentnib.org/productive.html.

Blažević, Mario. 2011. “Coroutine Pipelines.” *The Monad.Reader* 19 (19) (August): 29–50. https://themonadreader.files.wordpress.com/2011/10/issue19.pdf.

Capretta, Venanzio, Thorsten Altenkirch, and Tarmo Uustalu. 2004. “Partiality is an effect.” In *Dependently Typed Programming*, 04381:20. Dagstuhl seminar proceedings. Dagstuhl, Germany: Internationales Begegnungs- und Forschungszentrum für Informatik (IBFI), Schloss Dagstuhl, Germany. https://www.ioc.ee/~tarmo/tday-veskisilla/uustalu-slides.pdf.

Gonzalez, Gabriel. 2018. “Pipes: Compositional pipelines.” http://hackage.haskell.org/package/pipes.

Kiselyov, Oleg. 2012. “Iteratees.” In *Proceedings of the 11th International Conference on Functional and Logic Programming*, 166–181. Lecture notes in computer science. Berlin, Heidelberg: Springer, Berlin, Heidelberg. doi:10.1007/978-3-642-29822-6_15. http://okmij.org/ftp/Haskell/Iteratee/describe.pdf.

McBride, Conor. 2009. “Time flies like an applicative functor.” *Epilogue for Epigram*. https://mazzo.li/epilogue/index.html\%3Fp=186.html.

Spivey, J. Michael. 2009. “Algebras for combinatorial search.” *Journal of Functional Programming* 19 (3-4) (July): 469–487. doi:10.1017/S0956796809007321. https://pdfs.semanticscholar.org/db3e/373bb6e7e7837ebc524da0a25903958554ed.pdf.

Spivey, Michael. 2017. “Faster coroutine pipelines.” *Proceedings of the ACM on Programming Languages* 1 (ICFP) (August): 1–23. doi:10.1145/3110249. http://dl.acm.org/citation.cfm?doid=3136534.3110249.

There is a later, seemingly more formal version of the talk available (**???**), but the one from 2004 was a little easier for me to understand, and had a lot more Haskell code.↩

Small note: `(,) a` and `(->) a` are adjoint. I wonder if there is any implication from this? Certainly, producers and consumers seem adjoint, but there's no instance I can find for it in the adjunctions package.↩

Part 3 of a 4-part series on Breadth-First Traversals

Tags: Haskell

After looking at the algorithms I posted last time, I noticed some patterns emerging which I thought deserved a slightly longer post. I’ll go through the problem (Gibbons 2015) in a little more detail, and present some more algorithms to go along with it.

The original question was posed by Etian Chatav:

> What is the correct way to write breadth first traversal of a `[Tree]`?

The breadth-first traversal here is a traversal in the lensy sense, i.e.:

The `Tree` type we're referring to here is a rose tree; we can take the one defined in `Data.Tree`:

Finally, instead of solving the (somewhat intermediate) problem of traversing a forest, we’ll look directly at traversing the tree itself. In other words, our solution should have the type:

As in Gibbons (2015), let’s first look at just converting the tree to a list in breadth-first order. In other words, given the tree:

```
┌3
┌2┤
│ └4
1┤
│ ┌6
└5┤
└7
```

We want the list `[1,2,5,3,4,6,7]`.

Last time I looked at this problem, the function I arrived at was as follows:

```
breadthFirstEnumerate :: Tree a -> [a]
breadthFirstEnumerate ts = f ts b []
  where
    f (Node x xs) fw bw = x : fw (xs : bw)
    b [] = []
    b qs = foldl (foldr f) b qs []
```
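Checking it against the example tree from above (using `Data.Tree`, and repeating the definition so the snippet stands alone):

```haskell
import Data.Tree (Tree (..))

breadthFirstEnumerate :: Tree a -> [a]
breadthFirstEnumerate ts = f ts b []
  where
    f (Node x xs) fw bw = x : fw (xs : bw)
    b [] = []
    b qs = foldl (foldr f) b qs []

-- the example tree from the diagram above
example :: Tree Int
example =
  Node 1 [ Node 2 [Node 3 [], Node 4 []]
         , Node 5 [Node 6 [], Node 7 []] ]
-- breadthFirstEnumerate example == [1,2,5,3,4,6,7]
```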

It’s admittedly a little difficult to understand, but it’s really not too complex: we’re popping items off the front of a queue, and pushing the subforest onto the end. `fw`

is the recursive call here: that’s where we send the queue with the element pushed on. Even though it may *look* like we’re pushing onto the front (as we’re using a cons), this is really the *end* of the queue, since it’s being consumed in reverse, with `foldl`

.

We can compare it to the technique used in Allison (2006) and Smith (2009), where it’s called *corecursive queues*. Breadth-first enumeration is accomplished as follows in Smith (2009):

```
levelOrder :: Tree a -> [a]
levelOrder tr = map rootLabel qs
  where
    qs = enqs [tr] 0 qs
    enqs []     n xs = deq n xs
    enqs (t:ts) n xs = t : enqs ts (n+1) xs
    deq 0 _      = []
    deq n (x:xs) = enqs (subForest x) (n-1) xs
```
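We can check that the corecursive-queue version agrees on the same example tree (definitions repeated for a standalone snippet; the `Int` annotation on the counter is just to avoid defaulting warnings):

```haskell
import Data.Tree (Tree (..), rootLabel, subForest)

levelOrder :: Tree a -> [a]
levelOrder tr = map rootLabel qs
  where
    qs = enqs [tr] (0 :: Int) qs
    enqs []     n xs = deq n xs
    enqs (t:ts) n xs = t : enqs ts (n+1) xs
    deq 0 _      = []
    deq n (x:xs) = enqs (subForest x) (n-1) xs

example :: Tree Int
example =
  Node 1 [ Node 2 [Node 3 [], Node 4 []]
         , Node 5 [Node 6 [], Node 7 []] ]
-- levelOrder example == [1,2,5,3,4,6,7]
```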

We get to avoid tracking the length of the queue, however.

Before we go the full way to traversal, we can try to add a little structure to our breadth-first enumeration, by delimiting between levels in the tree. We want our function to have the following type:

Looking back at our example tree:

```
┌3
┌2┤
│ └4
1┤
│ ┌6
└5┤
└7
```

We now want the list `[[1],[2,5],[3,4,6,7]]`.

This function is strictly more powerful than `breadthFirstEnumerate`, as we can define one in terms of the other:
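A sketch of that definition (`levels` here is the consed-queue version this post arrives at a little further down, repeated so the snippet stands alone):

```haskell
import Data.Tree (Tree (..))

levels :: Tree a -> [[a]]
levels tr = f tr []
  where
    f (Node x xs) (q:qs) = (x:q) : foldr f qs xs
    f (Node x xs) []     = [x]   : foldr f [] xs

-- flattening the levels recovers the plain enumeration
breadthFirstEnumerate :: Tree a -> [a]
breadthFirstEnumerate = concat . levels

example :: Tree Int
example =
  Node 1 [ Node 2 [Node 3 [], Node 4 []]
         , Node 5 [Node 6 [], Node 7 []] ]
```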

It’s also just a generally useful function, so there are several example implementations available online.

The one provided in `Data.Tree` is as follows:

Pretty nice, but it looks to me like it’s doing a lot of redundant work. We could write it as an unfold:

```
levels t = unfoldr (f . concat) [[t]]
  where
    f [] = Nothing
    f xs = Just (unzip [(y, ys) | Node y ys <- xs])
```

The performance danger here lies in `unzip`: one could potentially optimize that for a speedup.

Another definition, in the style of `breadthFirstEnumerate` above, is as follows:

```
levels ts = f b ts [] []
  where
    f k (Node x xs) ls qs = k (x : ls) (xs : qs)
    b _ [] = []
    b k qs = k : foldl (foldl f) b qs [] []
```

Here, we maintain a stack building up the current level, as well as a queue that we send to the next level. Because we're consing onto the front of the stack, the subforest needs to be traversed in reverse, to build up the output list in the right order. This is why we're using a second `foldl` here, whereas the original had `foldr` on the inner loop.

Looking at the implicit queue version, I noticed that it’s just using a church-encoded pair to reverse the direction of the fold. Instead of doing both reversals, we can use a normal pair, and run it in one direction:

```
levels ts = b (f ts ([],[]))
  where
    f (Node x xs) (ls, qs) = (x:ls, xs:qs)
    b (_, []) = []
    b (k, qs) = k : b (foldr (flip (foldr f)) ([],[]) qs)
```

Secondly, we’re running a fold on the second component of the pair: why not run the fold immediately, rather than building the intermediate list. In fact, we’re running a fold over the *whole* thing, which we can do straight away:

```
levels ts = f ts []
  where
    f (Node x xs) (q:qs) = (x:q) : foldr f qs xs
    f (Node x xs) []     = [x]   : foldr f [] xs
```

After looking at it for a while, I realized it’s similar to an inlined version of the algorithm presented in Gibbons (2015):

```
levels t = [rootLabel t] : foldr (lzw (++)) [] (map levels (subForest t))
  where
    lzw f (x:xs) (y:ys) = f x y : lzw f xs ys
    lzw _ xs     []     = xs
    lzw _ []     ys     = ys
```
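Sanity-checking the zip-with-longest version against the running example (definitions repeated so the snippet runs on its own):

```haskell
import Data.Tree (Tree (..), rootLabel, subForest)

levels :: Tree a -> [[a]]
levels t = [rootLabel t] : foldr (lzw (++)) [] (map levels (subForest t))
  where
    -- "longest zip with": like zipWith, but keeps the tail of the longer list
    lzw f (x:xs) (y:ys) = f x y : lzw f xs ys
    lzw _ xs     []     = xs
    lzw _ []     ys     = ys

example :: Tree Int
example =
  Node 1 [ Node 2 [Node 3 [], Node 4 []]
         , Node 5 [Node 6 [], Node 7 []] ]
```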

Before going any further, all of the functions so far can be redefined to work on the cofree comonad:

When `f` is specialized to `[]`, we get the original rose tree. So far, though, all we actually require is `Foldable`.

From now on, then, we'll use `Cofree` instead of `Tree`.

Finally, we can begin on the traversal itself. We know how to execute the effects in the right order, what’s missing is to build the tree back up in the right order.

First thing we’ll use is a trick with `Traversable`

, where we fill a container from a list. In other words:

With the state monad (or applicative, in this case, I suppose), we can define a “pop” action, which takes an element from the supply:

And then we `traverse`

that action over our container:

When we use fill, it’ll have the following type:

```
breadthFirst :: (Applicative f, Traversable t)
             => (a -> f b) -> Cofree t a -> f (Cofree t b)
breadthFirst = ...
  where
    ...
    fill :: t (Cofree t a) -> State [Cofree t b] (t (Cofree t b))
    fill = traverse (const pop)
```
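The two helpers themselves are tiny; a standalone sketch (using `state` and `evalState` from Control.Monad.State, with a plain list as the supply):

```haskell
import Control.Monad.State (State, evalState, state)

-- pop takes one element from the supply held in the state
pop :: State [a] a
pop = state (\(x:xs) -> (x, xs))

-- fill keeps the shape of the container, replacing every element
-- with the next item from the supply
fill :: Traversable t => t b -> State [a] (t a)
fill = traverse (const pop)

-- e.g. evalState (fill "abc") [1,2,3] keeps the three-element shape
-- of the string but takes its contents from the supply: [1,2,3]
```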

Hopefully that makes sense: we're going to get the subforest from here:

And we're going to fill it with the result of the traversal, which changes the contents from `a`s to `b`s.

One of the nice things about working with applicatives is that they compose, in a variety of different ways. In other words, if I have one effect, `f`, and another, `g`, and I want to run them both on the contents of some list, I can do it in one pass, either by layering the effects, or putting them side-by-side.

In our case, we need to deal with two effects: the one generated by the traversal (the one the caller wants to use), and the internal state we're using to fill up the forests in our tree. We could use `Compose` explicitly, but we can avoid some calls to `pure` if we write the combinators we're going to use directly:

```
map2 :: (Functor f, Functor g)
     => (a -> b -> c) -> f a -> g b -> f (g c)
map2 f x xs =
  fmap (\y -> fmap (f y) xs) x

app2 :: (Applicative f, Applicative g)
     => (a -> b -> c -> d) -> f a -> g b -> f (g c) -> f (g d)
app2 f x xs =
  liftA2 (\y -> liftA2 (f y) xs) x
```

The outer applicative (`f`) will be the user's effect; the inner will be `State`.

First we’ll try convert the zippy-style `levels`

to a traversal. First, convert the function over to the cofree comonad:

```
levels tr = f tr []
  where
    f (x:<xs) (q:qs) = (x:q) : foldr f qs xs
    f (x:<xs) []     = [x]   : foldr f [] xs
```

Next, instead of building up a list of just the root labels, we’ll pair them with the subforests:

```
breadthFirst tr = f tr []
  where
    f (x:<xs) (q:qs) = ((x,xs):q) : foldr f qs xs
    f (x:<xs) []     = [(x,xs)]  : foldr f [] xs
```

Next, we’ll fill the subforests:

```
breadthFirst tr = f tr []
  where
    f (x:<xs) (q:qs) = ((x,fill xs):q) : foldr f qs xs
    f (x:<xs) []     = [(x,fill xs)]  : foldr f [] xs
```

Then, we can run the applicative effect on the root label:

```
breadthFirst c tr = f tr []
  where
    f (x:<xs) (q:qs) = ((c x,fill xs):q) : foldr f qs xs
    f (x:<xs) []     = [(c x,fill xs)]  : foldr f [] xs
```

Now, to combine the effects, we can use the combinators we defined before:

```
breadthFirst c tr = f tr []
  where
    f (x:<xs) (q:qs) =
      app2 (\y ys zs -> (y:<ys) : zs) (c x) (fill xs) q : foldr f qs xs
    f (x:<xs) [] =
      map2 (\y ys -> [y:<ys]) (c x) (fill xs) : foldr f [] xs
```

This builds a list containing all of the level-wise traversals of the tree. To collapse them into one, we can use a fold:

```
breadthFirst :: (Traversable t, Applicative f)
             => (a -> f b)
             -> Cofree t a
             -> f (Cofree t b)
breadthFirst c tr =
    head <$> foldr (liftA2 evalState) (pure []) (f tr [])
  where
    f (x:<xs) (q:qs) =
      app2 (\y ys zs -> (y:<ys):zs) (c x) (fill xs) q : foldr f qs xs
    f (x:<xs) [] =
      map2 (\y ys -> [y:<ys]) (c x) (fill xs) : foldr f [] xs
```

Converting the queue-based implementation is easy once we’ve done it with the zippy one. The result is (to my eye) a little easier to read, also:

```
breadthFirst
    :: (Applicative f, Traversable t)
    => (a -> f b) -> Cofree t a -> f (Cofree t b)
breadthFirst c tr =
    fmap head (f b tr e [])
  where
    f k (x:<xs) ls qs =
      k (app2 (\y ys zs -> (y:<ys):zs) (c x) (fill xs) ls) (xs:qs)
    b _ [] = pure []
    b l qs = liftA2 evalState l (foldl (foldl f) b qs e [])
    e = pure (pure [])
```

There are a couple things to notice here: first, we're not using `map2` anywhere. That's because in the zippy version we were able to notice when the queue was exhausted, so we could just output the singleton effect. Here, instead, we're using `pure (pure [])`: this is potentially a source of inefficiency, as `liftA2 f (pure x) y` is less efficient than `fmap (f x) y` for some applicatives.

On the other hand, we don't build up a list of levels to be combined with `foldr (liftA2 evalState)` at any point: we combine them at every level immediately. You may be able to do the same in the zippy version, but I haven't figured it out yet.

The final point to make here is to do with the very last thing we do in the traversal: `fmap head`. Strictly speaking, any `fmap`s in the code should be unnecessary: we *should* be able to fuse them all with any call to `liftA2`. This transformation is often called the "Yoneda embedding". We can use it here like so:

```
breadthFirst
    :: ∀ t a f b. (Traversable t, Applicative f)
    => (a -> f b) -> Cofree t a -> f (Cofree t b)
breadthFirst c tr = f (b head) tr e []
  where
    f k (x:<xs) ls qs =
      k (app2 (\y ys zs -> (y:<ys) : zs) (c x) (fill xs) ls) (xs : qs)
    b :: ∀ x. ([Cofree t b] -> x)
      -> f (State [Cofree t b] [Cofree t b])
      -> [t (Cofree t a)]
      -> f x
    b k _ [] = pure (k [])
    b k l qs =
      liftA2 (\x -> k . evalState x) l (foldl (foldl f) (b id) qs e [])
    e = pure (pure [])
```

Notice that we need scoped type variables here, since the type of `b` changes depending on when it's called.

Transforming the iterative version is slightly different from the other two:

```
breadthFirst c tr = fmap head (go [tr])
  where
    go [] = pure []
    go xs =
      liftA2
        evalState
        (getCompose (traverse f xs))
        (go (foldr (\(_:<ys) b -> foldr (:) b ys) [] xs))
    f (x:<xs) = Compose (map2 (:<) (c x) (fill xs))
```

We’re using `Compose`

directly here, in contrast to the other two algorithms.

Performance-wise, no one algorithm wins out in every case. For enumeration, the zippy algorithm is the fastest in most cases—except when the tree has a large branching factor; then, the iterative algorithm wins out. For the traversals, the iterative algorithm is usually better—except for monads with more expensive applicative instances.

I’m still not convinced that the zippy traversal is as optimized as it could be, however. If anyone has a better implementation, I’d love to see it!

Using the composability of applicatives, we can fuse several operations over traversables into one pass. Unfortunately, however, this can often introduce a memory overhead that makes the whole operation slower overall. One such example is the iterative algorithm above:

```
breadthFirst c tr = fmap head (go [tr])
  where
    go [] = pure []
    go xs = liftA2 evalState zs (go (ys []))
      where
        Compose (Endo ys, Compose zs) = traverse f xs
        f (x :< xs) =
          Compose
            (Endo (flip (foldr (:)) xs)
            ,Compose (map2 (:<) (c x) (fill xs)))
```

We only traverse the subforest of each node once now, fusing the fill operation with building the list to send to the recursive call. This is expensive (especially memory-wise), though, and traversing the descendants is cheap; the result is that the one-pass version is slower (in my tests).

The cofree comonad allows us to generalize over the type of "descendants"—from lists (in `Tree`) to anything traversable. We could also generalize over the type of the traversal itself: given a way to access the descendants of a node, we should be able to traverse all nodes in a breadth-first order. This kind of thing is usually accomplished by Plated: it's a class that gives you a traversal over the immediate descendants of some recursive type. Adapting the iterative version is relatively simple:

```
breadthFirstOf :: Traversal' a a -> Traversal' a a
breadthFirstOf trav c tr = fmap head (go [tr])
  where
    go [] = pure []
    go xs =
      liftA2
        evalState
        (getCompose (traverse f xs))
        (go (foldr (\ys b -> foldrOf trav (:) b ys) [] xs))
    f xs = Compose (fmap fill (c xs))
    fill = trav (const (state (\(x:xs) -> (x, xs))))
```

We can use this version to get back some of the old functions above:

```
breadthFirstEnumerate :: Traversable f => Cofree f a -> [a]
breadthFirstEnumerate = toListOf (breadthFirstOf plate . _extract)
```

Building a tree breadth-first, monadically, is still an unsolved problem (it looks like: Feuer 2015).

Using some of these we can implement a monadic breadth-first unfold for the cofree comonad:

```
unfoldM :: (Monad m, Traversable t)
        => (b -> m (a, t b))
        -> b
        -> m (Cofree t a)
unfoldM c tr = go head [tr]
  where
    go k [] = pure (k [])
    go k xs = do
      ys <- traverse c xs
      go (k . evalState (traverse f ys)) (toList (Compose (Compose ys)))
    f (x, xs) = fmap (x :<) (fill xs)
```

Allison, Lloyd. 2006. “Circular Programs and Self-Referential Structures.” *Software: Practice and Experience* 19 (2) (October 30): 99–109. doi:10.1002/spe.4380190202. http://users.monash.edu/~lloyd/tildeFP/1989SPE/.

Feuer, David. 2015. “Is a lazy, breadth-first monadic rose tree unfold possible?” Question. *Stack Overflow*. https://stackoverflow.com/q/27748526.

Gibbons, Jeremy. 2015. “Breadth-First Traversal.” *Patterns in Functional Programming*. https://patternsinfp.wordpress.com/2015/03/05/breadth-first-traversal/.

Smith, Leon P. 2009. “Lloyd Allison’s Corecursive Queues: Why Continuations Matter.” *The Monad.Reader*, July 29. https://meldingmonads.files.wordpress.com/2009/06/corecqueues.pdf.

Part 2 of a 4-part series on Breadth-First Traversals

Tags: Haskell

I was looking again at the issue of writing breadth-first traversals for rose trees, and in particular the problem explored in Gibbons (2015). The breadth-first traversal here is a traversal in the lensy sense.

First, let’s look back at getting the levels out of the tree. Here’s the old function I arrived at last time:

```
levels :: Forest a -> [[a]]
levels ts = foldl f b ts [] []
  where
    f k (Node x xs) ls qs = k (x : ls) (xs : qs)
    b _ [] = []
    b k qs = k : foldl (foldl f) b qs [] []
```

After wrangling the definition a little, I got to the following (much cleaner) definition:

```
levels :: Tree a -> [[a]]
levels tr = f tr []
  where
    f (Node x xs) (y:ys) = (x:y) : foldr f ys xs
    f (Node x xs) []     = [x]  : foldr f [] xs
```

Before going any further, all of the functions so far can be redefined to work on the cofree comonad:

When `f` is specialized to `[]`, we get the original rose tree. But what we actually require is much less specific: `levels`, for instance, only needs `Foldable`.

```
levelsCofree :: Foldable f => Cofree f a -> [[a]]
levelsCofree tr = f tr []
  where
    f (x:<xs) (y:ys) = (x:y) : foldr f ys xs
    f (x:<xs) []     = [x]  : foldr f [] xs
```
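With a minimal inline `Cofree` standing in for the one from the free package, we can check this against the same example tree used throughout the series:

```haskell
-- a minimal cofree comonad, standing in for Control.Comonad.Cofree
data Cofree f a = a :< f (Cofree f a)
infixr 5 :<

levelsCofree :: Foldable f => Cofree f a -> [[a]]
levelsCofree tr = f tr []
  where
    f (x:<xs) (y:ys) = (x:y) : foldr f ys xs
    f (x:<xs) []     = [x]  : foldr f [] xs

example :: Cofree [] Int
example =
  1 :< [ 2 :< [3 :< [], 4 :< []]
       , 5 :< [6 :< [], 7 :< []] ]
```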

Using this, we can write the efficient breadth-first traversal:

```
breadthFirst
    :: (Applicative f, Traversable t)
    => (a -> f b) -> Cofree t a -> f (Cofree t b)
breadthFirst c (t:<ts) =
    liftA2 evalState (map2 (:<) (c t) (fill ts)) chld
  where
    chld = foldr (liftA2 evalState) (pure []) (foldr f [] ts)
    fill = traverse (const (state (\(x:xs) -> (x,xs))))
    f (x:<xs) (q:qs)
      = app2 (\y ys zs -> (y:<ys) : zs) (c x) (fill xs) q
      : foldr f qs xs
    f (x:<xs) []
      = map2 (\y ys -> [y:<ys]) (c x) (fill xs)
      : foldr f [] xs
    map2 k x xs = fmap (\y -> fmap (k y) xs) x
    app2 k x xs = liftA2 (\y -> liftA2 (k y) xs) x
```

At every level, the subforest's shape is taken (`fill`), and it's traversed recursively. We can fuse these two steps into one:

```
breadthFirst
    :: (Traversable t, Applicative f)
    => (a -> f b) -> Cofree t a -> f (Cofree t b)
breadthFirst c (t:<ts) =
    liftA2
      evalState
      (map2 (:<) (c t) fill)
      (foldr (liftA2 evalState) (pure []) (chld []))
  where
    Compose (Endo chld, fill) = go ts
    go = traverse (\x -> Compose (Endo (f x), state (\(y:ys) -> (y, ys))))
    f (x:<xs) (q:qs) = app2 (\y ys zs -> (y:<ys) : zs) (c x) r q : rs qs
      where Compose (Endo rs, r) = go xs
    f (x:<xs) [] = map2 (\y ys -> [y:<ys]) (c x) r : rs []
      where Compose (Endo rs, r) = go xs
    map2 k x xs = fmap (\y -> fmap (k y) xs) x
    app2 k x xs = liftA2 (\y -> liftA2 (k y) xs) x
```

The overhead from this approach wipes out any benefit, though.

Gibbons, Jeremy. 2015. "Breadth-First Traversal." *Patterns in Functional Programming*. https://patternsinfp.wordpress.com/2015/03/05/breadth-first-traversal/.

Part 1 of a 1-part series on Sorting

Tags: Haskell, Algorithms

I was working on some performance-intensive stuff recently, and I ran into the issue of sorting very small amounts of values (think 3, 4, 5).

The standard way to do this is with sorting networks. The way I’ll be using doesn’t actually perform any parallelism (unfortunately), but it is a clean way to write the networks in Haskell without too much repetition.

This website will generate an optimal sorting network for your given size, and the output (for 3) looks like this:

```
[[1,2]]
[[0,2]]
[[0,1]]
```

Each pair of indices represents a “compare-and-swap” operation: so the first line means “compare the value at 1 to the value at 2: if it’s bigger, swap them”. For 5, the network looks like this:

```
[[0,1],[2,3]]
[[0,2],[1,3]]
[[1,2],[0,4]]
[[1,4]]
[[2,4]]
[[3,4]]
```

Pairs on the same line can be performed in parallel.

For our case, I’m going to be looking at sorting tuples, but the technique can easily be generalized to vectors, etc.

The first trick is to figure out how to do “swapping”: we don’t want mutation, so what we can do instead is swap the *reference* to some value, by shadowing its name. In other words:

```
swap2 :: (a -> a -> Bool) -> a -> a -> (a, a)
swap2 lte x y
  | lte x y   = (x, y)
  | otherwise = (y, x)

sort3 :: (a -> a -> Bool) -> (a,a,a) -> (a,a,a)
sort3 lte (_0,_1,_2)
  = case swap2 lte _1 _2 of
      (_1, _2) -> case swap2 lte _0 _2 of
        (_0, _2) -> case swap2 lte _0 _1 of
          (_0, _1) -> (_0, _1, _2)
```

The indentation is hard to read, though, and wrapping-and-unwrapping tuples makes me nervous about the performance (although it may be inlined). The next step is to *church-encode* the pairs returned:

```
swap2 :: (a -> a -> Bool) -> a -> a -> (a -> a -> b) -> b
swap2 lte x y k
  | lte x y   = k x y
  | otherwise = k y x

sort3 :: (a -> a -> Bool) -> (a,a,a) -> (a,a,a)
sort3 lte (_0,_1,_2)
  = swap2 lte _1 _2 $ \ _1 _2 ->
    swap2 lte _0 _2 $ \ _0 _2 ->
    swap2 lte _0 _1 $ \ _0 _1 ->
    (_0,_1,_2)
```
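A quick sanity check, running the network over every permutation of three distinct values (definitions repeated so the snippet runs on its own):

```haskell
import Data.List (permutations)

swap2 :: (a -> a -> Bool) -> a -> a -> (a -> a -> b) -> b
swap2 lte x y k
  | lte x y   = k x y
  | otherwise = k y x

sort3 :: (a -> a -> Bool) -> (a,a,a) -> (a,a,a)
sort3 lte (_0,_1,_2)
  = swap2 lte _1 _2 $ \ _1 _2 ->
    swap2 lte _0 _2 $ \ _0 _2 ->
    swap2 lte _0 _1 $ \ _0 _1 ->
    (_0,_1,_2)

-- every input order should sort to (1,2,3)
allSorted :: Bool
allSorted =
  all (\[a,b,c] -> sort3 (<=) (a,b,c) == (1,2,3))
      (permutations [1,2,3 :: Int])
```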

Then, to get this to compile down to efficient code, we can make judicious use of `inline` from GHC.Exts:

```
import GHC.Exts (inline)

swap2 :: (a -> a -> Bool) -> a -> a -> (a -> a -> b) -> b
swap2 lte x y k
  | inline lte x y = inline k x y
  | otherwise      = inline k y x
{-# INLINE swap2 #-}

sort3 :: (a -> a -> Bool) -> (a, a, a) -> (a, a, a)
sort3 lte (_0,_1,_2)
  = swap2 lte _1 _2 $ \ _1 _2 ->
    swap2 lte _0 _2 $ \ _0 _2 ->
    swap2 lte _0 _1 $ \ _0 _1 ->
    (_0,_1,_2)
{-# INLINE sort3 #-}
```

And to see if this really does make efficient code, let’s look at the core (cleaned up):

```
sort3
  = \ (lte :: a -> a -> Bool)
      (ds :: (a, a, a)) ->
    case ds of wild_X8 (_0, _1, _2) ->
      case lte _1 _2 of
        False ->
          case lte _0 _1 of
            False -> (_2, _1, _0)
            True ->
              case lte _0 _2 of
                False -> (_2, _0, _1)
                True  -> (_0, _2, _1)
        True ->
          case lte _0 _2 of
            False ->
              case lte _2 _1 of
                False -> (_1, _2, _0)
                True  -> (_2, _1, _0)
            True ->
              case lte _0 _1 of
                False -> (_1, _0, _2)
                True  -> wild_X8
```

Fantastic! When we specialize to `Int`, we get all of the proper unpacking:

Core (with just the variable names cleaned up this time):

```
sort3Int
  = \ (w :: (Int, Int, Int)) ->
    case w of w_X { (_0, _1, _2) ->
    case _0 of w_0 { GHC.Types.I# _0U ->
    case _1 of w_1 { GHC.Types.I# _1U ->
    case _2 of w_2 { GHC.Types.I# _2U ->
    case GHC.Prim.<=# _1U _2U of {
      __DEFAULT ->
        case GHC.Prim.<=# _0U _1U of {
          __DEFAULT -> (w_2, w_1, w_0);
          1# ->
            case GHC.Prim.<=# _0U _2U of {
              __DEFAULT -> (w_2, w_0, w_1);
              1# -> (w_0, w_2, w_1)
            }
        };
      1# ->
        case GHC.Prim.<=# _0U _2U of {
          __DEFAULT ->
            case GHC.Prim.<=# _2U _1U of {
              __DEFAULT -> (w_1, w_2, w_0);
              1# -> (w_2, w_1, w_0)
            };
          1# ->
            case GHC.Prim.<=# _0U _1U of {
              __DEFAULT -> (w_1, w_0, w_2);
              1# -> w_X
            }
        }
    }
    }
    }
    }
    }
```

Now, for the real test: sorting 5-tuples, using the network above.

```
sort5 :: (a -> a -> Bool) -> (a,a,a,a,a) -> (a,a,a,a,a)
sort5 lte (_0,_1,_2,_3,_4)
  = swap2 lte _0 _1 $ \ _0 _1 ->
    swap2 lte _2 _3 $ \ _2 _3 ->
    swap2 lte _0 _2 $ \ _0 _2 ->
    swap2 lte _1 _3 $ \ _1 _3 ->
    swap2 lte _1 _2 $ \ _1 _2 ->
    swap2 lte _0 _4 $ \ _0 _4 ->
    swap2 lte _1 _4 $ \ _1 _4 ->
    swap2 lte _2 _4 $ \ _2 _4 ->
    swap2 lte _3 _4 $ \ _3 _4 ->
    (_0,_1,_2,_3,_4)
{-# INLINE sort5 #-}
```
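Since a sorting network is only correct if it sorts *every* input order, it's worth exhaustively checking all 120 permutations (definitions repeated, without the `inline` calls, so the snippet stands alone):

```haskell
import Data.List (permutations)

swap2 :: (a -> a -> Bool) -> a -> a -> (a -> a -> b) -> b
swap2 lte x y k
  | lte x y   = k x y
  | otherwise = k y x

sort5 :: (a -> a -> Bool) -> (a,a,a,a,a) -> (a,a,a,a,a)
sort5 lte (_0,_1,_2,_3,_4)
  = swap2 lte _0 _1 $ \ _0 _1 ->
    swap2 lte _2 _3 $ \ _2 _3 ->
    swap2 lte _0 _2 $ \ _0 _2 ->
    swap2 lte _1 _3 $ \ _1 _3 ->
    swap2 lte _1 _2 $ \ _1 _2 ->
    swap2 lte _0 _4 $ \ _0 _4 ->
    swap2 lte _1 _4 $ \ _1 _4 ->
    swap2 lte _2 _4 $ \ _2 _4 ->
    swap2 lte _3 _4 $ \ _3 _4 ->
    (_0,_1,_2,_3,_4)

-- all 5! = 120 input orders must sort to (1,2,3,4,5)
networkSorts :: Bool
networkSorts =
  all (\[a,b,c,d,e] -> sort5 (<=) (a,b,c,d,e) == (1,2,3,4,5))
      (permutations [1,2,3,4,5 :: Int])
```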

The core output from this is over 1000 lines long: you can see it (with the variable names cleaned up) here.

In my benchmarks, these functions are indeed quicker than their counterparts in vector, but I’m not confident in my knowledge of Haskell performance to make much of a strong statement about them.

Tags: Haskell, Dependent Types

The code from this post is available as a gist.

One of the most basic tools for use in type-level programming is the Peano definition of the natural numbers:

Using the new `TypeFamilyDependencies` extension, these numbers can be used to describe the "size" of some type. I'm going to use the proportion symbol here:

Using this type family we can describe induction on the natural numbers:

```
class Finite n where
  induction ∷ t ∝ Z → (∀ k. t ∝ k → t ∝ S k) → t ∝ n

instance Finite Z where
  induction z _ = z
  {-# inline induction #-}

instance Finite n ⇒ Finite (S n) where
  induction z s = s (induction z s)
  {-# inline induction #-}
```

The `induction` function reads as the standard mathematical definition of induction: given a proof (value) for the zero case, and a way to turn a proof for any number into a proof for its successor, we can give you a proof for any finite number.

An added bonus here is that the size of something can usually be resolved at compile-time, so any inductive function on it should also be resolved at compile time.
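To make this concrete, here is a runnable variant of the idea. Rather than the post's injective `∝` family, this sketch ties the index to the proof type with an argument-flipping newtype (`LA` and `rep` are my own names, not from the post); that's enough to define the n-ary analogue of `pure` by induction:

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures, RankNTypes, ScopedTypeVariables #-}

data Nat = Z | S Nat

-- induction over a type-level natural, abstracted over the "proof" type t
class Finite (n :: Nat) where
  induction :: t 'Z -> (forall k. t k -> t ('S k)) -> t n

instance Finite 'Z where
  induction z _ = z

instance Finite n => Finite ('S n) where
  induction z s = s (induction z s)

data List (n :: Nat) a where
  Nil  :: List 'Z a
  (:-) :: a -> List n a -> List ('S n) a
infixr 5 :-

-- flip List's arguments so induction can abstract over the index
newtype LA a n = LA { unLA :: List n a }

-- the length-indexed analogue of pure: fill the whole list with one value
rep :: forall n a. Finite n => a -> List n a
rep x = unLA (induction (LA Nil) (LA . (x :-) . unLA))

toList' :: List n a -> [a]
toList' Nil       = []
toList' (x :- xs) = x : toList' xs
```

The index is fixed entirely by the type, so `rep 'x' :: List ('S ('S ('S 'Z))) Char` produces exactly three elements with no runtime length argument.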

We can use it to provide the standard instances for basic length-indexed lists:

Some instances for those lists are easy:
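A sketch of the easy ones (assuming a length-indexed `List` GADT like the following; the instances are size-polymorphic, so no induction is needed):

```
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

data Nat = Z | S Nat

data List (n :: Nat) a where
  Nil  :: List Z a
  (:-) :: a -> List n a -> List (S n) a
infixr 5 :-

-- These recurse on the list itself, not on its size index.
instance Functor (List n) where
  fmap _ Nil       = Nil
  fmap f (x :- xs) = f x :- fmap f xs

instance Foldable (List n) where
  foldr _ b Nil       = b
  foldr f b (x :- xs) = f x (foldr f b xs)
```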

However, for `Applicative`, we need some way to recurse on the size of the list. This is where induction comes in.

This lets us write `pure` in a pleasingly simple way:

But can we also write `<*>` using induction? Yes! Because we’ve factored out the induction itself, we just need to describe the notion of a “sized” function:
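A simplified sketch of the idea (names mine): instead of going through the `∝` family, we can take the induction motive as an explicit type constructor `f ∷ Nat → Type`, and wrap the “sized function” in a newtype so it fits that shape.

```
{-# LANGUAGE DataKinds, GADTs, KindSignatures, RankNTypes #-}

data Nat = Z | S Nat

data List (n :: Nat) a where
  Nil  :: List Z a
  (:-) :: a -> List n a -> List (S n) a
infixr 5 :-

class Finite (n :: Nat) where
  induction :: f Z -> (forall k. f k -> f (S k)) -> f n

instance Finite Z where
  induction z _ = z
instance Finite n => Finite (S n) where
  induction z s = s (induction z s)

-- The "sized function": zipping application at every length.
newtype Ap a b n = Ap { runAp :: List n (a -> b) -> List n a -> List n b }

ap' :: Finite n => List n (a -> b) -> List n a -> List n b
ap' =
  runAp
    (induction
      (Ap (\Nil Nil -> Nil))
      (\(Ap k) -> Ap (\(f :- fs) (x :- xs) -> f x :- k fs xs)))

toList :: List n a -> [a]
toList Nil = []
toList (x :- xs) = x : toList xs
```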

Then we can write `<*>` like so:

```
instance Finite n ⇒
         Applicative (List n) where
  pure x = induction Nil (x :-)
  (<*>) =
    induction
      (\Nil Nil → Nil)
      (\k (f :- fs) (x :- xs) → f x :- k fs xs)
```

What about the `Monad` instance? For that, we need a little bit of plumbing: the type signature of `>>=` is:
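The standard signature, restated as a standalone function (`bindSig` is just a local name for it):

```
-- Note the `a`: it only ever appears as a function argument, so it
-- carries no size of its own.
bindSig :: Monad m => m a -> (a -> m b) -> m b
bindSig = (>>=)
```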

One of the parameters (the second `a`) doesn’t have a size: we’ll need to work around that, with `Const`:
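A sketch of the wrapper (assumed from the code below, which uses `Const` and `getConst`): it holds an unindexed value while pretending to have a size, so the bound variable can take part in the induction.

```
{-# LANGUAGE DataKinds, KindSignatures #-}

data Nat = Z | S Nat

-- The phantom index n is ignored entirely.
newtype Const a (n :: Nat) = Const { getConst :: a }
```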

Using this, we can write our `Monad` instance:

```
head' ∷ List (S n) a → a
head' (x :- _) = x

tail' ∷ List (S n) a → List n a
tail' (_ :- xs) = xs

instance Finite n ⇒
         Monad (List n) where
  xs >>= (f ∷ a → List n b) =
    induction
      (\Nil _ → Nil)
      (\k (y :- ys) fn → head' (fn (Const y)) :-
          k ys (tail' . fn . Const . getConst))
      xs
      (f . getConst ∷ Const a n → List n b)
```

Getting the above to work actually took a surprising amount of work: the crux is that the `∝` type family needs to be injective, so the “successor” proof can typecheck. Unfortunately, this means that every type can only have one notion of “size”. What I’d prefer is to be able to pass in a function indicating exactly *how* to get the size out of a type, one that could change depending on the situation. So we could recurse on the first argument of a function, for instance, or just its second, or just the result. This would need either type-level lambdas (which would be cool), or generalized type family dependencies.

Tags: Haskell, Pattern Synonyms

Pattern Synonyms is an excellent extension for Haskell. There are some very cool examples of their use out there, and I thought I’d add to the list.

Lists are *the* fundamental data structure for functional programmers. Unfortunately, once more specialized structures are required, you often have to switch over to an uncomfortable, annoying API which isn’t as pleasant or fun to use as cons and nil. With pattern synonyms, though, that’s not so! For instance, here’s how you would do it with a run-length-encoded list:

```
data List a
  = Nil
  | ConsN {-# UNPACK #-} !Int a (List a)

cons :: Eq a => a -> List a -> List a
cons x (ConsN i y ys)
  | x == y = ConsN (i+1) y ys
cons x xs = ConsN 1 x xs

uncons :: List a -> Maybe (a, List a)
uncons Nil = Nothing
uncons (ConsN 1 x xs) = Just (x, xs)
uncons (ConsN n x xs) = Just (x, ConsN (n-1) x xs)

infixr 5 :-
pattern (:-) :: Eq a => a -> List a -> List a
pattern x :- xs <- (uncons -> Just (x, xs))
  where
    x :- xs = cons x xs
{-# COMPLETE Nil, (:-) #-}

zip :: List a -> List b -> List (a,b)
zip (x :- xs) (y :- ys) = (x,y) :- zip xs ys
zip _ _ = Nil
```

A little more useful would be to do the same with a heap:

```
data Tree a
  = Leaf
  | Node a (Tree a) (Tree a)

smerge :: Ord a => Tree a -> Tree a -> Tree a
smerge Leaf ys = ys
smerge xs Leaf = xs
smerge h1@(Node x lx rx) h2@(Node y ly ry)
  | x <= y = Node x (smerge h2 rx) lx
  | otherwise = Node y (smerge h1 ry) ly

cons :: Ord a => a -> Tree a -> Tree a
cons x = smerge (Node x Leaf Leaf)

uncons :: Ord a => Tree a -> Maybe (a, Tree a)
uncons Leaf = Nothing
uncons (Node x l r) = Just (x, smerge l r)

infixr 5 :-
pattern (:-) :: Ord a => a -> Tree a -> Tree a
pattern x :- xs <- (uncons -> Just (x, xs))
  where
    x :- xs = cons x xs
{-# COMPLETE Leaf, (:-) #-}

sort :: Ord a => [a] -> [a]
sort = go . foldr (:-) Leaf
  where
    go Leaf = []
    go (x :- xs) = x : go xs
```

In fact, this pattern can be generalized, so *any* container-like-thing with a cons-like-thing can be modified as you would with lists. You can see the generalization in lens.

One of the most confusing things I remember about learning Haskell early-on was that the vast majority of the Monads examples didn’t work, because they were written pre-transformers. In other words, the state monad was defined like so:
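Namely, something like this (the old, pre-transformers definition):

```
-- State as its own newtype, with the runner stored directly.
newtype State s a = State { runState :: s -> (a, s) }
```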

But in transformers nowadays (which is where you get `State` from if you import it in the normal way), the definition is:

This results in some *very* confusing error messages when you try to run example code.

However, we can pretend that the change never happened, with a simple pattern synonym:

```
newtype StateT s m a = StateT { runStateT :: s -> m (a, s) }

type State s = StateT s Identity

runState :: State s a -> s -> (a, s)
runState xs = runIdentity . runStateT xs

pattern State :: (s -> (a, s)) -> State s a
pattern State x <- (runState -> x)
  where
    State x = StateT (Identity . x)
```

If you want to write type-level proofs on numbers, you’ll probably end up using Peano numerals and singletons:

```
data Nat = Z | S Nat

data Natty n where
  Zy :: Natty Z
  Sy :: Natty n -> Natty (S n)

type family (+) (n :: Nat) (m :: Nat) :: Nat where
  Z + m = m
  S n + m = S (n + m)

plusZeroIsZero :: Natty n -> n + Z :~: n
plusZeroIsZero Zy = Refl
plusZeroIsZero (Sy n) = case plusZeroIsZero n of
  Refl -> Refl
```

Pretty cool, right? We can even erase the proof (if we really trust it) using rewrite rules:
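One way the erasure might look (an assumption, not the post’s exact rule): a rewrite rule that replaces the linear-time proof with a coerced `Refl`, once we trust it.

```
{-# LANGUAGE DataKinds, GADTs, TypeFamilies, TypeOperators #-}

import Data.Type.Equality ((:~:)(Refl))
import Unsafe.Coerce (unsafeCoerce)

data Nat = Z | S Nat

data Natty n where
  Zy :: Natty Z
  Sy :: Natty n -> Natty (S n)

type family (+) (n :: Nat) (m :: Nat) :: Nat where
  Z + m = m
  S n + m = S (n + m)

plusZeroIsZero :: Natty n -> (n + Z) :~: n
plusZeroIsZero Zy = Refl
plusZeroIsZero (Sy n) = case plusZeroIsZero n of
  Refl -> Refl

-- The sketch: at -O, skip the traversal and assert the equality directly.
{-# RULES
"plusZeroIsZero/erase" forall n. plusZeroIsZero n = unsafeCoerce Refl
  #-}
```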

This isn’t *ideal*, but it’s getting there.

However, if we ever want to use these things at runtime (perhaps as a type-level indication of some data structure’s size), we’re going to rely on the value-level Peano addition, which is bad news.

Not so with pattern synonyms!

```
data family The k :: k -> Type

class Sing (a :: k) where sing :: The k (a :: k)

data Nat = Z | S Nat

newtype instance The Nat n = NatSing Natural

instance Sing Z where
  sing = NatSing 0

instance Sing n => Sing (S n) where
  sing =
    (coerce :: (Natural -> Natural) -> (The Nat n -> The Nat (S n)))
      succ sing

data Natty n where
  ZZy :: Natty Z
  SSy :: The Nat n -> Natty (S n)

getNatty :: The Nat n -> Natty n
getNatty (NatSing n :: The Nat n) = case n of
  0 -> gcastWith (unsafeCoerce Refl :: n :~: Z) ZZy
  _ -> gcastWith (unsafeCoerce Refl :: n :~: S m) (SSy (NatSing (pred n)))

pattern Zy :: () => (n ~ Z) => The Nat n
pattern Zy <- (getNatty -> ZZy) where Zy = NatSing 0

pattern Sy :: () => (n ~ S m) => The Nat m -> The Nat n
pattern Sy x <- (getNatty -> SSy x) where Sy (NatSing x) = NatSing (succ x)
{-# COMPLETE Zy, Sy #-}

type family (+) (n :: Nat) (m :: Nat) :: Nat where
  Z + m = m
  S n + m = S (n + m)

-- | Efficient addition, with type-level proof.
add :: The Nat n -> The Nat m -> The Nat (n + m)
add = (coerce :: (Natural -> Natural -> Natural)
              -> The Nat n -> The Nat m -> The Nat (n + m)) (+)

-- | Proof on efficient representation.
addZeroRight :: The Nat n -> n + Z :~: n
addZeroRight Zy = Refl
addZeroRight (Sy n) = gcastWith (addZeroRight n) Refl
```

(unfortunately, incomplete pattern warnings don’t work here)

So you’ve got a tree type:
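Judging by the function below (which matches on `Tip` and `Bin x ls rs`), something like:

```
data Tree a
  = Tip
  | Bin a (Tree a) (Tree a)
```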

And you’ve spent some time writing a (reasonably difficult) function on the tree:

```
showTree :: Show a => Tree a -> String
showTree Tip = ""
showTree (Bin x' ls' rs') = go True id xlen' ls'
  $ showString xshw'
  $ endc ls' rs'
  $ showChar '\n'
  $ go False id xlen' rs' ""
  where
    xshw' = show x'
    xlen' = length xshw'
    go _ _ _ Tip = id
    go up k i (Bin x ls rs) = branch True ls
        . k
        . pad i
        . showChar (bool '└' '┌' up)
        . showString xshw
        . endc ls rs
        . showChar '\n'
        . branch False rs
      where
        xshw = show x
        xlen = length xshw
        branch d
          | d == up = go d (k . pad i) (xlen + 1)
          | otherwise = go d (k . pad i . showChar '│') xlen
    endc Tip Tip = id
    endc Bin {} Tip = showChar '┘'
    endc Tip Bin {} = showChar '┐'
    endc Bin {} Bin {} = showChar '┤'
    pad = (++) . flip replicate ' '
```

But, for some reason or another, you need to add a field to your `Bin` constructor, to store the size of the subtree (for instance). Does this function have to change? No! Simply change the tree definition like so:

```
data Tree a
  = Tip
  | Bin' Int a (Tree a) (Tree a)

pattern Bin x ls rs <- Bin' _ x ls rs
{-# COMPLETE Tip, Bin #-}
```

And all the old code works!

This gets to the core of pattern synonyms: it’s another tool which we can use to separate implementation from API.

Say you’ve got a data type that has certain constraints on what values it can hold. You’re not writing a paper for ICFP, so expressing those constraints as a beautiful type isn’t required: you just want to only export the constructor and accessors, and write some tests to make sure that those functions always obey the constraints.

But once you do this you’ve lost something: pattern-matching. Let’s get it back with pattern synonyms!

As our simple example, our constraint is going to be “A list where the values are always ordered”:
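A sketch of the idea (the post’s own code was cut off here): the newtype’s real constructor stays private, and the only way to build a value is through the synonym, whose builder re-establishes the ordering invariant on the way in.

```
{-# LANGUAGE PatternSynonyms, ViewPatterns #-}

import Data.List (insert)

-- In a real module, only Sorted (abstract), empty, and the synonym
-- would be exported; the Sorted constructor stays hidden.
newtype Sorted a = Sorted [a] deriving Show

empty :: Sorted a
empty = Sorted []

uncons :: Sorted a -> Maybe (a, Sorted a)
uncons (Sorted [])     = Nothing
uncons (Sorted (x:xs)) = Just (x, Sorted xs)

insertS :: Ord a => a -> Sorted a -> Sorted a
insertS x (Sorted xs) = Sorted (insert x xs)

infixr 5 :-
pattern (:-) :: Ord a => a -> Sorted a -> Sorted a
pattern x :- xs <- (uncons -> Just (x, xs))
  where
    x :- xs = insertS x xs
```

Matching with `(:-)` always yields the smallest element first, no matter what order the list was built in.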

Tags: Haskell, Algorithms

There’s a popular UK TV show called Countdown with a round where contestants have to get as close to some target number as possible by constructing an arithmetic expression from six random numbers.

You don’t have to use all of the numbers, and you’re allowed use four operations: addition, subtraction, multiplication, and division. Additionally, each stage of the calculation must result in a positive integer.

Here’s an example. Try get to the target 586:

$100,25,1,5,3,10$

On the show, contestants get 30 seconds to think of an answer.

Solving it in Haskell was first explored in depth in Hutton (2002). There, a basic “generate-and-test” implementation was provided and proven correct.

As an optimization problem, there are several factors which will influence the choice of algorithm:

- There’s no obvious heuristic for constructing subexpressions in order to get to a final result. In other words, if we have $25 * 3 + 10$ and $25 * 3 * 10$, there’s no easy way to tell which is “closer” to $586$. The latter is closer numerically, but the former is what we ended up using in the solution.
- Because certain subexpressions aren’t allowed, we’ll be able to prune the search space as we go.
- Ideally, we’d only want to calculate each possible subexpression once, making it a pretty standard dynamic programming problem.

I’ll be focusing on the third point in this post, but we can add the second point in at the end. First, however, let’s write a naive implementation.

I can’t think of a simpler way to solve the problem than generate-and-test, so we’ll work from there. Testing is easy (`(target ==) . eval`), so we’ll focus on generation. The core function we’ll use for this is usually called “unmerges”:

```
unmerges [x,y] = [([x],[y])]
unmerges (x:xs) =
  ([x],xs) :
  concat
    [ [(x:ys,zs),(ys,x:zs)]
    | (ys,zs) <- unmerges xs ]
unmerges _ = []
```

It generates all possible 2-partitions of a list, ignoring order: `unmerges "abc"`, for instance, gives `[("a","bc"),("ab","c"),("b","ac")]`.

I haven’t looked much into how to optimize this function or make it nicer, as we’ll be swapping it out later.

Next, we need to make the recursive calls:

```
allExprs :: (a -> a -> [a]) -> [a] -> [a]
allExprs _ [x] = [x]
allExprs c xs =
  [ e
  | (ys,zs) <- unmerges xs
  , y <- allExprs c ys
  , z <- allExprs c zs
  , e <- c y z ]
```

Finally, using the simple-reflect library, we can take a look at the output:

```
>>> allExprs (\x y -> [x+y,x*y]) [1,2] :: [Expr]
[1 + 2,1 * 2]
>>> allExprs (\x y -> [x+y]) [1,2,3] :: [Expr]
[1 + (2 + 3),1 + 2 + 3,2 + (1 + 3)]
```

Even at this early stage, we can actually already write a rudimentary solution:

```
countdown :: [Integer] -> Integer -> [Expr]
countdown xs targ =
  filter
    ((==) targ . toInteger)
    (allExprs
      (\x y -> [x,y,x+y,x*y])
      (map fromInteger xs))
>>> mapM_ print (countdown [100,25,1,5,3,10] 586)
1 + (100 * 5 + (25 * 3 + 10))
1 + (100 * 5 + 25 * 3 + 10)
1 + (25 * 3 + (100 * 5 + 10))
1 + 100 * 5 + (25 * 3 + 10)
100 * 5 + (1 + (25 * 3 + 10))
100 * 5 + (1 + 25 * 3 + 10)
100 * 5 + (25 * 3 + (1 + 10))
1 + (100 * 5 + 25 * 3) + 10
1 + 100 * 5 + 25 * 3 + 10
100 * 5 + (1 + 25 * 3) + 10
100 * 5 + 25 * 3 + (1 + 10)
1 + 25 * 3 + (100 * 5 + 10)
25 * 3 + (1 + (100 * 5 + 10))
25 * 3 + (1 + 100 * 5 + 10)
25 * 3 + (100 * 5 + (1 + 10))
```

As you can see from the output, there’s a lot of repetition. We’ll need to do some memoization to speed it up.

The normal way most programmers think about “memoization” is something like this:

```
memo_dict = {0: 0, 1: 1}

def fib(n):
    if n in memo_dict:
        return memo_dict[n]
    else:
        res = fib(n-1) + fib(n-2)
        memo_dict[n] = res
        return res
```

In other words, it’s a fundamentally stateful process. We need to mutate some mapping when we haven’t seen the argument before.

Using laziness, though, we can emulate the same behavior purely. Instead of mutating the mapping on function calls, we fill the whole thing at the beginning, and then index into it. As long as the mapping is lazy, it’ll only evaluate the function calls when they’re needed. We could use lists as our mapping to the natural numbers:
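A sketch of that list-based version: the lazy list stands in for the mutable dictionary, and indexing replaces the lookup (`memoFib` is a name of my own choosing).

```
-- Every call to fib indexes into the one shared, lazily-filled list.
fibs :: [Integer]
fibs = map fib [0 ..]
  where
    fib 0 = 0
    fib 1 = 1
    fib n = fibs !! (n - 2) + fibs !! (n - 1)

memoFib :: Int -> Integer
memoFib n = fibs !! n
```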

The benefit here is that we avoid the extra work of redundant calls. However, we pay for the speedup in three ways:

- Space: we need to take up memory space storing the cached solutions.
- Indexing: while we no longer have to pay for the expensive recursive calls, we *do* now have to pay for indexing into the data structure. In this example, we’re paying linear time to index into the list.
- Generality: the memoization is tied directly to the argument type of the function. We need to be able to use the argument to our memoized function as an index into some data structure. While a lot of argument types admit some type of indexing (whether they’re `Hashable`, `Ord`, etc.), some don’t, and we can’t memoize those using this technique.

We’re going to look at a technique that allows us to somewhat mitigate 2 and 3 above, using something called a *nexus*.

The standard technique of memoization is focused on the arguments to the function, creating a concrete representation of them in memory to map to the results. Using nexuses, as described in Bird and Hinze (2003), we’ll instead focus on the function itself, creating a concrete representation of its call graph in memory. Here’s the call graph of Fibonacci:

```
                                            ┌fib(1)=1
                                   ┌fib(2)=1┤
                                   │        └fib(0)=0
                          ┌fib(3)=2┤
                          │        └fib(1)=1
                 ┌fib(4)=3┤
                 │        │        ┌fib(1)=1
                 │        └fib(2)=1┤
                 │                 └fib(0)=0
        ┌fib(5)=5┤
        │        │                 ┌fib(1)=1
        │        │        ┌fib(2)=1┤
        │        │        │        └fib(0)=0
        │        └fib(3)=2┤
        │                 └fib(1)=1
fib(6)=8┤
        │                          ┌fib(1)=1
        │                 ┌fib(2)=1┤
        │                 │        └fib(0)=0
        │        ┌fib(3)=2┤
        │        │        └fib(1)=1
        └fib(4)=3┤
                 │        ┌fib(1)=1
                 └fib(2)=1┤
                          └fib(0)=0
```

Turning *that* into a concrete datatype wouldn’t do us much good: it still has the massively redundant computations in it. However, we can recognize that entire subtrees are duplicates of each other: in those cases, instead of creating both subtrees, we could just create one and have each parent point to it^{1}:

```
        ┌fib(5)=5┬────────┬fib(3)=2┬────────┬fib(1)=1
fib(6)=8┤        │        │        │        │
        └────────┴fib(4)=3┴────────┴fib(2)=1┴fib(0)=0
```

This is a nexus. In Haskell, it’s not observably different from the other form, except that it takes up significantly less space. It’s also much quicker to construct.

If we use it to memoize `fib`, we’ll no longer be indexing on the argument: we’ll instead follow the relevant branch in the tree to the subcomputation, which is just chasing a pointer. It also means the argument doesn’t have to be constrained to any specific type. Here’s how you’d do it:

```
data Tree
  = Leaf
  | Node
      { val :: Integer
      , left :: Tree
      , right :: Tree }

fib :: Integer -> Integer
fib = val . go
  where
    go 0 = Node 0 Leaf Leaf
    go 1 = Node 1 (Node 0 Leaf Leaf) Leaf
    go n = node t (left t) where t = go (n-1)
    node l r = Node (val l + val r) l r
```

So this approach sounds amazing, right? No constraints on the argument type, no need to pay for indexing: why doesn’t everyone use it everywhere? The main reason is that figuring out a nexus for the call-graph is *hard*. In fact, finding an optimal one is NP-hard in general (Steffen and Giegerich 2006).

The second problem is that it’s difficult to abstract out. The standard technique of memoization relies on building a mapping from keys to values: about as bread-and-butter as it gets in programming. Even more, we already know how to say “values of this type can be used efficiently as keys in some mapping”: for Data.Map it’s `Ord`, for Data.HashMap it’s `Hashable`. All of this together means we can build a nice library for memoization which exports the two following functions:
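Presumably something in this shape (the names and the `Int` specialization are mine; a real library would abstract the key type behind a class constraint):

```
-- memo caches one function; memoFix ties the knot so the recursive
-- calls themselves go through the cache.
memo :: (Int -> b) -> (Int -> b)
memo f = (map f [0 ..] !!)

memoFix :: ((Int -> b) -> (Int -> b)) -> (Int -> b)
memoFix f = m
  where
    m = memo (f m)
```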

Building a nexus, however, is not bread-and-butter. On top of that, it’s difficult to say something like “recursive functions of this structure can be constructed using a nexus”. What’s the typeclass for that? In comparison to the signatures above, the constraint will need to be on the *arrows*, not the `a`. Even talking about the structure of recursive functions is regarded as somewhat of an advanced subject: that said, the recursion-schemes package allows us to do so, and even has facilities for constructing something *like* nexuses with histomorphisms (Tobin 2016). I’m still looking to see if there’s a library out there that *does* manage to abstract nexuses in an ergonomic way, so I’d love to hear if there is one (or if there’s some more generalized form which accomplishes the same).

That’s enough preamble. The nexus we want to construct for countdown is *not* going to memoize as much as possible: in particular, we’re only going to memoize the shape of the trees, not the operators used. This will massively reduce the memory overhead, and still give a decent speedup (Bird and Mu 2005, 11 “building a skeleton tree first”).

With that in mind, the ideal nexus looks something like this:

We can represent the tree in Haskell as a rose tree:
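The type in question is Data.Tree’s, which the later code uses qualified as `Rose`; `Forest` abbreviates a list of such trees:

```
import qualified Data.Tree as Rose

-- Data.Tree defines:
--   data Tree a = Node { rootLabel :: a, subForest :: [Tree a] }
type Forest a = [Rose.Tree a]
```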

Constructing the nexus itself isn’t actually the most interesting part of this solution: *consuming* it is. We need to be able to go from the structure above into a list that’s the equivalent of `unmerges`. Doing a breadth-first traversal of the diagram above (without the top element) will give us:

$abc, abd, acd, bcd, ab, ac, bc, ad, bd, cd, a, b, c, d$

If you split that list in half, and zip it with its reverse, you’ll get the output of `unmerges`.

However, the breadth-first traversal of the diagram isn’t the same thing as the breadth-first traversal of the rose tree. The latter will traverse $abc, abd, acd, bcd$, and then the children of $abc$ ($ab,ac,bc$), and then the children of $abd$ ($ab,ad,bd$): and here’s our problem. We traverse $ab$ twice, because we can’t know that $abc$ and $abd$ are pointing to the same value. What we have to do is first prune the tree, removing duplicates, and then perform a breadth-first traversal on that.

Luckily, the duplicates follow a pattern, allowing us to remove them without having to do any equality checking. In each row, the first node has no duplicates in its children, the second’s first child is a duplicate, the third’s first and second children are duplicates, and so on. You should be able to see this in the diagram above. Adapting a little from the paper, we get an algorithm like this:

```
para :: (a -> [a] -> b -> b) -> b -> [a] -> b
para f b = go
  where
    go [] = b
    go (x:xs) = f x xs (go xs)

prune :: Forest a -> Forest a
prune ts = pruneAt ts 0
  where
    pruneAt = para f (const [])
    f (Node x []) t _ _ = Node x [] : t
    f (Node x us) _ a k =
      Node x (pruneAt (drop k us) k) : a (k + 1)
```

I went through this in a previous post, so this is the end solution:

```
breadthFirst :: Forest a -> [a]
breadthFirst ts = foldr f b ts []
  where
    f (Node x xs) fw bw = x : fw (xs:bw)
    b [] = []
    b q = foldl (foldr f) b q []
```

With the appropriate incantations, this is actually the fastest implementation I’ve found.

We can actually inline both of the above functions, fusing them together:

```
spanNexus :: Forest a -> [a]
spanNexus ts = foldr f (const b) ts 0 []
  where
    f (Node x us) fw k bw = x : fw (k+1) ((drop k us, k) : bw)
    b [] = []
    b qs = foldl (uncurry . foldr f . const) b qs []
```

So, now we can go from the tree to our list of splits. Next step is to convert that list into the output of unmerges, by zipping the reverse of the first half with the second. We can use an algorithm described in Danvy and Goldberg (2005) to do the zipping and reversing:

```
fold xs n = go xs n (const [])
  where
    go xs 0 k = k xs
    go (x:xs) n k = go xs (n-2) (\(y:ys) -> (x,y) : k ys)
```

And we can inline the function which collapses those results into one:

```
fold xs n = go n xs (const [])
  where
    go 0 xss k = k xss
    go n (xs:xss) k =
      go (n-2) xss (\(ys:yss) -> [ z
                                 | x <- xs
                                 , y <- ys
                                 , z <- cmb x y
                                 ] ++ k yss)
```

And that’s all we need!

```
import qualified Data.Tree as Rose

data Tree a
  = Leaf Int a
  | Node [Tree a]
  deriving (Show,Eq,Functor)

enumerateTrees :: (a -> a -> [a]) -> [a] -> [a]
enumerateTrees _ [] = []
enumerateTrees cmb xs = (extract . steps . initial) xs
  where
    step = map nodes . group
    steps [x] = x
    steps xs = steps (step xs)

    initial = map (Leaf 1 . flip Rose.Node [] . pure)

    extract (Leaf _ x) = Rose.rootLabel x
    extract (Node [x]) = extract x

    group [_] = []
    group (Leaf _ x:vs) = Node [Leaf 2 [x, y] | Leaf _ y <- vs] : group vs
    group (Node u:vs) = Node (zipWith comb (group u) vs) : group vs

    comb (Leaf n xs) (Leaf _ x) = Leaf (n + 1) (xs ++ [x])
    comb (Node us) (Node vs) = Node (zipWith comb us vs)

    forest ts = foldr f (const b) ts 0 []
      where
        f (Rose.Node x []) fw !k bw = x : fw (k + 1) bw
        f (Rose.Node x us) fw !k bw = x : fw (k + 1) ((drop k us, k) : bw)
        b [] = []
        b qs = foldl (uncurry . foldr f . const) b qs []

    nodes (Leaf n x) = Leaf 1 (node n x)
    nodes (Node xs) = Node (map nodes xs)

    node n ts = Rose.Node (walk (2 ^ n - 2) (forest ts) (const [])) ts
      where
        walk 0 xss k = k xss
        walk n (xs:xss) k =
          walk (n-2) xss (\(ys:yss) -> [ z
                                       | x <- xs
                                       , y <- ys
                                       , z <- cmb x y
                                       ] ++ k yss)
```

The first thing to do for the Countdown solution is to figure out a representation for expressions. The one from simple-reflect is perfect for displaying the result, but we should memoize its calculation.
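A sketch of that pairing (the post’s own definition was elided here; `Memoed`, `expr`, and `result` match the names the code below uses): keep the reflected expression for display alongside its value, computed once.

```
import Debug.SimpleReflect (Expr)

data Memoed = Memoed
  { expr   :: Expr  -- the simple-reflect expression, for pretty-printing
  , result :: Int   -- its value, never recomputed
  }
```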

Then, some helpers for building:

```
data Op = Add | Dif | Mul | Div

binOp f g x y = Memoed ((f `on` expr) x y) ((g `on` result) x y)

apply :: Op -> Memoed -> Memoed -> Memoed
apply Add x y = binOp (+) (+) x y
apply Dif x y
  | result y < result x = binOp (-) (-) x y
  | otherwise = binOp (-) (-) y x
apply Mul x y = binOp (*) (*) x y
apply Div x y = binOp div div x y
```

Finally, the full algorithm:

```
enumerateExprs :: [Int] -> [Memoed]
enumerateExprs = enumerateTrees cmb . map (\x -> Memoed (fromIntegral x) x)
  where
    cmb x y =
      nubs $
      x :
      y :
      [ apply op x y
      | op <- [Add, Dif, Mul, Div]
      , legal op (result x) (result y) ]
    legal Add _ _ = True
    legal Dif x y = x /= y
    legal Mul _ _ = True
    legal Div x y = x `mod` y == 0
    nubs xs = foldr f (const []) xs IntSet.empty
      where
        f e a s
          | IntSet.member (result e) s = a s
          | otherwise = e : a (IntSet.insert (result e) s)

countdown :: Int -> [Int] -> [Expr]
countdown targ = map expr . filter ((==) targ . result) . enumerateExprs
>>> (mapM_ print . reduction . head) (countdown 586 [100,25,1,5,3,10])
25 * 3 + 1 + (100 * 5 + 10)
75 + 1 + (100 * 5 + 10)
76 + (100 * 5 + 10)
76 + (500 + 10)
76 + 510
586
```

There are some optimizations going on here, taken mainly from Bird and Mu (2005):

- We filter out illegal operations, as described originally.
- We filter out any expressions that have the same value.

So we’ve followed the paper, written the code: time to test. The specification of the function is relatively simple: calculate all applications of the commutative operator to some input, *without* recalculating subtrees.

We’ll need a free structure for the “commutative operator”:

Here’s the problem: it’s not commutative! We can remedy it by only exporting a constructor that creates the tree in a commutative way, and we can make it a pattern synonym so it looks normal:

```
{-# LANGUAGE DeriveFoldable #-}
{-# LANGUAGE PatternSynonyms #-}

module Commutative
  (Tree(Leaf)
  ,pattern (:*:))
  where

data Tree a
  = Leaf a
  | Tree a :^: Tree a
  deriving (Eq,Ord,Show,Foldable)

pattern (:*:) :: Ord a => Tree a -> Tree a -> Tree a
pattern xs :*: ys <- xs :^: ys where
  xs :*: ys
    | xs <= ys = xs :^: ys
    | otherwise = ys :^: xs
{-# COMPLETE Leaf, (:*:) #-}
```

Now we need to check if all applications are actually tested. First, to generate all trees:

```
allTrees :: Ord a => [a] -> Set (Tree a)
allTrees [x] = Set.singleton (Leaf x)
allTrees xs = Set.unions (map (uncurry f) (unmerges xs))
  where
    f ls rs = Set.fromList ((liftA2 (:*:) `on` (Set.toList . allTrees)) ls rs)

allSubTrees :: Ord a => [a] -> Set (Tree a)
allSubTrees [x] = Set.singleton (Leaf x)
allSubTrees xs =
  Set.unions (map (uncurry f . (allSubTrees *** allSubTrees)) (unmerges xs))
  where
    f ls rs =
      Set.unions
        [ls, rs, Set.fromList ((liftA2 (:*:) `on` Set.toList) ls rs)]
```

Then, to test:

```
prop_exhaustiveSearch :: Natural -> Bool
prop_exhaustiveSearch n =
  let src = [0 .. fromIntegral n]
      expect = allSubTrees src
      actual =
        Set.fromList
          (enumerateTrees
            (\xs ys -> [xs, ys, xs :*: ys])
            (map Leaf src))
  in expect == actual

prop_exhaustiveSearchFull :: Natural -> Bool
prop_exhaustiveSearchFull n =
  let src = [0 .. fromIntegral n]
      expect = Map.fromSet (const 1) (allTrees src)
      actual =
        freqs
          (enumerateTrees
            (\xs ys -> [xs :*: ys])
            (map Leaf src))
  in expect == actual
```

Testing for repeated calls is more tricky. Remember, the memoization is supposed to be unobservable: in order to see it, we’re going to have to use some unsafe operations.

```
traceSubsequences
  :: ((Tree Int -> Tree Int -> [Tree Int]) -> [Tree Int] -> [Tree Int])
  -> [Int]
  -> (Map (Tree Int) Int, [Tree Int])
traceSubsequences enm ints =
  runST $
  do ref <- newSTRef Map.empty
     let res = enm (combine ref) (map (conv ref) ints)
     traverse_ (foldr seq (pure ())) res
     intm <- readSTRef ref
     pure (intm, res)
  where
    combine ref xs ys =
      unsafeRunST ([xs :*: ys] <$ modifySTRef' ref (incr (xs :*: ys)))
    {-# NOINLINE combine #-}
    conv ref x = unsafeRunST (Leaf x <$ modifySTRef' ref (incr (Leaf x)))
    {-# NOINLINE conv #-}
    unsafeRunST cmp = unsafePerformIO (unsafeSTToIO cmp)

prop_noRepeatedCalls :: Property
prop_noRepeatedCalls =
  property $ sized $
  \n ->
    pure $
    let src = [0 .. n]
        (tint,tres) = fmap freqs (traceSubsequences enumerateTrees src)
        (fint,fres) = fmap freqs (traceSubsequences dummyEnumerate src)
    in counterexample
         (mapCompare (freqs (allSubTrees src)) tint)
         (all (1 ==) tint) .&&.
       counterexample (mapCompare tres fres) (tres == fres) .&&.
       (n > 2 ==> tint /= fint)
```

Here, `dummyEnumerate` is some method which performs the same task, but *doesn’t* construct a nexus, so we can ensure that our tests really do catch faulty implementations.

Bird, Richard, and Ralf Hinze. 2003. “Functional Pearl: Trouble Shared is Trouble Halved.” In *Proceedings of the 2003 ACM SIGPLAN Workshop on Haskell*, 1–6. Haskell ’03. New York, NY, USA: ACM. doi:10.1145/871895.871896. http://doi.acm.org/10.1145/871895.871896.

Bird, Richard, and Shin-Cheng Mu. 2005. “Countdown: A case study in origami programming.” *Journal of Functional Programming* 15 (05) (August): 679. doi:10.1017/S0956796805005642. http://www.journals.cambridge.org/abstract_S0956796805005642.

Danvy, Olivier, and Mayer Goldberg. 2005. “There and Back Again.” *BRICS Report Series* 12 (3). doi:10.7146/brics.v12i3.21869. https://tidsskrift.dk/brics/article/view/21869.

Hutton, Graham. 2002. “The Countdown Problem.” *J. Funct. Program.* 12 (6) (November): 609–616. doi:10.1017/S0956796801004300. http://www.cs.nott.ac.uk/~pszgmh/countdown.pdf.

Steffen, Peter, and Robert Giegerich. 2006. “Table Design in Dynamic Programming.” *Information and Computation* 204 (9) (September): 1325–1345. doi:10.1016/j.ic.2006.02.006. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.85.601&rep=rep1&type=pdf.

Tobin, Jared. 2016. “Time Traveling Recursion Schemes.” *jtobin.io*. https://jtobin.io/time-traveling-recursion.

If you think that structure looks more like a funny linked list than a tree, that’s because it is. Instead of talking about “left” and “right” branches, we could talk about the first and second elements in a list: in fact, this is exactly what’s happening in the famous `zipWith` Fibonacci implementation (in reverse). Or, in my favourite version: ↩
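That famous `zipWith` definition, for reference:

```
-- Each element past the first two is the sum of the list with its own tail.
fibs :: [Integer]
fibs = 0 : 1 : zipWith (+) fibs (tail fibs)
```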