Here’s a useful function from Data.List:

```
groupBy :: (a -> a -> Bool) -> [a] -> [[a]]
groupBy (==) "aabcdda"
-- ["aa","b","c","dd","a"]
```

However, as has been pointed out before, `groupBy`

expects an equivalence relation, and can exhibit surprising behavior when it doesn’t get one. Let’s say, for instance, that we wanted to group numbers that were close together:

```
groupClose :: [Integer] -> [[Integer]]
groupClose = groupBy (\x y -> abs (x - y) < 3)
```

What would you expect on the list `[1, 2, 3, 4, 5]`

? All in the same group? Well, what you actually get is:

`[[1,2,3],[4,5]]`

This is because the implementation of `groupBy`

only compares to the first element in each group:

```
groupBy _ [] = []
groupBy eq (x:xs) = (x:ys) : groupBy eq zs
where (ys,zs) = span (eq x) xs
```

Brandon Simmons gave a definition of `groupBy`

that is perhaps more useful, but it used explicit recursion, rather than a fold.

A definition with `foldr`

turned out to be trickier than I expected. I found some of the laziness properties especially difficult:

```
>>> head (groupBy (==) (1:2:undefined))
[1]
>>> (head . head) (groupBy (==) (1:undefined))
1
>>> (head . head . tail) (groupBy (==) (1:2:undefined))
2
```

Here’s the definition I came up with, after some deliberation:

```
groupBy :: (a -> a -> Bool) -> [a] -> [[a]]
groupBy p xs = build (\c n ->
let f x a q
| q x = (x : ys, zs)
| otherwise = ([], c (x : ys) zs)
where (ys,zs) = a (p x)
in snd (foldr f (const ([], n)) xs (const False)))
{-# INLINE groupBy #-}
```

Seemingly benign changes to the function will break one or more of the above tests. In particular, the laziness of a “where” binding needs to be taken into account. Here’s an early attempt which failed:

```
groupBy :: (a -> a -> Bool) -> [a] -> [[a]]
groupBy p xs = build (\c n ->
let f x a q d
| q x = a (p x) (d . (:) x)
| otherwise = d [] (a (p x) (c . (:) x))
in foldr f (\_ d -> d [] n) xs (const False) (\ _ y -> y))
```

Once done, though, it works as expected:

```
>>> groupBy (==) "aaabcccdda"
["aaa","b","ccc","dd","a"]
>>> groupBy (==) []
[]
>>> groupBy (<=) [1,2,2,3,1,2,0,4,5,2]
[[1,2,2,3],[1,2],[0,4,5],[2]]
```

It’s the fastest version I could find that obeyed the above laziness properties.

]]>There are three main ways to fold things in Haskell: from the right, from the left, and from either side. Let’s look at the left vs right variants first. `foldr`

works from the right:

```
foldr (+) 0 [1,2,3]
1 + (2 + (3 + 0))
```

And `foldl`

from the left:

```
foldl (+) 0 [1,2,3]
((0 + 1) + 2) + 3
```

As you’ll notice, the result of the two operations above is the same (6; although one may take much longer than the other). In fact, *whenever* the result of `foldr`

and `foldl`

is the same for a pair of arguments (in this case `+`

and `0`

), we say that that pair forms a `Monoid`

for some type (well, there’s some extra stuff to do with `0`

, but I only care about associativity at the moment). In this case, the `Sum`

monoid is formed:

```
newtype Sum a = Sum { getSum :: a }
instance Num a => Monoid (Sum a) where
mempty = Sum 0
mappend (Sum x) (Sum y) = Sum (x + y)
```

When you know that you have a monoid, you can use the `foldMap`

function: this is the third kind of fold. It says that you don’t care which of `foldl`

or `foldr`

is used, so the implementer of `foldMap`

can put the parentheses wherever they want:

```
foldMap Sum [1,2,3]
(1 + 2) + (3 + 0)
0 + ((1 + 2) + 3)
((0 + 1) + 2) + 3
```

And we can’t tell the difference from the result. This is a pretty bare-bones introduction to folds and monoids: you won’t need to know more than that for the rest of this post, but the topic area is fascinating and deep, so don’t let me give you the impression that I’ve done anything more than scratched the surface.

Quite often, we *do* care about where the parentheses go. Take, for instance, a binary tree type, with values at the leaves:

```
data Tree a
= Empty
| Leaf a
| Tree a :*: Tree a
instance Show a =>
Show (Tree a) where
show Empty = "()"
show (Leaf x) = show x
show (l :*: r) = "(" ++ show l ++ "*" ++ show r ++ ")"
```

We can’t (well, shouldn’t) us `foldMap`

here, because we would be able to tell the difference between different arrangements of parentheses:

```
foldMap something [1,2,3]
((1*2)*(3*()))
(()*((1*2)*3))
(((()*1)*2)*3)
```

So we use one of the folds which lets us choose the arrangements of parentheses:

```
(foldr (:*:) Empty . map Leaf) [1,2,3,4,5,6]
-- (1*(2*(3*(4*(5*(6*()))))))
(foldl (:*:) Empty . map Leaf) [1,2,3,4,5,6]
-- ((((((()*1)*2)*3)*4)*5)*6)
```

The issue is that neither of the trees generated are necessarily what we want: often, we want something more *balanced*.

To try and find a more balanced fold, let’s (for now) assume we’re always going to get non-empty input. This will let us simplify the `Tree`

type a little, to:

```
data Tree a
= Leaf a
| Tree a :*: Tree a
deriving Foldable
instance Show a =>
Show (Tree a) where
show (Leaf x) = show x
show (l :*: r) = "(" ++ show l ++ "*" ++ show r ++ ")"
```

Then, we can use Jon Fairbairn’s fold described in this email, adapted a bit for our non-empty input:

```
import Data.List.NonEmpty (NonEmpty(..))
treeFold :: (a -> a -> a) -> NonEmpty a -> a
treeFold f = go
where
go (x :| []) = x
go (a :| b:l) = go (f a b :| pairMap l)
pairMap (x:y:rest) = f x y : pairMap rest
pairMap xs = xs
```

There are two parts to this function: `pairMap`

and the `go`

helper. `pairMap`

combines adjacent elements in the list using the combining function. As a top-level function it might look like this:

```
pairMap f (x:y:rest) = f x y : pairMap f rest
pairMap f xs = xs
pairMap (++) ["a","b","c","d","e"]
-- ["ab","cd","e"]
```

As you can see, it leaves any leftovers untouched at the end of the list.

The `go`

helper applies `pairMap`

repeatedly to the list until it has only one element. This gives us much more balanced results that `foldl`

or `foldr`

(turn on `-XOverloadedLists`

to write non-empty lists using this syntax):

```
(treeFold (:*:) . fmap Leaf) [1,2,3,4,5,6]
-- (((1*2)*(3*4))*(5*6))
(treeFold (:*:) . fmap Leaf) [1,2,3,4,5,6,7,8]
-- (((1*2)*(3*4))*((5*6)*(7*8)))
```

However, there are still cases where one branch will be much larger than its sibling. The fold fills a balanced binary tree from the left, but any leftover elements are put at the top level. In other words:

```
(treeFold (:*:) . fmap Leaf) [1..9]
-- ((((1*2)*(3*4))*((5*6)*(7*8)))*9)
```

That `9`

hanging out on its own there is a problem.

One observation we can make is that `pairMap`

always starts from the same side on each iteration, like a typewriter moving from one line to the next. This has the consequence of building up the leftovers on one side, leaving them until the top level.

We can improve the situation slightly by going back and forth, slalom-style, so we consume leftovers on each iteration:

```
treeFold :: (a -> a -> a) -> NonEmpty a -> a
treeFold f = goTo where
goTo (y :| []) = y
goTo (a :| b : rest) = goFro (pairMap f (f a b) rest)
goFro (y :| []) = y
goFro (a :| b : rest) = goTo (pairMap (flip f) (f b a) rest)
pairMap f = go [] where
go ys y (a:b:rest) = go (y:ys) (f a b) rest
go ys y [z] = z :| y : ys
go ys y [] = y :| ys
```

Notice that we have to flip the combining function to make sure the ordering is the same on output. For the earlier example, this solves the issue:

```
(treeFold (:*:) . fmap Leaf) [1..9]
-- (((1*2)*((3*4)*(5*6)))*((7*8)*9))
```

It does *not* build up the tree as balanced as it possibly could, though:

```
(treeFold (:*:) . fmap Leaf) [1,2,3,4,5,6]
-- ((1*2)*((3*4)*(5*6)))
```

There’s four elements in the right branch, and two in the left in the above example. Three in each would be optimal.

Wait—optimal in what sense, exactly? What do we mean when we say one tree is more balanced than another? Let’s say the “balance factor” is the largest difference in size of two sibling trees:

```
balFac :: Tree a -> Integer
balFac = fst . go where
go :: Tree a -> (Integer, Integer)
go (Leaf _) = (0, 1)
go (l :*: r) = (lb `max` rb `max` abs (rs - ls), rs + ls) where
(lb,ls) = go l
(rb,rs) = go r
```

And one tree is more balanced than another if it has a smaller balance factor.

There’s effectively no limit on the balance factor for the typewriter method: when the input is one larger than a power of two, it’ll stick the one extra in one branch and the rest in another (as with `[1..9]`

in the example above).

For the slalom method, it looks like there’s something more interesting going on, limit-wise. I haven’t been able to verify this formally (yet), but from what I can tell, a tree of height $n$ will have at most a balance factor of the $n$th Jacobsthal number. That’s (apparently) also the number of ways to tie a tie using $n + 2$ turns.

That was just gathered from some quick experiments and oeis.org, but it seems to make sense intuitively. Jacobsthal numbers are defined like this:

```
j 0 = 0
j 1 = 1
j n = j (n-1) + 2 * j (n-2)
```

So, at the top level, there’s the imbalance caused by the second-last `pairFold`

, plus the imbalance caused by the third-to-last. However, the third-to-last imbalance is twice what it was at that level, because it is now working with an already-paired-up list. Why isn’t the second last imbalance also doubled? Because it’s counteracted by the fact that we turned around: the imbalance is in an element that’s a leftover element. At least that’s what my intuition is at this point.

The minimum balance factor is, of course, one. Unfortunately, to achieve that, I lost some of the properties of the previous folds:

Up until now, I have been avoiding taking the length of the incoming list. It would lose a lot of laziness, cause an extra traversal, and generally seems like an ugly solution. Nonetheless, it gives the most balanced results I could find so far:

```
treeFold :: (a -> a -> a) -> NonEmpty a -> a
treeFold f (x:|xs) = go (length (x:xs)) (x:xs) where
go 1 [y] = y
go n ys = f (go m a) (go (n-m) b) where
(a,b) = splitAt m ys
m = n `div` 2
```

`splitAt`

is an inefficient operation, but if we let the left-hand call return its unused input from the list, we can avoid it:

```
treeFold :: (a -> a -> a) -> NonEmpty a -> a
treeFold f (x:|xs) = fst (go (length (x:xs)) (x:xs)) where
go 1 (y:ys) = (y,ys)
go n ys = (f l r, rs) where
(l,ls) = go m ys
(r,rs) = go (n-m) ls
m = n `div` 2
```

Finally, you may have spotted the state monad in this last version. We can make the similarity explicit:

```
treeFold :: (a -> a -> a) -> NonEmpty a -> a
treeFold f (x:|xs) = evalState (go (length (x:xs))) (x:xs) where
go 1 = state (\(y:ys) -> (y,ys))
go n = do
let m = n `div` 2
l <- go m
r <- go (n-m)
return (f l r)
```

And there you have it: three different ways to fold in a more balanced way. Perhaps surprisingly, the first is the fastest in my tests. I’d love to hear if there’s a more balanced version (which is lazy, ideally) that is just as efficient as the first implementation.

I have found two other uses for these folds other than simply constructing more balanced binary trees. The first is summation of floating-point numbers. If you sum floating-point numbers in the usual way with `foldl'`

(or, indeed, with an accumulator in an imperative language), you will see an error growth of $\mathcal{O}(n)$, where $n$ is the number of floats you’re summing.

A well-known solution to this problem is the Kahan summation algorithm. It carries with it a running compensation for accumulating errors, giving it $\mathcal{O}(1)$ error growth. There are two downsides to the algorithm: it takes four times the number of numerical operations to perform, and isn’t parallel.

For that reason, it’s often not used in practice: instead, floats are summed *pairwise*, in a manner often referred to as cascade summation. This is what’s used in NumPy. The error growth isn’t quite as good—$\mathcal{O}(\log{n})$—but it takes the exact same number of operations as normal summation. On top of that:

Dividing a fold into roughly-equal chunks is exactly the kind of problem encountered when trying to parallelize certain algorithms. Adapting the folds above so that their work is performed in parallel is surprisingly easy:

```
splitPar :: (a -> a -> a) -> (Int -> a) -> (Int -> a) -> Int -> a
splitPar f = go
where
go l r 0 = f (l 0) (r 0)
go l r n = lt `par` (rt `pseq` f lt rt)
where
lt = l (n-m)
rt = r m
m = n `div` 2
treeFoldParallel :: (a -> a -> a) -> NonEmpty a -> a
treeFoldParallel f xs =
treeFold const (splitPar f) xs numCapabilities
```

The above will split the fold into `numCapabilities`

chunks, and perform each one in parallel. `numCapabilities`

is a constant defined in GHC.Conc: it’s the number of threads which can be run simultaneously at any one time. Alternatively, you could the function include a parameter for how many chunks to split the computation into. You could also have the fold adapt as it went, choosing whether or not to spark based on how many sparks exist at any given time:

```
parseq :: a -> b -> b
parseq a b =
runST
(bool (par a b) (seq a b) <$>
unsafeIOToST (liftA2 (>) numSparks getNumCapabilities))
treeFoldAdaptive :: (a -> a -> a) -> a -> [a] -> a
treeFoldAdaptive f =
Lazy.treeFold
(\l r ->
r `parseq` (l `parseq` f l r))
```

Adapted from this comment by Edward Kmett. This is actually the fastest version of all the folds.

All of this is provided in a library I’ve put up on Hackage.

]]>I have been working a little more on my semirings library recently, and I have come across some interesting functions in the process. First, a quick recap on the `Semiring`

class and some related functions:

```
class Semiring a where
one :: a
zero :: a
infixl 6 <+>
(<+>) :: a -> a -> a
infixl 7 <.>
(<.>) :: a -> a -> a
add :: (Foldable f, Semiring a) => f a -> a
add = foldl' (<+>) zero
mul :: (Foldable f, Semiring a) => f a -> a
mul = foldl' (<.>) one
instance Semiring Integer where
one = 1
zero = 0
(<+>) = (+)
(<.>) = (*)
instance Semiring Bool where
one = True
zero = False
(<+>) = (||)
(<.>) = (&&)
```

You can think of it as a replacement for `Num`

, but it turns out to be much more generally useful than that.

The first interesting function is to do with matrix multiplication. Here’s the code for multiplying two matrices represented as nested lists:

```
mulMatrix :: Semiring a => [[a]] -> [[a]] -> [[a]]
mulMatrix xs ys = map (\row -> map (add . zipWith (<.>) row) cs) xs
where
cs = transpose ys
```

One of the issues with this code (other than its woeful performance) is that it seems needlessly list-specific. `zipWith`

seems like the kind of thing that exists on a bunch of different structures. Indeed, the `ZipList`

wrapper uses `zipWith`

as its `<*>`

implementation. Let’s try that for now:

```
mulMatrix :: (Semiring a, Applicative f) => f (f a) -> f (f a) -> f (f a)
mulMatrix xs ys = fmap (\row -> fmap (add . liftA2 (<.>) row) cs) xs
where
cs = transpose ys
```

Of course, now `add`

needs to work on our `f`

, so it should be `Foldable`

```
mulMatrix
:: (Semiring a, Applicative f, Foldable f)
=> f (f a) -> f (f a) -> f (f a)
mulMatrix = ...
```

`transpose`

is the missing piece now. A little bit of `Applicative`

magic can help us out again, though: `sequenceA`

is `transpose`

on `ZipList`

s (McBride and Paterson 2008).

```
mulMatrix
:: (Semiring a, Applicative f, Traversable f)
=> f (f a) -> f (f a) -> f (f a)
mulMatrix xs ys =
fmap (\row -> fmap (add . liftA2 (<.>) row) cs) xs
where
cs = sequenceA ys
```

One further generalization: The two `f`

s don’t actually need to be the same:

```
mulMatrix
:: (Applicative n
,Traversable m
,Applicative m
,Applicative p
,Semiring a)
=> n (m a) -> m (p a) -> n (p a)
mulMatrix xs ys = fmap (\row -> fmap (add . liftA2 (<.>) row) cs) xs
where
cs = sequenceA ys
```

Happily, the way that the wrappers (`n`

, `m`

, and `p`

) match up coincides precisely with how matrix dimensions match up in matrix multiplication. Quoting from the Wikipedia definition:

if $A$ is an $n \times m$ matrix and $B$ is an $m \times p$ matrix, their matrix product $AB$ is an $n \times p$ matrix

This function is present in the linear package with some different constraints. In fairness, `Applicative`

probably isn’t the best thing to use here since it doesn’t work for so many instances (`MonadZip`

or something similar may be more suitable), but it’s very handy to have, and works out-of the box for types like:

```
data Three a
= Three a a a
deriving (Functor, Foldable, Traversable, Eq, Ord, Show)
instance Applicative Three where
pure x = Three x x x
Three fx fy fz <*> Three xx xy xz = Three (fx xx) (fy xy) (fz xz)
```

Which makes it (to my mind) useful enough to keep. Also, it hugely simplified the code for matrix multiplication in square matrices I had, from Okasaki (1999).

If you’re putting a general class in a library that you want people to use, and there exist sensible instances for common Haskell types, you should probably provide those instances in the library to avoid orphans. The meaning of “sensible” here is vague: generally speaking, if there is only one obvious or clear instance, then it’s sensible. For a list instance for the semiring class, for instance, I could figure out several law-abiding definitions for `<+>`

, `one`

and `zero`

, but only one for `<.>`

: polynomial multiplication. You know, where you multiply two polynomials like so:

$(x^3 + 2x + 3)(5x + 3x^2 + 4) = 9x^5 + 15x^4 + 18x^3 + 28x^2 + 38x + 24$

A more general definition looks something like this:

$(a_0x^0 + a_1x^1 + a_2x^2)(b_0x^0 + b_1x^1 + b_2x^2) =$ $a_0b_0x^0 + (a_0b_1 + a_1b_0)x^1 + (a_0b_2 + a_1b_1 + a_2b_0)x^2 + (a_1b_2 + a_2b_1)x^3 + a_2b_2x^4$

Or, fully generalized:

$c_k = a_0b_k + a_1b_{k-1} + \ldots + a_{k-1}b_1 + a_kb_0$ $f(x) \times g(x) = \sum_{i=0}^{n+m}c_ix^i$

So it turns out that you can represent polynomials pretty elegantly as lists. Take an example from above:

$x^3 + 2x + 3$

And rearrange it in order of the powers of $x$:

$3x^0 + 2x^1 + x^3$

And fill in missing coefficients:

$3x^0 + 2x^1 + 0x^2 + 1x^3$

And then the list representation of that polynomial is the list of those coefficients:

`[3, 2, 0, 1]`

For me, the definitions of multiplication above were pretty hard to understand. In Haskell, however, the definition is quite beautiful:

```
instance Semiring a => Semiring [a] where
one = [one]
zero = []
[] <+> ys = ys
xs <+> [] = xs
(x:xs) <+> (y:ys) = x <+> y : (xs <+> ys)
_ <.> [] = []
[] <.> _ = []
(x:xs) <.> (y:ys) = (x<.>y) : map (x<.>) ys <+> xs <.> (y:ys)
```

This definition for `<.>`

can be found on page 4 of McIlroy (1999). Although there was a version of the paper with a slightly different definition:

```
_ <.> [] = []
[] <.> _ = []
(x:xs) <.> (y:ys)
= (x<.>y) : (map (x<.>) ys <+> map (<.>y) xs <+> (zero : (xs <.> ys)))
```

Similar to one which appeared in Dolan (2013).

As it happens, I prefer the first definition. It’s shorter, and I figured out how to write it as a fold:

```
_ <.> [] = []
xs <.> ys = foldr f [] xs where
f x zs = map (x <.>) ys <+> (zero : zs)
```

And if you inline the `<+>`

, you get a reasonable speedup:

```
xs <.> ys = foldr f [] xs
where
f x zs = foldr (g x) id ys (zero : zs)
g x y a (z:zs) = x <.> y <+> z : a zs
g x y a [] = x <.> y : a []
```

The definition of `<+>`

can also use a fold on either side for fusion purposes:

```
(<+>) = foldr f id where
f x xs (y:ys) = x <+> y : xs ys
f x xs [] = x : xs []
(<+>) = flip (foldr f id) where
f y ys (x:xs) = x <+> y : ys xs
f y ys [] = y : ys []
```

There are rules in the library to choose one of the above definitions if fusion is available.

This definition is much more widely useful than it may seem at first. Say, for instance, you wanted to search through pairs of things from two infinite lists. You can’t use the normal way to pair things for lists, the Cartesian product, because it will diverge:

```
[(x,y) | x <- [1..], y <- [1..]]
-- [(1,1),(1,2),(1,3),(1,4),(1,5),(1,6),(1,7),(1,8),(1,9),(1,10)...
```

You’ll never get beyond 1 in the first list. Zipping isn’t an option either, because you won’t really explore the search space, only corresponding pairs. Brent Yorgey showed that if you want a list like this:

```
[(y,x-y) | x <- [0..], y <- [0..x] ]
-- [(0,0),(0,1),(1,0),(0,2),(1,1),(2,0),(0,3),(1,2),(2,1),(3,0)...
```

Then what you’re looking for is a convolution (the same thing as polynomial multiplication). `<.>`

above can be adapted readily:

```
convolve :: [a] -> [b] -> [[(a,b)]]
convolve xs ys = foldr f [] xs
where
f x zs = foldr (g x) id ys ([] : zs)
g x y a (z:zs) = ((x, y) : z) : a zs
g x y a [] = [(x, y)] : a []
```

Flatten out this result to get your ordering. This convolution is a little different from the one in the blog post. By inlining `<+>`

we can avoid the expensive `++`

function, without using difference lists.

Here’s another cool use of lists as polynomials: they can be used as a positional numeral system. Most common numeral systems are positional, including Arabic (the system you most likely use, where twenty-four is written as 24) and binary. Non-positional systems are things like Roman numerals. Looking at the Arabic system for now, we see that the way of writing down numbers:

$1989$

Can be thought of the sum of each digit multiplied by ten to the power of its position:

$1989 = 1 \times 10^3 \plus 9 \times 10^2 \plus 8 \times 10^1 \plus 9 \times 10^0$ $1989 = 1 \times 1000 \plus 9 \times 100 \plus 8 \times 10 \plus 9 \times 1$ $1989 = 1000 \plus 900 \plus 80 \plus 9$ $1989 = 1989$

Where the positions are numbered from the right. In other words, it’s our polynomial list from above in reverse. As well as that, the convolution is long multiplication.

Now, taking this straight off we can try some examples:

```
-- 12 + 15 = 27
[2, 1] <+> [5, 1] == [7, 2]
-- 23 * 2 = 46
[3, 2] <.> [2] == [6, 4]
```

The issue, of course, is that we’re not handling carrying properly:

`[6] <+> [6] == [12]`

No matter: we can perform all the carries after the addition, and everything works out fine:

```
carry
:: Integral a
=> a -> [a] -> [a]
carry base xs = foldr f (toBase base) xs 0
where
f e a cin = r : a q where
(q,r) = quotRem (cin + e) base
toBase :: Integral a => a -> a -> [a]
toBase base = unfoldr f where
f 0 = Nothing
f n = Just (swap (quotRem n base))
```

Wrap the whole thing in a newtype and we can have a `Num`

instance:

```
newtype Positional
= Positional
{ withBase :: Integer -> [Integer]
}
instance Num Positional where
Positional x + Positional y = Positional (carry <*> x <+> y)
Positional x * Positional y = Positional (carry <*> x <.> y)
fromInteger m = Positional (\base -> toBase base m)
abs = id
signum = id
negate = id
toDigits :: Integer -> Positional -> [Integer]
toDigits base p = reverse (withBase p base)
```

This also lets us choose our base after the fact:

```
sumHundred = (sum . map fromInteger) [1..100]
toDigits 10 sumHundred
-- [5,0,5,0]
toDigits 2 sumHundred
-- [1,0,0,1,1,1,0,1,1,1,0,1,0]
```

All the hand-optimizing, inlining, and fusion magic in the world won’t make a list-based implementation of convolution faster than a proper one on vectors, unfortunately. In particular, for larger vectors, a fast Fourier transform can be used. Also, usually code like this will be parallelized, rather than sequential. That said, it can be helpful to implement the slower version on vectors, in the usual indexed way, for comparison’s sake:

```
instance Semiring a =>
Semiring (Vector a) where
one = Vector.singleton one
zero = Vector.empty
xs <+> ys =
case compare (Vector.length xs) (Vector.length ys) of
EQ -> Vector.zipWith (<+>) xs ys
LT -> Vector.unsafeAccumulate (<+>) ys (Vector.indexed xs)
GT -> Vector.unsafeAccumulate (<+>) xs (Vector.indexed ys)
signal <.> kernel
| Vector.null signal = Vector.empty
| Vector.null kernel = Vector.empty
| otherwise = Vector.generate (slen + klen - 1) f
where
f n =
foldl'
(\a k ->
a <+>
Vector.unsafeIndex signal k <.>
Vector.unsafeIndex kernel (n - k))
zero
[kmin .. kmax]
where
!kmin = max 0 (n - (klen - 1))
!kmax = min n (slen - 1)
!slen = Vector.length signal
!klen = Vector.length kernel
```

As has been observed before (Rivas, Jaskelioff, and Schrijvers 2015) there’s a pretty suggestive similarity between semirings and the `Applicative`

/`Alternative`

classes in Haskell:

```
class Semiring a where
one :: a
zero :: a
(<+>) :: a -> a -> a
(<.>) :: a -> a -> a
class Applicative f where
pure :: a -> f a
(<*>) :: f (a -> b) -> f a -> f b
class Alternative f where
empty :: f a
(<|>) :: f a -> f a -> f a
```

So can our implementation of convolution be used to implement the methods for these classes? Partially:

```
newtype Search f a = Search { runSearch :: [f a] }
instance Functor f => Functor (Search f) where
fmap f (Search xs) = Search ((fmap.fmap) f xs)
instance Alternative f => Applicative (Search f) where
pure x = Search [pure x]
_ <*> Search [] = Search []
Search xs <*> Search ys = Search (foldr f [] xs) where
f x zs = foldr (g x) id ys (empty : zs)
g x y a (z:zs) = (x <*> y <|> z) : a zs
g x y a [] = (x <*> y) : a []
instance Alternative f => Alternative (Search f) where
Search xs <|> Search ys = Search (go xs ys) where
go [] ys = ys
go xs [] = xs
go (x:xs) (y:ys) = (x <|> y) : go xs ys
empty = Search []
```

At first, this seems perfect: the types all match up, and the definitions seem sensible. The issue is with the laws: `Applicative`

and `Alternative`

are missing *four* that semirings require. In particular: commutativity of plus, annihilation by zero, and distributivity left and right:

```
xs <|> ys = ys <|> xs
empty <*> xs = fs <*> empty = empty
fs <*> (xs <|> ys) = fs <*> xs <|> fs <*> ys
(fs <|> gs) <*> xs = fs <*> xs <|> gs <*> ys
```

The vast majority of the instances of `Alternative`

today fail one or more of these laws. Taking lists as an example, `++`

obviously isn’t commutative, and `<*>`

only distributes when it’s on the right.

What’s the problem, though? Polynomial multiplication follows *more* laws than those required by `Applicative`

: why should that worry us? Unfortunately, in order for multiplication to follow those laws, it actually relies on the underlying semiring being law-abiding. And it *fails* the applicative laws when it isn’t.

There are two angles from which we could come at this problem: either we relax the semiring laws and try and make our implementation of convolution rely on them as little as possible, or we find `Alternative`

instances which follow the semiring laws. Or we could meet in the middle, relaxing the laws as much as possible until we find some `Alternative`

s that meet our standards.

This has actually been accomplished in several papers: the previously mentioned Rivas, Jaskelioff, and Schrijvers (2015) discusses near-semirings, defined as semiring-like structures with associativity, identity, and these two laws:

$0 \times x = 0$ $(x \plus y) \times z = (x \times z) \plus (y \times z)$

In contrast to normal semirings, zero only annihilates when it’s on the left, and multiplication only distributes over addition when it’s on the right. Addition is not required to be commutative.

The lovely paper Spivey (2009) has a similar concept: a “bunch”.

```
class Bunch m where
return :: a -> m a
(>>=) :: m a -> (a -> m b) -> m b
zero :: m a
(<|>) :: m a -> m a -> m a
wrap :: m a -> m a
```

The laws are all the same (with `<*>`

implemented in terms of `>>=`

), and the extra `wrap`

operation can be expressed like so:

```
wrap :: Alternative f => Search f a -> Search f a
wrap (Search xs) = Search (empty : xs)
```

A definition of `>>=`

for our polynomials is also provided:

```
[] >>= _ = []
(x:xs) >>= f = foldr (<|>) empty (fmap f x) <|> wrap (xs >>= f)
```

This will require the underlying `f`

to be `Foldable`

. We can inline a little, and express the whole thing as a fold:

```
instance (Foldable f, Alternative f) => Monad (Search f) where
Search xs >>= k = foldr f empty xs where
f e a = foldr ((<|>) . k) (wrap a) e
```

For `Search`

to meet the requirements of a bunch, the paper notes that the `f`

must be assumed to be a bag, i.e., the order of its elements must be ignored.

Kiselyov et al. (2005) kind of goes the other direction, defining a monad which has fair disjunction and conjunction. Unfortunately, the fair conjunction loses associativity.

The end of the paper on algebras for combinatorial search wonders if notions of distance could be added to some of the algebras. I *think* that should be as simple as supplying a suitable near-semiring for `f`

, but the definition of `>>=`

would need to be changed. The near-semiring I had in mind was the probability monad. It works correctly if inlined:

```
newtype Search s a = Search { runSearch :: [[(a,s)]] }
instance Functor (Search s) where
fmap f (Search xs) = Search ((fmap.fmap.first) f xs)
instance Semiring s => Applicative (Search s) where
pure x = Search [[(x,one)]]
_ <*> Search [] = Search []
Search xs <*> Search ys = Search (foldr f [] xs) where
f x zs = foldr (g x) id ys (empty : zs)
g x y a (z:zs) = (m x y ++ z) : a zs
g x y a [] = (m x y) : a []
m ls rs = [(l r, lp<.>rp) | (l,lp) <- ls, (r,rp) <- rs]
instance Semiring s => Alternative (Search s) where
Search xs <|> Search ys = Search (go xs ys) where
go [] ys = ys
go xs [] = xs
go (x:xs) (y:ys) = (x ++ y) : go xs ys
empty = Search []
wrap :: Search s a -> Search s a
wrap (Search xs) = Search ([] : xs)
instance Semiring s => Monad (Search s) where
Search xs >>= k = foldr f empty xs where
f e a = foldr ((<|>) . uncurry (mulIn . k)) (wrap a) e
mulIn (Search x) xp = Search ((fmap.fmap.fmap) (xp<.>) x)
```

But I couldn’t figure out how to get it to work for a more generalized inner monad. The above could probably be sped up, or randomized, using the many well-known techniques for probability monad optimization.

Dolan, Stephen. 2013. “Fun with semirings: A functional pearl on the abuse of linear algebra.” In, 48:101. ACM Press. doi:10.1145/2500365.2500613. https://www.cl.cam.ac.uk/~sd601/papers/semirings.pdf.

Kiselyov, Oleg, Chung-chieh Shan, Daniel P Friedman, and Amr Sabry. 2005. “Backtracking, interleaving, and terminating monad transformers (functional pearl).” *ACM SIGPLAN Notices* 40 (9): 192–203. http://okmij.org/ftp/Computation/monads.html#LogicT.

McBride, Conor, and Ross Paterson. 2008. “Applicative programming with effects.” *Journal of functional programming* 18 (01): 1–13. http://strictlypositive.org/Idiom.pdf.

McIlroy, M. Douglas. 1999. “Power Series, Power Serious.” *J. Funct. Program.* 9 (3) (May): 325–337. doi:10.1017/S0956796899003299. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.333.3156&rep=rep1&type=pdf.

Okasaki, Chris. 1999. “From Fast Exponentiation to Square Matrices: An Adventure in Types.” In *Proceedings of the ACM SIGPLAN International Conference on Functional Programming (ICFP’99), Paris, France, September 27-29, 1999*, 34:28. ACM. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.456.357&rep=rep1&type=pdf.

Rivas, Exequiel, Mauro Jaskelioff, and Tom Schrijvers. 2015. “From monoids to near-semirings: The essence of MonadPlus and Alternative.” In *Proceedings of the 17th International Symposium on Principles and Practice of Declarative Programming*, 196–207. ACM. doi:10.1145/2790449.2790514. http://www.fceia.unr.edu.ar/~mauro/pubs/FromMonoidstoNearsemirings.pdf.

Spivey, J. Michael. 2009. “Algebras for combinatorial search.” *Journal of Functional Programming* 19 (3-4) (July): 469–487. doi:10.1017/S0956796809007321. https://pdfs.semanticscholar.org/db3e/373bb6e7e7837ebc524da0a25903958554ed.pdf.

Tags: Haskell

There are a couple partial functions in the Haskell Prelude which people seem to agree shouldn’t be there. `head`

, for example, will throw an error on an empty list. Most seem to agree that it should work something more like this:

```
head :: Foldable f => f a -> Maybe a
head = foldr (const . Just) Nothing
```

There are other examples, like `last`

, `!!`

, etc.

One which people *don’t* agree on, however, is division by zero. In the current Prelude, the following will throw an error:

`1 / 0`

The “safe” version might have a signature like this:

`(/) :: Fractional a => a -> a -> Maybe a`

However, this turns out to be quite a headache for writing code generally. So the default is the (somewhat) unsafe version.

Is there a way to introduce a safer version without much overhead, so the programmer is given the option? Of course! With some newtype magic, it’s pretty simple to write a wrapper which catches division by zero in some arbitrary monad:

```
newtype AppNum f a = AppNum
{ runAppNum :: f a
} deriving (Functor,Applicative,Monad,Alternative,Show,Eq,MonadFail)
instance (Num a, Applicative f) =>
Num (AppNum f a) where
abs = fmap abs
signum = fmap signum
(+) = liftA2 (+)
(*) = liftA2 (*)
(-) = liftA2 (-)
negate = fmap negate
fromInteger = pure . fromInteger
instance (Fractional a, MonadFail f, Eq a) =>
Fractional (AppNum f a) where
fromRational = pure . fromRational
xs / ys =
ys >>=
\case
0 -> fail "divide by zero"
y -> fmap (/ y) xs
```

I’m using the `-XLambdaCase`

extension and `MonadFail`

here.

You’ll notice that you only need `Applicative`

for most of the arithmetic operations above. In fact, you only need `Monad`

when you want to examine the contents of `f`

. Using that fact, we can manipulate expression trees using the free applicative from the free package. Say, for instance, we want to have free variables in our expressions. Using `Either`

, it’s pretty easy:

```
type WithVars = AppNum (Ap (Either String)) Integer
var :: String -> WithVars
var = AppNum . liftAp . Left
```

We can collect the free variables from an expression:

```
vars :: WithVars -> [String]
vars = runAp_ (either pure (const [])) . runAppNum
x = 1 :: WithVars
y = var "y"
z = var "z"
vars (x + y + z) -- ["y","z"]
```

If we want to sub in, though, we’re going to run into a problem: we can’t just pass in a `Map String Integer`

because you’re able to construct values like this:

```
bad :: AppNum (Ap (Either String)) (Integer -> Integer -> Integer)
bad = AppNum (liftAp (Left "oh noes"))
```

We’d need to pass in a `Map String (Integer -> Integer -> Integer)`

as well; in fact you’d need a map for every possible type. Which isn’t feasible.

Luckily, we *can* constrain the types of variables in our expression so that they’re always `Integer`

, using a GADT:

```
data Variable a where
Constant :: a -> Variable a
Variable :: String -> Variable Integer
```

The type above seems useless on its own: it doesn’t have a `Functor`

instance, never mind an `Applicative`

, so how can it fit into `AppNum`

?

The magic comes from the free applicative, which converts any type of kind `Type -> Type`

into an applicative. With that in mind, we can change around the previous code:

```
type WithVars = AppNum (Ap Variable) Integer
var :: String -> WithVars
var = AppNum . liftAp . Variable
vars :: WithVars -> [String]
vars = runAp_ f . runAppNum
where
f :: Variable a -> [String]
f (Constant _) = []
f (Variable s) = [s]
```

And write the function to sub in for us:

```
variableA
:: Applicative f
=> (String -> f Integer) -> Variable a -> f a
variableA _ (Constant x) = pure x
variableA f (Variable s) = f s
variable :: (String -> Integer) -> Variable a -> a
variable _ (Constant x) = x
variable f (Variable s) = f s
replace :: Map String Integer -> WithVars -> Integer
replace m = runAp (variable (m Map.!)) . runAppNum
replace (Map.fromList [("z",2), ("y",3)]) (x + y + z)
-- 6
```

This will fail if a free variable isn’t present in the map, unfortunately. To fix it, we *could* use `Either`

instead of `Identity`

:

```
replace :: Map String Integer -> WithVars -> Either String Integer
replace m =
runAp
(variableA $
\s ->
maybe (Left s) Right (Map.lookup s m)) .
runAppNum
```

But this only gives us the first missing variable encountered. We’d like to get back *all* of the missing variables, ideally: accumulating the `Left`

s. `Either`

doesn’t accumulate values, as if it did it would break the monad laws.

There’s no issue with the *applicative* laws, though, which is why the validation package provides a *non-monadic* either-like type, which we can use here.

```
replace :: Map String Integer -> WithVars -> AccValidation [String] Integer
replace m =
runAp
(variableA $
\s ->
maybe (AccFailure [s]) pure (Map.lookup s m)) .
runAppNum
replace (Map.fromList []) (x + y + z)
-- AccFailure ["y","z"]
```

There are a bunch more applicatives you could use instead of `Either`

. Using lists, for instance, you could calculate the possible outcomes from a range of inputs:

```
range :: WithVars -> [Integer]
range = runAp (variable (const [1..3])) . runAppNum
range (x + y + z)
-- [3,4,5,4,5,6,5,6,7]
```

Or you could ask the user for input:

```
query :: WithVars -> IO Integer
query = runAp (variable f) . runAppNum
where
f s = do
putStr "Input a value for "
putStrLn s
fmap read getLine
```

Finally, and this one’s a bit exotic, you could examine every variable in turn, with defaults for the others:

```
zygo
:: (forall x. f x -> x)
-> (forall x. f x -> (x -> a) -> b)
-> Ap f a
-> [b]
zygo (l :: forall x. f x -> x) (c :: forall x. f x -> (x -> a) -> b) =
fst . go id
where
go :: forall c. (c -> a) -> Ap f c -> ([b], c)
go _ (Pure x) = ([], x)
go k (Ap x f) = (c x (k . ls) : xs, ls lx)
where
(xs,ls) = go (k . ($ lx)) f
lx = l x
examineEach :: WithVars -> [Integer -> Integer]
examineEach = zygo (variable (const 1)) g . runAppNum
where
g :: Variable a -> (a -> b) -> Integer -> b
g (Constant x) rhs _ = rhs x
g (Variable _) rhs i = rhs i
```

This produces a list of functions which are equivalent to subbing in for each variable with the rest set to 1.

]]>
Tags: Haskell, Dependent Types

A while ago I read this post on reddit (by David Feuer), about sorting traversables (which was a follow-up on this post by Will Fancher), and I was inspired to write some pseudo-dependently-typed Haskell. The post (and subsequent library) detailed how to use size-indexed heaps to perform fast, total sorting on any traversable. I ended up with a library which has five size-indexed heaps (Braun, pairing, binomial, skew, and leftist), each verified for structural correctness. I also included the non-indexed implementations of each for comparison (as well as benchmarks, tests, and all that good stuff).

The purpose of this post is to go through some of the tricks I used and problems I encountered writing a lot of type-level code in modern Haskell.

In order to index things by their size, we’ll need a type-level representation of size. We’ll use Peano numbers for now:

`data Peano = Z | S Peano`

`Z`

stands for zero, and `S`

for successor. The terseness is pretty necessary here, unfortunately: arithmetic becomes unreadable otherwise. The simplicity of this definition is useful for proofs and manipulation; however any runtime representation of these numbers is going to be woefully slow.

With the `DataKinds`

extension, the above is automatically promoted to the type-level, so we can write type-level functions (type families) on the `Peano`

type:

```
type family Plus (n :: Peano) (m :: Peano) :: Peano where
Plus Z m = m
Plus (S n) m = S (Plus n m)
```

Here the `TypeFamilies`

extension is needed. I’ll try and mention every extension I’m using as we go, but I might forget a few, so check the repository for all of the examples (quick aside: I *did* manage to avoid using `UndecidableInstances`

, but more on that later). One pragma that’s worth mentioning is:

`{-# OPTIONS_GHC -fno-warn-unticked-promoted-constructors #-}`

This suppresses warnings on the definition of `Plus`

above. Without it, GHC would want us to write:

```
type family Plus (n :: Peano) (m :: Peano) :: Peano where
Plus 'Z m = m
Plus ('S n) m = 'S (Plus n m)
```

I think that looks pretty ugly, and it can get much worse with more involved arithmetic. The only thing I have found the warnings useful for is `[]`

: the type-level empty list gives an error in its unticked form.

In the original post, a pairing heap (Fredman et al. 1986) was used, for its simplicity and performance. The implementation looked like this:

```
data Heap n a where
E :: Heap Z a
T :: a -> HVec n a -> Heap (S n) a
data HVec n a where
HNil :: HVec Z a
HCons :: Heap m a -> HVec n a -> HVec (Plus m n) a
```

You immediately run into trouble when you try to define merge:

```
merge :: Ord a => Heap m a -> Heap n a -> Heap (Plus m n) a
merge E ys = ys
merge xs E = xs
merge h1@(T x xs) h2@(T y ys)
| x <= y = T x (HCons h2 xs)
| otherwise = T y (HCons h1 ys)
```

Three errors show up here, but we’ll look at the first one:

`Could not deduce (m ~ (Plus m Z))`

GHC doesn’t know that $x = x + 0$. Somehow, we’ll have to *prove* that it does.

In a language with true dependent types, proving the proposition above is as simple as:

```
plusZeroNeutral : (n : Nat) -> n + 0 = n
plusZeroNeutral Z = Refl
plusZeroNeutral (S k) = cong (plusZeroNeutral k)
```

(this example is in Idris)

In Haskell, on the other hand, we can’t do the same: functions on the value-level `Peano`

have no relationship with functions on the type-level `Peano`

. There’s no way to automatically link or promote one to the other.

This is where singletons come in (Eisenberg and Weirich 2012). A singleton is a datatype which mirrors a type-level value exactly, except that it has a type parameter which matches the equivalent value on the type-level. In this way, we can write functions on the value-level which are linked to the type-level. Here’s a potential singleton for `Peano`

:

```
data Natty n where
Zy :: Natty Z
Sy :: Natty n -> Natty (S n)
```

(we need `GADTs`

for this example)

Now, when we pattern-match on `Natty`

, we get a proof of whatever its type parameter was. Here’s a trivial example:

```
isZero :: Natty n -> Maybe (n :~: Z)
isZero Zy = Just Refl
isZero (Sy _) = Nothing
```

When we match on `Zy`

, the *only value* which `n`

could have been is `Z`

, because the only way to construct `Zy`

is if the type parameter is `Z`

.

Using this technique, the `plusZeroNeutral`

proof looks reasonably similar to the Idris version:

```
plusZeroNeutral :: Natty n -> Plus n Z :~: n
plusZeroNeutral Zy = Refl
plusZeroNeutral (Sy n) = case plusZeroNeutral n of
Refl -> Refl
```

To generalize the singletons a little, we could probably use the singletons library, or we could roll our own:

```
data family The k :: k -> Type
data instance The Peano n where
Zy :: The Peano Z
Sy :: The Peano n -> The Peano (S n)
plusZeroNeutral :: The Peano n -> Plus n Z :~: n
plusZeroNeutral Zy = Refl
plusZeroNeutral (Sy n) = case plusZeroNeutral n of
Refl -> Refl
```

The `The`

naming is kind of cute, I think. It makes the signature look *almost* like the Idris version (`the`

is a function from the Idris standard library). The `The`

type family requires the `TypeInType`

extension, which I’ll talk a little more about later.

There’s an issue with these kinds of proofs: the proof code runs *every time* it is needed. Since the same value is coming out the other end each time (`Refl`

), this seems wasteful.

In a language like Idris, this problem is avoided by noticing that you’re only using the proof for its type information, and then erasing it at runtime. In Haskell, we can accomplish the same with a rule:

```
{-# NOINLINE plusZeroNeutral #-}
{-# RULES
"plusZeroNeutral" forall x. plusZeroNeutral x
= unsafeCoerce (Refl :: 'Z :~: 'Z)
#-}
```

This basically says “if this type-checks, then the proof must exist, and therefore the proof must be valid. So don’t bother running it”. Unfortunately, that’s a *little bit* of a lie. It’s pretty easy to write a proof which type-checks that *isn’t* valid:

```
falseIsTrue :: False :~: True
falseIsTrue = falseIsTrue
```

We won’t be able to perform computations which rely on this proof in Haskell, though: because the computation will never terminate, the proof will never provide an answer. This means that, while the proof isn’t valid, it *is* type safe. That is, of course, unless we use our manual proof-erasure technique. The `RULES`

pragma will happily replace it with the `unsafeCoerce`

version, effectively introducing unsoundness into our proofs. The reason that this doesn’t cause a problem for language like Idris is that Idris has a totality checker: you *can’t* write the above definition (with the totality checker turned on) in Idris.

So what’s the solution? Do we have to suffer through the slower proof code to maintain correctness? In reality, it’s usually OK to assume termination. It’s pretty easy to see that a proof like `plusZeroNeutral`

is total. It’s worth bearing in mind, though, that until Haskell gets a totality checker (likely never, apparently) these proofs aren’t “proper”.

One extra thing: while you’re proving things in one area of your code, you might not have the relevant singleton handy. To generate them on-demand, you’ll need a typeclass:

```
class KnownSing (x :: k) where
sing :: The k x
instance KnownSing Z where
sing = Zy
instance KnownSing n => KnownSing (S n) where
sing = Sy sing
```

This kind of drives home the inefficiency of singleton-based proofs, and why it’s important to erase them aggressively.

One other way to solve these problems is to try find a data structure which runs the proof code anyway. As an example, consider a length-indexed list:

```
infixr 5 :-
data List n a where
Nil :: List Z a
(:-) :: a -> List n a -> List (S n) a
```

You might worry that concatenation of two lists requires some expensive proof code, like `merge`

for the pairing heap. Maybe surprisingly, the default implementation just works:

```
infixr 5 ++
(++) :: List n a -> List m a -> List (Plus n m) a
(++) Nil ys = ys
(++) (x :- xs) ys = x :- xs ++ ys
```

Why? Well, if you look back to the definition of `Plus`

, it’s almost exactly the same as the definition of `(++)`

. In effect, we’re using *lists* as the singleton for `Peano`

here.

The question is, then: is there a heap which performs these proofs automatically for functions like merge? As far as I can tell: *almost*. First though:

The standard definition of `++`

on normal lists can be cleaned up a little with `foldr`

```
(++) :: [a] -> [a] -> [a]
(++) = flip (foldr (:))
```

Can we get a similar definition for our length-indexed lists? Turns out we can, but the type of `foldr`

needs to be a little different:

```
foldrList :: (forall x. a -> b x -> b (S x))
-> b m -> List n a -> b (n + m)
foldrList f b Nil = b
foldrList f b (x :- xs) = f x (foldrList f b xs)
newtype Flip (f :: t -> u -> Type) (a :: u) (b :: t)
= Flip { unFlip :: f b a }
foldrList1 :: (forall x. a -> b x c -> b (S x) c)
-> b m c -> List n a -> b (n + m) c
foldrList1 f b
= unFlip . foldrList (\e -> Flip . f e . unFlip) (Flip b)
infixr 5 ++
(++) :: List n a -> List m a -> List (n + m) a
(++) = flip (foldrList1 (:-))
```

So what’s the point of this more complicated version? Well, if this were normal Haskell, we might get some foldr-fusion or something (in reality we would probably use `augment`

if that were the purpose).

With this type-level business, though, there’s a similar application: loop unrolling. Consider the natural-number type again. We can write a typeclass which will perform induction over them:

```
class KnownPeano (n :: Peano) where
unrollRepeat :: Proxy n -> (a -> a) -> a -> a
instance KnownPeano Z where
unrollRepeat _ = const id
{-# INLINE unrollRepeat #-}
instance KnownPeano n =>
KnownPeano (S n) where
unrollRepeat (_ :: Proxy (S n)) f x =
f (unrollRepeat (Proxy :: Proxy n) f x)
{-# INLINE unrollRepeat #-}
```

Because the recursion here calls a different `unrollRepeat`

function in the “recursive” call, we get around the usual hurdle of not being able to inline recursive calls. That means that the whole loop will be unrolled, at compile-time. We can do the same for foldr:

```
class HasFoldr (n :: Peano) where
unrollFoldr
:: (forall x. a -> b x -> b (S x))
-> b m
-> List n a
-> b (n + m)
instance HasFoldr Z where
unrollFoldr _ b _ = b
{-# INLINE unrollFoldr #-}
instance HasFoldr n => HasFoldr (S n) where
unrollFoldr f b (x :- xs) = f x (unrollFoldr f b xs)
{-# INLINE unrollFoldr #-}
```

I can’t think of many uses for this technique, but one that comes to mind is an n-ary uncurry (like Lisp’s apply):

```
infixr 5 :-
data List (xs :: [*]) where
Nil :: List '[]
(:-) :: a -> List xs -> List (a ': xs)
class KnownList (xs :: [*]) where
foldrT
:: (forall y ys. y -> result ys -> result (y ': ys))
-> result '[]
-> List xs
-> result xs
instance KnownList ('[] :: [*]) where
foldrT _ = const
{-# INLINE foldrT #-}
instance KnownList xs =>
KnownList (x ': xs) where
foldrT f b (x :- xs) = f x (foldrT f b xs)
{-# INLINE foldrT #-}
type family Func (xs :: [*]) (y :: *) where
Func '[] y = y
Func (x ': xs) y = x -> Func xs y
newtype FunType y xs = FunType
{ runFun :: Func xs y -> y
}
uncurry
:: KnownList xs
=> Func xs y -> List xs -> y
uncurry f l =
runFun
(foldrT
(c (\x g h -> g (h x)))
(FunType id)
l)
f
where
c :: (a -> ((Func xs y -> y) -> (Func zs z -> z)))
-> (a -> (FunType y xs -> FunType z zs))
c = coerce
{-# INLINE c #-}
{-# INLINE uncurry #-}
```

I *think* that you can be guaranteed the above is inlined at compile-time, making it essentially equivalent to a handwritten `uncurry`

.

Anyway, back to the size-indexed heaps. The reason that `(++)`

worked so easily on lists is that a list can be thought of as the data-structure equivalent to Peano numbers. Another numeric-system-based data structure is the binomial heap, which is based on binary numbering (I’m going mainly off of the description from Hinze 1999).

So, to work with binary numbers, let’s get some preliminaries on the type-level out of the way:

```
data instance The Bool x where
Falsy :: The Bool False
Truey :: The Bool True
data instance The [k] xs where
Nily :: The [k] '[]
Cony :: The k x -> The [k] xs -> The [k] (x : xs)
instance KnownSing True where
sing = Truey
instance KnownSing False where
sing = Falsy
instance KnownSing '[] where
sing = Nily
instance (KnownSing xs, KnownSing x) =>
KnownSing (x : xs) where
sing = Cony sing sing
```

We’ll represent a binary number as a list of Booleans:

```
type family Sum (x :: Bool) (y :: Bool) (cin :: Bool) :: Bool where
Sum False False False = False
Sum False False True = True
Sum False True False = True
Sum False True True = False
Sum True False False = True
Sum True False True = False
Sum True True False = False
Sum True True True = True
type family Carry (x :: Bool) (y :: Bool) (cin :: Bool)
(xs :: [Bool]) (ys :: [Bool]) :: [Bool] where
Carry False False False xs ys = Add False xs ys
Carry False False True xs ys = Add False xs ys
Carry False True False xs ys = Add False xs ys
Carry False True True xs ys = Add True xs ys
Carry True False False xs ys = Add False xs ys
Carry True False True xs ys = Add True xs ys
Carry True True False xs ys = Add True xs ys
Carry True True True xs ys = Add True xs ys
type family Add (cin :: Bool) (xs :: [Bool]) (ys :: [Bool]) ::
[Bool] where
Add c (x : xs) (y : ys) = Sum x y c : Carry x y c xs ys
Add False '[] ys = ys
Add False xs '[] = xs
Add True '[] ys = CarryOne ys
Add True xs '[] = CarryOne xs
type family CarryOne (xs :: [Bool]) :: [Bool] where
CarryOne '[] = True : '[]
CarryOne (False : xs) = True : xs
CarryOne (True : xs) = False : CarryOne xs
```

The odd definition of `Carry`

is to avoid `UndecidableInstances`

: if we had written, instead:

```
type family Carry (x :: Bool) (y :: Bool) (cin :: Bool) :: Bool where
Carry False False False = False
Carry False False True = False
Carry False True False = False
Carry False True True = True
Carry True False False = False
Carry True False True = True
Carry True True False = True
Carry True True True = True
type family Add (cin :: Bool) (xs :: [Bool]) (ys :: [Bool]) ::
[Bool] where
Add c (x : xs) (y : ys) = Sum x y c : Add (Carry x y c) xs ys
Add False '[] ys = ys
Add False xs '[] = xs
Add True '[] ys = CarryOne ys
Add True xs '[] = CarryOne xs
```

We would have been warned about nested type-family application.

Now we can base the merge function very closely on these type families. First, though, we’ll have to implement the heap.

There are different potential properties you can verify in a data structure. In the sort-traversable post, the property of interest was that the number of elements in the structure would stay the same after adding and removing some number $n$ of elements. For this post, we’ll also verify structural invariants. I won’t, however, verify the heap property. Maybe in a later post.

When indexing a data structure by its size, you encode an awful lot of information into the type signature: the type becomes very *specific* to the structure in question. It is possible, though, to encode a fair few structural invariants *without* getting so specific. Here’s a signature for “perfect leaf tree”:

`data BalTree a = Leaf a | Node (BalTree (a,a))`

With that signature, it’s *impossible* to create a tree with more elements in its left branch than its right; the size of the tree, however, remains unspecified. You can use a similar trick to implement matrices which must be square (from Okasaki 1999): the usual trick (`type Matrix n a = List n (List n a)`

) is too specific, providing size information at compile-time. If you’re interested in this approach, there are several more examples in Hinze (2001).

It is possible to go from the size-indexed version back to the non-indexed version, with an existential (`RankNTypes`

for this example):

```
data ErasedSize f a = forall (n :: Peano). ErasedSize
{ runErasedSize :: f n a
}
```

This will let you prove invariants in your implementation using an index, while keeping the user-facing type signature general and non-indexed.

Wasserman (2010), was able to encode all of the structural invariants of the binomial heap *without* indexing by its size (well, all invariants except truncation, which turned out to be important a little later). I’ll be using a similar approach, except I’ll leverage some of the newer bells and whistles in GHC. Where Wasserman’s version used types like this for the numbering:

```
data Zero a = Zero
data Succ rk a = BinomTree rk a :< rk a
data BinomTree rk a = BinomTree a (rk a)
```

We can reuse the type-level Peano numbers with a GADT:

```
infixr 5 :-
data Binomial xs rk a where
Nil :: Binomial '[] n a
Skip :: Binomial xs (S rk) a -> Binomial (False : xs) rk a
(:-) :: Tree rk a
-> Binomial xs (S rk) a
-> Binomial (True : xs) rk a
data Tree rk a = Root a (Node rk a)
infixr 5 :<
data Node n a where
NilN :: Node Z a
(:<) :: Tree n a -> Node n a -> Node (S n) a
```

The definition of `Tree`

here ensures that any tree of rank $n$ has $2^n$ elements. The binomial heap, then, is a list of trees, in ascending order of size, with a `True`

at every point in its type-level list where a tree is present, and a `False`

wherever one is absent. In other words, the type-level list is a binary encoding of the number of elements it contains.

And here are the merge functions:

```
mergeTree :: Ord a => Tree rk a -> Tree rk a -> Tree (S rk) a
mergeTree xr@(Root x xs) yr@(Root y ys)
| x <= y = Root x (yr :< xs)
| otherwise = Root y (xr :< ys)
merge
:: Ord a
=> Binomial xs z a
-> Binomial ys z a
-> Binomial (Add False xs ys) z a
merge Nil ys = ys
merge xs Nil = xs
merge (Skip xs) (Skip ys) = Skip (merge xs ys)
merge (Skip xs) (y :- ys) = y :- merge xs ys
merge (x :- xs) (Skip ys) = x :- merge xs ys
merge (x :- xs) (y :- ys) = Skip (mergeCarry (mergeTree x y) xs ys)
mergeCarry
:: Ord a
=> Tree rk a
-> Binomial xs rk a
-> Binomial ys rk a
-> Binomial (Add True xs ys) rk a
mergeCarry t Nil ys = carryOne t ys
mergeCarry t xs Nil = carryOne t xs
mergeCarry t (Skip xs) (Skip ys) = t :- merge xs ys
mergeCarry t (Skip xs) (y :- ys) = Skip (mergeCarry (mergeTree t y) xs ys)
mergeCarry t (x :- xs) (Skip ys) = Skip (mergeCarry (mergeTree t x) xs ys)
mergeCarry t (x :- xs) (y :- ys) = t :- mergeCarry (mergeTree x y) xs ys
carryOne
:: Ord a
=> Tree rk a -> Binomial xs rk a -> Binomial (CarryOne xs) rk a
carryOne t Nil = t :- Nil
carryOne t (Skip xs) = t :- xs
carryOne t (x :- xs) = Skip (carryOne (mergeTree t x) xs)
```

You’ll notice that no proofs are needed: that’s because the merge function itself is the same as the type family, like the way `++`

for lists was the same as the `Plus`

type family.

Of course, this structure is only verified insofar as you believe the type families. It does provide a degree of double-entry, though: any mistake in the type family will have to be mirrored in the merge function to type-check. On top of that, we can write some proofs of properties we might expect:

```
addCommutes
:: The [Bool] xs
-> The [Bool] ys
-> Add False xs ys :~: Add False ys xs
addCommutes Nily _ = Refl
addCommutes _ Nily = Refl
addCommutes (Cony Falsy xs) (Cony Falsy ys) =
gcastWith (addCommutes xs ys) Refl
addCommutes (Cony Truey xs) (Cony Falsy ys) =
gcastWith (addCommutes xs ys) Refl
addCommutes (Cony Falsy xs) (Cony Truey ys) =
gcastWith (addCommutes xs ys) Refl
addCommutes (Cony Truey xs) (Cony Truey ys) =
gcastWith (addCommutesCarry xs ys) Refl
addCommutesCarry
:: The [Bool] xs
-> The [Bool] ys
-> Add True xs ys :~: Add True ys xs
addCommutesCarry Nily _ = Refl
addCommutesCarry _ Nily = Refl
addCommutesCarry (Cony Falsy xs) (Cony Falsy ys) =
gcastWith (addCommutes xs ys) Refl
addCommutesCarry (Cony Truey xs) (Cony Falsy ys) =
gcastWith (addCommutesCarry xs ys) Refl
addCommutesCarry (Cony Falsy xs) (Cony Truey ys) =
gcastWith (addCommutesCarry xs ys) Refl
addCommutesCarry (Cony Truey xs) (Cony Truey ys) =
gcastWith (addCommutesCarry xs ys) Refl
```

Unfortunately, though, this method *does* require proofs (ugly proofs) for the delete-min operation. One of the issues is truncation: since the binary digits are stored least-significant-bit first, the same number can be represented with any number of trailing zeroes. This kept causing problems for me when it came to subtraction, and adding the requirement of no trailing zeroes (truncation) to the constructors for the heap was a pain, requiring extra proofs on merge to show that it preserves truncation.

Since some of these properties are much easier to verify on the type-level Peano numbers, one approach might be to convert back and forth between Peano numbers and binary, and use the proofs on Peano numbers instead.

```
type family BintoPeano (xs :: [Bool]) :: Peano where
BintoPeano '[] = Z
BintoPeano (False : xs) = BintoPeano xs + BintoPeano xs
BintoPeano (True : xs) = S (BintoPeano xs + BintoPeano xs)
```

First problem: this requires `UndecidableInstances`

. I’d *really* rather not have that turned on, to be honest. In Idris (and Agda), you can *prove* decidability using a number of different methods, but this isn’t available in Haskell yet.

Regardless, we can push on.

To go in the other direction, we’ll need to calculate the parity of natural numbers. Taken from the Idris tutorial:

```
data Parity (n :: Peano) where
Even :: The Peano n -> Parity (n + n)
Odd :: The Peano n -> Parity (S (n + n))
parity :: The Peano n -> Parity n
parity Zy = Even Zy
parity (Sy Zy) = Odd Zy
parity (Sy (Sy n)) = case parity n of
Even m -> gcastWith (plusSuccDistrib m m) (Even (Sy m))
Odd m -> gcastWith (plusSuccDistrib m m) (Odd (Sy m))
plusSuccDistrib :: The Peano n -> proxy m -> n + S m :~: S (n + m)
plusSuccDistrib Zy _ = Refl
plusSuccDistrib (Sy n) p = gcastWith (plusSuccDistrib n p) Refl
```

We need this function on the type-level, though, not the value-level: here, again, we run into trouble. What does `gcastWith`

look like on the type-level? As far as I can tell, it doesn’t exist (yet. Although I haven’t looked deeply into the singletons library yet).

This idea of doing dependently-typed stuff on the type-level *started* to be possible with `TypeInType`

. For instance, we could have defined our binary type as:

```
data Binary :: Peano -> Type where
O :: Binary n -> Binary (n + n)
I :: Binary n -> Binary (S (n + n))
E :: Binary Z
```

And then the binomial heap as:

```
data Binomial (xs :: Binary n) (rk :: Peano) (a :: Type) where
Nil :: Binomial E n a
Skip :: Binomial xs (S rk) a -> Binomial (O xs) rk a
(:-) :: Tree rk a
-> Binomial xs (S rk) a
-> Binomial (I xs) rk a
```

What we’re doing here is indexing a type *by an indexed type*. This wasn’t possible in Haskell a few years ago. It still doesn’t get us a nice definition of subtraction, though.

It’s pretty clear that this approach gets tedious almost immediately. What’s more, if we want the proofs to be erased, we introduce potential for errors.

The solution? Beef up GHC’s typechecker with a plugin. I first came across this approach in Kenneth Foner’s talk at Compose. He used a plugin that called out to the Z3 theorem prover (from Diatchki 2015); I’ll use a simpler plugin which just normalizes type-literals.

From what I’ve used of these plugins so far, they seem to work really well. They’re very unobtrusive, only requiring a pragma at the top of your file:

`{-# OPTIONS_GHC -fplugin GHC.TypeLits.Normalise #-}`

The plugin is only called when GHC can’t unify two types: this means you don’t get odd-looking error messages in unrelated code (in fact, the error messages I’ve seen so far have been excellent—a real improvement on the standard error messages for type-level arithmetic). Another benefit is that we get to use type-level literals (`Nat`

imported from GHC.TypeLits), rather then the noisy-looking type-level Peano numbers.

```
data Tree n a = Root a (Node n a)
data Node :: Nat -> Type -> Type where
NilN :: Node 0 a
(:<) :: {-# UNPACK #-} !(Tree n a)
-> Node n a
-> Node (1 + n) a
mergeTree :: Ord a => Tree n a -> Tree n a -> Tree (1 + n) a
mergeTree xr@(Root x xs) yr@(Root y ys)
| x <= y = Root x (yr :< xs)
| otherwise = Root y (xr :< ys)
infixr 5 :-
data Binomial :: Nat -> Nat -> Type -> Type where
Nil :: Binomial n 0 a
(:-) :: {-# UNPACK #-} !(Tree z a)
-> Binomial (1 + z) xs a
-> Binomial z (1 + xs + xs) a
Skip :: Binomial (1 + z) (1 + xs) a
-> Binomial z (2 + xs + xs) a
```

This definition also ensures that the binomial heap has no trailing zeroes in its binary representation: the `Skip`

constructor can only be applied to a heap bigger than zero.

Since we’re going to be looking at several different heaps, we’ll need a class to represent all of them:

```
class IndexedQueue h a where
{-# MINIMAL insert, empty, minViewMay, minView #-}
empty
:: h 0 a
minView
:: h (1 + n) a -> (a, h n a)
singleton
:: a -> h 1 a
singleton = flip insert empty
insert
:: a -> h n a -> h (1 + n) a
minViewMay
:: h n a
-> (n ~ 0 => b)
-> (forall m. (1 + m) ~ n => a -> h m a -> b)
-> b
class IndexedQueue h a =>
MeldableIndexedQueue h a where
merge
:: h n a -> h m a -> h (n + m) a
```

You’ll need `MultiParamTypeClasses`

for this one.

```
mergeB
:: Ord a
=> Binomial z xs a -> Binomial z ys a -> Binomial z (xs + ys) a
mergeB Nil ys = ys
mergeB xs Nil = xs
mergeB (Skip xs) (Skip ys) = Skip (mergeB xs ys)
mergeB (Skip xs) (y :- ys) = y :- mergeB xs ys
mergeB (x :- xs) (Skip ys) = x :- mergeB xs ys
mergeB (x :- xs) (y :- ys) = Skip (mergeCarry (mergeTree x y) xs ys)
mergeCarry
:: Ord a
=> Tree z a
-> Binomial z xs a
-> Binomial z ys a
-> Binomial z (1 + xs + ys) a
mergeCarry !t Nil ys = carryOne t ys
mergeCarry !t xs Nil = carryOne t xs
mergeCarry !t (Skip xs) (Skip ys) = t :- mergeB xs ys
mergeCarry !t (Skip xs) (y :- ys) = Skip (mergeCarry (mergeTree t y) xs ys)
mergeCarry !t (x :- xs) (Skip ys) = Skip (mergeCarry (mergeTree t x) xs ys)
mergeCarry !t (x :- xs) (y :- ys) = t :- mergeCarry (mergeTree x y) xs ys
carryOne :: Ord a => Tree z a -> Binomial z xs a -> Binomial z (1 + xs) a
carryOne !t Nil = t :- Nil
carryOne !t (Skip xs) = t :- xs
carryOne !t (x :- xs) = Skip (carryOne (mergeTree t x) xs)
instance Ord a => MeldableIndexedQueue (Binomial 0) a where
merge = mergeB
{-# INLINE merge #-}
instance Ord a => IndexedQueue (Binomial 0) a where
empty = Nil
singleton x = Root x NilN :- Nil
insert = merge . singleton
```

(`BangPatterns`

for this example)

On top of that, it’s very easy to define delete-min:

```
minView xs = case minViewZip xs of
Zipper x _ ys -> (x, ys)
minViewMay q b f = case q of
Nil -> b
_ :- _ -> uncurry f (minView q)
Skip _ -> uncurry f (minView q)
data Zipper a n rk = Zipper !a (Node rk a) (Binomial rk n a)
skip :: Binomial (1 + z) xs a -> Binomial z (xs + xs) a
skip x = case x of
Nil -> Nil
Skip _ -> Skip x
_ :- _ -> Skip x
data MinViewZipper a n rk where
Infty :: MinViewZipper a 0 rk
Min :: {-# UNPACK #-} !(Zipper a n rk) -> MinViewZipper a (n+1) rk
slideLeft :: Zipper a n (1 + rk) -> Zipper a (1 + n + n) rk
slideLeft (Zipper m (t :< ts) hs)
= Zipper m ts (t :- hs)
pushLeft
:: Ord a
=> Tree rk a
-> Zipper a n (1 + rk)
-> Zipper a (2 + n + n) rk
pushLeft c (Zipper m (t :< ts) hs)
= Zipper m ts (Skip (carryOne (mergeTree c t) hs))
minViewZip :: Ord a => Binomial rk (1 + n) a -> Zipper a n rk
minViewZip (Skip xs) = slideLeft (minViewZip xs)
minViewZip (t@(Root x ts) :- f) = case minViewZipMay f of
Min ex@(Zipper minKey _ _) | minKey < x -> pushLeft t ex
_ -> Zipper x ts (skip f)
minViewZipMay :: Ord a => Binomial rk n a -> MinViewZipper a n rk
minViewZipMay (Skip xs) = Min (slideLeft (minViewZip xs))
minViewZipMay Nil = Infty
minViewZipMay (t@(Root x ts) :- f) = Min $ case minViewZipMay f of
Min ex@(Zipper minKey _ _) | minKey < x -> pushLeft t ex
_ -> Zipper x ts (skip f)
```

Similarly, compare the version of the pairing heap with the plugin:

```
data Heap n a where
E :: Heap 0 a
T :: a -> HVec n a -> Heap (1 + n) a
data HVec n a where
HNil :: HVec 0 a
HCons :: Heap m a -> HVec n a -> HVec (m + n) a
insert :: Ord a => a -> Heap n a -> Heap (1 + n) a
insert x xs = merge (T x HNil) xs
merge :: Ord a => Heap m a -> Heap n a -> Heap (m + n) a
merge E ys = ys
merge xs E = xs
merge h1@(T x xs) h2@(T y ys)
| x <= y = T x (HCons h2 xs)
| otherwise = T y (HCons h1 ys)
minView :: Ord a => Heap (1 + n) a -> (a, Heap n a)
minView (T x hs) = (x, mergePairs hs)
mergePairs :: Ord a => HVec n a -> Heap n a
mergePairs HNil = E
mergePairs (HCons h HNil) = h
mergePairs (HCons h1 (HCons h2 hs)) =
merge (merge h1 h2) (mergePairs hs)
```

To the version without the plugin:

```
data Heap n a where
E :: Heap Z a
T :: a -> HVec n a -> Heap (S n) a
data HVec n a where
HNil :: HVec Z a
HCons :: Heap m a -> HVec n a -> HVec (m + n) a
class Sized h where
size :: h n a -> The Peano n
instance Sized Heap where
size E = Zy
size (T _ xs) = Sy (size xs)
plus :: The Peano n -> The Peano m -> The Peano (n + m)
plus Zy m = m
plus (Sy n) m = Sy (plus n m)
instance Sized HVec where
size HNil = Zy
size (HCons h hs) = size h `plus` size hs
insert :: Ord a => a -> Heap n a -> Heap (S n) a
insert x xs = merge (T x HNil) xs
merge :: Ord a => Heap m a -> Heap n a -> Heap (m + n) a
merge E ys = ys
merge xs E = case plusZero (size xs) of Refl -> xs
merge h1@(T x xs) h2@(T y ys)
| x <= y = case plusCommutative (size h2) (size xs) of
Refl -> T x (HCons h2 xs)
| otherwise = case plusSuccDistrib (size xs) (size ys) of
Refl -> T y (HCons h1 ys)
minView :: Ord a => Heap (S n) a -> (a, Heap n a)
minView (T x hs) = (x, mergePairs hs)
mergePairs :: Ord a => HVec n a -> Heap n a
mergePairs HNil = E
mergePairs (HCons h HNil) = case plusZero (size h) of Refl -> h
mergePairs (HCons h1 (HCons h2 hs)) =
case plusAssoc (size h1) (size h2) (size hs) of
Refl -> merge (merge h1 h2) (mergePairs hs)
```

The typechecker plugin makes it relatively easy to implement several other heaps: skew, Braun, etc. You’ll need one extra trick to implement a leftist heap, though. Let’s take a look at the unverified version:

```
data Leftist a
= Leaf
| Node {-# UNPACK #-} !Int
a
(Leftist a)
(Leftist a)
rank :: Leftist s -> Int
rank Leaf = 0
rank (Node r _ _ _) = r
{-# INLINE rank #-}
mergeL :: Ord a => Leftist a -> Leftist a -> Leftist a
mergeL Leaf h2 = h2
mergeL h1 Leaf = h1
mergeL h1@(Node w1 p1 l1 r1) h2@(Node w2 p2 l2 r2)
| p1 < p2 =
if ll <= lr
then LNode (w1 + w2) p1 l1 (mergeL r1 h2)
else LNode (w1 + w2) p1 (mergeL r1 h2) l1
| otherwise =
if rl <= rr
then LNode (w1 + w2) p2 l2 (mergeL r2 h1)
else LNode (w1 + w2) p2 (mergeL r2 h1) l2
where
ll = rank r1 + w2
lr = rank l1
rl = rank r2 + w1
rr = rank l2
```

In a weight-biased leftist heap, the left branch in any tree must have at least as many elements as the right branch. Ideally, we would encode that in the representation of size-indexed leftist heap:

```
data Leftist n a where
Leaf :: Leftist 0 a
Node :: !(The Nat (n + m + 1))
-> a
-> Leftist n a
-> Leftist m a
-> !(m <= n)
-> Leftist (n + m + 1) a
rank :: Leftist n s -> The Nat n
rank Leaf = sing
rank (Node r _ _ _ _) = r
{-# INLINE rank #-}
```

Two problems, though: first of all, we need to be able to *compare* the sizes of two heaps, in the merge function. If we were using the type-level Peano numbers, this would be too slow. More importantly, though, we need the comparison to provide a *proof* of the ordering, so that we can use it in the resulting `Node`

constructor.

In Agda, the Peano type is actually backed by Haskell’s `Integer`

at runtime. This allows compile-time proofs to be written about values which are calculated efficiently. We can mimic the same thing in Haskell with a newtype wrapper *around* `Integer`

with a phantom `Peano`

parameter, if we promise to never put an integer in which has a different value to its phantom value. We can make this promise a little more trustworthy if we don’t export the newtype constructor.

```
newtype instance The Nat n where
NatSing :: Integer -> The Nat n
instance KnownNat n => KnownSing n where
sing = NatSing $ Prelude.fromInteger $ natVal (Proxy :: Proxy n)
```

`FlexibleInstances`

is needed for the instance. We can also encode all the necessary arithmetic:

```
infixl 6 +.
(+.) :: The Nat n -> The Nat m -> The Nat (n + m)
(+.) =
(coerce :: (Integer -> Integer -> Integer)
-> The Nat n -> The Nat m -> The Nat (n + m))
(+)
{-# INLINE (+.) #-}
```

Finally, the compare function (`ScopedTypeVariables`

for this):

```
infix 4 <=.
(<=.) :: The Nat n -> The Nat m -> The Bool (n <=? m)
(<=.) (NatSing x :: The Nat n) (NatSing y :: The Nat m)
| x <= y =
case (unsafeCoerce (Refl :: True :~: True) :: (n <=? m) :~: True) of
Refl -> Truey
| otherwise =
case (unsafeCoerce (Refl :: True :~: True) :: (n <=? m) :~: False) of
Refl -> Falsy
{-# INLINE (<=.) #-}
totalOrder :: p n -> q m -> (n <=? m) :~: False -> (m <=? n) :~: True
totalOrder (_ :: p n) (_ :: q m) Refl =
unsafeCoerce Refl :: (m <=? n) :~: True
type x <= y = (x <=? y) :~: True
```

It’s worth mentioning that all of these functions are somewhat axiomatic: there’s no checking of these definitions going on, and any later proofs are only correct in terms of these functions.

If we want our merge function to *really* look like the non-verified version, though, we’ll have to mess around with the syntax a little.

When matching on a singleton, *within* the case-match, proof of the singleton’s type is provided. For instance:

```
type family IfThenElse (c :: Bool) (true :: k) (false :: k) :: k
where
IfThenElse True true false = true
IfThenElse False true false = false
intOrString :: The Bool cond -> IfThenElse cond Int String
intOrString Truey = 1
intOrString Falsy = "abc"
```

In Haskell, since we can overload the if-then-else construct (with `RebindableSyntax`

), we can provide the same syntax, while hiding the dependent nature:

```
ifThenElse :: The Bool c -> (c :~: True -> a) -> (c :~: False -> a) -> a
ifThenElse Truey t _ = t Refl
ifThenElse Falsy _ f = f Refl
```

Finally, then, we can write the implementation for merge, which looks almost *exactly* the same as the non-verified merge:

```
instance Ord a => IndexedQueue Leftist a where
minView (Node _ x l r _) = (x, merge l r)
{-# INLINE minView #-}
singleton x = Node sing x Leaf Leaf Refl
{-# INLINE singleton #-}
empty = Leaf
{-# INLINE empty #-}
insert = merge . singleton
{-# INLINE insert #-}
minViewMay Leaf b _ = b
minViewMay (Node _ x l r _) _ f = f x (merge l r)
instance Ord a =>
MeldableIndexedQueue Leftist a where
merge Leaf h2 = h2
merge h1 Leaf = h1
merge h1@(Node w1 p1 l1 r1 _) h2@(Node w2 p2 l2 r2 _)
| p1 < p2 =
if ll <=. lr
then Node (w1 +. w2) p1 l1 (merge r1 h2)
else Node (w1 +. w2) p1 (merge r1 h2) l1 . totalOrder ll lr
| otherwise =
if rl <=. rr
then Node (w1 +. w2) p2 l2 (merge r2 h1)
else Node (w1 +. w2) p2 (merge r2 h1) l2 . totalOrder rl rr
where
ll = rank r1 +. w2
lr = rank l1
rl = rank r2 +. w1
rr = rank l2
{-# INLINE merge #-}
```

What’s cool about this implementation is that it has the same performance as the non-verified version (if `Integer`

is swapped out for `Int`

, that is), and it *looks* pretty much the same. This is very close to static verification for free.

The `Sort`

type used in the original blog post can be generalized to *any* indexed container.

```
data Parts f g a b r where
Parts :: (forall n. g (m + n) b -> (g n b, r))
-> !(f m a)
-> Parts f g a b r
instance Functor (Parts f g a b) where
fmap f (Parts g h) =
Parts (\h' -> case g h' of (remn, r) -> (remn, f r)) h
{-# INLINE fmap #-}
instance (IndexedQueue f x, MeldableIndexedQueue f x) =>
Applicative (Parts f g x y) where
pure x = Parts (\h -> (h, x)) empty
{-# INLINE pure #-}
(Parts f (xs :: f m x) :: Parts f g x y (a -> b)) <*>
Parts g (ys :: f n x) =
Parts h (merge xs ys)
where
h :: forall o . g ((m + n) + o) y -> (g o y, b)
h v = case f v of { (v', a) ->
case g v' of { (v'', b) ->
(v'', a b)}}
{-# INLINABLE (<*>) #-}
```

This version doesn’t insist that you order the elements of the heap in any particular way: we could use indexed difference lists to reverse a container, or indexed lists to calculate permutations of a container, for instance.

I’d be very interested to see any other uses of these indexed heaps, if anyone has any ideas. Potentially the could be used in any place where there is a need for some heap which is known to be of a certain size (a true prime sieve, for instance).

I’ve explored all of these ideas here. It has implementations of all the heaps I mentioned, as well as the index-erasing type, and a size-indexed list, for reversing traversables. In the future, I might add things like a Fibonacci heap, or the optimal Brodal/Okasaki heap (Brodal and Okasaki 1996).

Brodal, Gerth Stølting, and Chris Okasaki. 1996. “Optimal Purely Functional Priority Queues.” *Journal of Functional Programming* 6 (6) (November): 839–857. doi:10.1017/S095679680000201X. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.48.973.

Diatchki, Iavor S. 2015. “Improving Haskell Types with SMT.” In *Proceedings of the 2015 ACM SIGPLAN Symposium on Haskell*, 1–10. Haskell ’15. New York, NY, USA: ACM. doi:10.1145/2804302.2804307. http://yav.github.io/publications/improving-smt-types.pdf.

Eisenberg, Richard A., and Stephanie Weirich. 2012. “Dependently Typed Programming with Singletons.” In *Proceedings of the 2012 Haskell Symposium*, 117–130. Haskell ’12. New York, NY, USA: ACM. doi:10.1145/2364506.2364522. http://cs.brynmawr.edu/~rae/papers/2012/singletons/paper.pdf.

Fredman, Michael L., Robert Sedgewick, Daniel D. Sleator, and Robert E. Tarjan. 1986. “The pairing heap: A new form of self-adjusting heap.” *Algorithmica* 1 (1-4) (January): 111–129. doi:10.1007/BF01840439. http://www.cs.princeton.edu/courses/archive/fall09/cos521/Handouts/pairingheaps.pdf.

Hinze, Ralf. 1999. “Functional Pearls: Explaining Binomial Heaps.” *Journal of Functional Programming* 9 (1) (January): 93–104. doi:10.1017/S0956796899003317. http://www.cs.ox.ac.uk/ralf.hinze/publications/#J1.

———. 2001. “Manufacturing datatypes.” *Journal of Functional Programming* 11 (5) (September): 493–524. doi:10.1017/S095679680100404X. http://www.cs.ox.ac.uk/ralf.hinze/publications/#J6.

*Proceedings of the ACM SIGPLAN International Conference on Functional Programming (ICFP’99), Paris, France, September 27-29, 1999*, 34:28. ACM. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.456.357&rep=rep1&type=pdf.

Wasserman, Louis. 2010. “Playing with Priority Queues.” *The Monad.Reader* 16 (16) (May): 37. https://themonadreader.files.wordpress.com/2010/05/issue16.pdf.

Tags: Haskell

I saw this post on reddit recently, and it got me thinking about recursion schemes. One of the primary motivations behind them is the reduction of boilerplate. The classic example is evaluation of arithmetic expressions:

```
data ExprF a
= LitF Integer
| (:+:) a a
| (:*:) a a
deriving Functor
type Expr = Fix ExprF
eval :: Expr -> Integer
eval = unfix >>> \case
LitF n -> n
x :+: y -> eval x + eval y
x :*: y -> eval x * eval y
```

The calls to `eval`

are the boilerplate: this is where the main recursion scheme, `cata`

can help.

```
evalF :: Expr -> Integer
evalF = cata $ \case
LitF n -> n
x :+: y -> x + y
x :*: y -> x * y
```

I still feel like there’s boilerplate, though. Ideally I’d like to write this:

```
evalF :: Expr -> Integer
evalF = cata $ ??? $ \case
Lit -> id
Add -> (+)
Mul -> (*)
```

The `???`

needs to be filled in. It’s a little tricky, though: the type of the algebra changes depending on what expression it’s given. GADTs will allow us to attach types to cases:

```
data ExprI a r f where
Lit :: ExprI a b (Integer -> b)
Add :: ExprI a b (a -> a -> b)
Mul :: ExprI a b (a -> a -> b)
```

The first type parameter is the same as the first type parameter to `ExprF`

. The second is the output type of the algebra, and the third is the type of the fold required to produce that output type. The third type parameter *depends* on the case matched in the GADT. Using this, we can write a function which converts a fold/pattern match to a standard algebra:

```
foldAlg :: (forall f. ExprI a r f -> f) -> (ExprF a -> r)
foldAlg f (LitF i) = f Lit i
foldAlg f (x :+: y) = f Add x y
foldAlg f (x :*: y) = f Mul x y
```

And finally, we can write the nice evaluation algebra:

```
evalF :: Expr -> Integer
evalF = cata $ foldAlg $ \case
Lit -> id
Add -> (+)
Mul -> (*)
```

I hacked together some quick template Haskell to generate the matchers over here. It uses a class `AsPatternFold`

:

```
class AsPatternFold x f | x -> f where
foldMatch :: (forall a. f r a -> a) -> (x -> r)
```

And you generate the extra data type, with an instance, by doing this:

`makePatternFolds ''ExprF`

The code it generates can be used like this:

```
evalF :: Expr -> Integer
evalF = cata $ foldMatch $ \case
LitI -> id
(:+|) -> (+)
(:*|) -> (*)
```

It’s terribly hacky at the moment, I may clean it up later.

There’s another approach to the same idea that is slightly more sensible, using record wildcards. You define a handler for you datatype (an algebra):

```
data ExprAlg a r
= ExprAlg
{ litF :: Integer -> r
, (+:) :: a -> a -> r
, (*:) :: a -> a -> r }
```

Then, to use it, you define how to interact between the handler and the datatype, like before. The benefit is that record wildcard syntax allows you to piggy back on the function definition syntax, like so:

```
data ExprF a
= LitF Integer
| (:+:) a a
| (:*:) a a
makeHandler ''ExprF
exprAlg :: ExprF Integer -> Integer
exprAlg = index ExprFAlg {..} where
litF = id
(+:) = (+)
(*:) = (*)
```

This approach is much more principled: the `index`

function, for example, comes from the adjunctions package, from the `Representable`

class. That’s because those algebras are actually representable functors, with their representation being the thing they match. They also conform to a whole bunch of things automatically, letting you combine them interesting ways.

Properly printing expressions, with minimal parentheses, is a surprisingly difficult problem. Ramsey (1998) provides a solution of the form:

```
isParens side (Assoc ao po) (Assoc ai pi) =
pi <= po && (pi /= po || ai /= ao || ao /= side)
```

Using this, we can write an algebra for printing expressions. It should work in the general case, not just on the expression type defined above, so we need to make another unfixed functor to describe the printing of an expression:

```
data Side = L | R deriving Eq
data ShowExpr t e
= ShowLit { _repr :: t }
| Prefix { _repr :: t, _assoc :: (Int,Side), _child :: e }
| Postfix { _repr :: t, _assoc :: (Int,Side), _child :: e }
| Binary { _repr :: t, _assoc :: (Int,Side), _lchild :: e
, _rchild :: e }
deriving Functor
makeLenses ''ShowExpr
```

The lenses are probably overkill. For printing, we need not only the precedence of the current level, but also the precedence one level below. Seems like the perfect case for a zygomorphism:

```
showExprAlg :: Semigroup t
=> (t -> t)
-> ShowExpr t (Maybe (Int,Side), t)
-> t
showExprAlg prns = \case
ShowLit t -> t
Prefix t s (q,y) -> t <> ifPrns R s q y
Postfix t s (p,x) -> ifPrns L s p x <> t
Binary t s (p,x) (q,y) -> ifPrns L s p x <> t <> ifPrns R s q y
where
ifPrns sid (op,oa) (Just (ip,ia))
| ip < op || ip == op && (ia /= oa || sid /= oa) = prns
ifPrns _ _ _ = id
```

The first argument to this algebra is the parenthesizing function. This algebra works fine for when the `ShowExpr`

type is already constructed:

```
showExpr' :: Semigroup t => (t -> t) -> Fix (ShowExpr t) -> t
showExpr' = zygo (preview assoc) . showExprAlg
```

But we still need to construct the `ShowExpr`

from something else first. `hylo`

might be a good fit:

`hylo :: Functor f => (f b -> b) -> (a -> f a) -> a -> b`

But that performs a catamorphism after an anamorphism, and we want a zygomorphism after an anamorphism. Luckily, the recursion-schemes library is constructed in such a way that different schemes can be stuck together relatively easily:

```
hylozygo
:: Functor f
=> (f a -> a) -> (f (a, b) -> b) -> (c -> f c) -> c -> b
hylozygo x y z = ghylo (distZygo x) distAna y (fmap Identity . z)
showExpr :: Semigroup t
=> (t -> t)
-> (e -> ShowExpr t e)
-> e -> t
showExpr = hylozygo (preview assoc) . showExprAlg
```

Let’s try it out, with a right-associative operator this time to make things more difficult:

```
data ExprF a
= LitF Integer
| (:+:) a a
| (:*:) a a
| (:^:) a a
deriving Functor
makeHandler ''ExprF
newtype Expr = Expr { runExpr :: ExprF Expr }
instance Num Expr where
fromInteger = Expr . LitF
x + y = Expr (x :+: y)
x * y = Expr (x :*: y)
infixr 8 ^*
(^*) :: Expr -> Expr -> Expr
x ^* y = Expr (x :^: y)
instance Show Expr where
show =
showExpr
(\x -> "(" ++ x ++ ")")
(index ExprFAlg {..} . runExpr)
where
litF = ShowLit . show
(+:) = Binary " + " (6,L)
(*:) = Binary " * " (7,L)
(^:) = Binary " ^ " (8,R)
```

Since we only specified `Semigroup`

in the definition of `showExpr`

, we can use the more efficient difference-list definition of `Show`

:

```
instance Show Expr where
showsPrec _ =
appEndo . showExpr
(Endo . showParen True . appEndo)
(index ExprFAlg {..} . runExpr)
where
litF = ShowLit . Endo . shows
(+:) = Binary (Endo (" + " ++)) (6,L)
(*:) = Binary (Endo (" * " ++)) (7,L)
(^:) = Binary (Endo (" ^ " ++)) (8,R)
1 ^* 2 ^* 3 -- 1 ^ 2 ^ 3
(1 ^* 2) ^* 3 -- (1 ^ 2) ^ 3
1 * 2 + 3 :: Expr -- 1 * 2 + 3
1 * (2 + 3) :: Expr -- 1 * (2 + 3)
```

Ramsey, Norman. 1998. “Unparsing Expressions With Prefix and Postfix Operators.” *Software—Practice & Experience* 28 (12): 1327–1356. http://www.cs.tufts.edu/%7Enr/pubs/unparse-abstract.html.

Tags: Haskell

In Haskell restricted monads are monads which can’t contain every type. `Set`

is a good example. If you look in the documentation for Data.Set you’ll see several functions which correspond to functions in the Functor/Applicative/Monad typeclass hierarchy:

```
map :: Ord b => (a -> b) -> Set a -> Set b
singleton :: a -> Set a
foldMap :: Ord b => (a -> Set b) -> Set a -> Set b -- specialized
```

Unfortunately, though, `Set`

can’t conform to `Functor`

, because the signature of `fmap`

looks like this:

`fmap :: Functor f => (a -> b) -> f a -> f b`

It doesn’t have an `Ord`

constraint.

This is annoying: when using `Set`

, lots of things have to be imported qualified, and you have to remember the slightly different names of extra functions like `map`

. More importantly, you’ve lost the ability to write generic code over `Functor`

or `Monad`

which will work on `Set`

.

There are a number of ways to get around this problem. Here, an approach using reflection-reification is explored. These are the types involved:

```
newtype SetC a =
SetC{unSetC :: forall r. Ord r => (a -> Set r) -> Set r}
reifySet :: Ord r => SetC r -> Set r
reifySet m = unSetC m singleton
reflectSet :: Ord r => Set r -> SetC r
reflectSet s = SetC $ \k -> S.foldr (\x r -> k x `union` r) S.empty s
```

`SetC`

is just `Cont`

in disguise. In fact, we can generalize this pattern, using Constraint Kinds:

```
newtype FreeT c m a =
FreeT { runFreeT :: forall r. c r => (a -> m r) -> m r}
reifySet :: Ord a => FreeT Ord Set a -> Set a
reifySet m = runFreeT m singleton
reflectSet :: Set r -> FreeT Ord Set r
reflectSet s = FreeT $ \k -> S.foldr (\x r -> k x `union` r) S.empty s
```

`FreeT`

looks an *awful lot* like `ContT`

by now. The type has some other interesting applications, though. For instance, this type:

`type FM = FreeT Monoid Identity`

Is the free monoid. If we use a transformers-style type synonym, the naming becomes even nicer:

```
type Free c = FreeT c Identity
runFree :: c r => Free c a -> (a -> r) -> r
runFree xs f = runIdentity (runFreeT xs (pure . f))
instance Foldable (Free Monoid) where
foldMap = flip runFree
```

Check out this package for an implementation of the non-transformer `Free`

.

This is still unsatisfying, though. Putting annotations around your code feels inelegant. The next solution is to replace the monad class altogether with our own, and turn on `-XRebindableSyntax`

. There are a few ways to design this new class. One option is to use multi-parameter type classes. Another solution is with an associated type:

```
class Functor f where
type Suitable f a :: Constraint
fmap :: Suitable f b => (a -> b) -> f a -> f b
```

This is similar to the approach taken in the rmonad library, except that library doesn’t use constraint kinds (they weren’t available when the library was made), so it has to make do with a `Suitable`

class. Also, the signature for `fmap`

in rmonad is:

`fmap :: (Suitable f a, Suitable f b) => (a -> b) -> f a -> f b`

I don’t want to constrain `a`

: I figure if you can get something *into* your monad, it *must* be suitable. And I really want to reduce the syntactic overhead of writing extra types next to your functions.

There’s also the supermonad library out there which is much more general than any of these examples: it supports indexed monads as well as constrained.

Anyway,`Monad`

is defined similarly to `Functor`

:

```
class Functor m => Monad m where
return :: Suitable m a => a -> m a
(>>=) :: Suitable m b => m a -> (a -> m b) -> m b
```

Again, I want to minimize the use of `Suitable`

, so for `>>=`

there’s only a constraint on `b`

.

Finally, here’s the `Set`

instance:

```
instance Functor Set where
type Suitable Set a = Ord a
fmap = Set.map
```

With equality constraints, you can actually make *monomorphic* containers conform to these classes (or, at least, wrappers around them).

```
import qualified Data.Text as Text
data Text a where
Text :: Text.Text -> Text Char
instance Functor Text where
type Suitable Text a = a ~ Char
fmap f (Text xs) = Text (Text.map f xs)
```

This pattern can be generalized with some more GADT magic:

```
data Monomorphic xs a b where
Monomorphic :: (a ~ b) => xs -> Monomorphic xs a b
instance (MonoFunctor xs, a ~ Element xs) => Functor (Monomorphic xs a) where
type Suitable (Monomorphic xs a) b = a ~ b
fmap f (Monomorphic xs) = Monomorphic (omap f xs)
```

Where `omap`

comes from the mono-traversable package. You could go a little further, to `Foldable`

:

```
instance (MonoFoldable xs, element ~ Element xs) =>
Foldable (Monomorphic xs element) where
foldr f b (Monomorphic xs) = ofoldr f b xs
foldMap f (Monomorphic xs) = ofoldMap f xs
foldl' f b (Monomorphic xs) = ofoldl' f b xs
toList (Monomorphic xs) = otoList xs
null (Monomorphic xs) = onull xs
length (Monomorphic xs) = olength xs
foldr1 f (Monomorphic xs) = ofoldr1Ex f xs
elem x (Monomorphic xs) = oelem x xs
maximum (Monomorphic xs) = maximumEx xs
minimum (Monomorphic xs) = minimumEx xs
sum (Monomorphic xs) = osum xs
product (Monomorphic xs) = oproduct xs
```

Changing the `FreeT`

type above a little, we can go back to normal functors and monads, and write more general reify and reflect functions:

```
newtype FreeT m a =
FreeT { runFreeT :: forall r. Suitable m r => (a -> m r) -> m r}
reify :: (Monad m, Suitable m a) => FreeT m a -> m a
reify = flip runFreeT return
reflect :: Monad m => m a -> FreeT m a
reflect x = FreeT (x >>=)
```

So now our types, when wrapped, can conform to the Prelude’s `Functor`

. It would be nice if this type could be written like so:

```
reify :: Monad m => FreeT (Suitable m) m a -> m a
reify = flip runFreeT return
reflect :: Monad m => m a -> FreeT (Suitable m) m a
reflect x = FreeT (x >>=)
```

But unfortunately type families cannot be partially applied.

The classes above aren’t very modern: they’re missing applicative. This one is tricky:

```
class Functor f => Applicative f where
pure :: Suitable a => a -> f a
(<*>) :: Suitable f b => f (a -> b) -> f a -> f b
```

The issue is `f (a -> b)`

. There’s no *way* you’re getting some type like that into `Set`

. This means that `<*>`

is effectively useless. No problem, you think: define `liftA2`

instead:

```
class Functor f => Applicative f where
pure :: Suitable a => a -> f a
liftA2 :: Suitable f c => (a -> b -> c) -> f a -> f b -> f c
(<*>) :: (Applicative f, Suitable f b) => f (a -> b) -> f a -> f b
(<*>) = liftA2 ($)
```

Great! Now we can use it with set. However, there’s no way (that I can see) to define the other lift functions: `liftA3`

, etc. Of course, if `>>=`

is available, it’s as simple as:

```
liftA3 f xs ys zs = do
x <- xs
y <- ys
z <- zs
pure (f x y z)
```

But now we can’t define it for non-monadic applicatives (square matrices, ZipLists, etc.). This also forces us to use `>>=`

when `<*>`

may have been more efficient.

The functions we’re interested in defining look like this:

```
liftA2 :: Suitable f c => (a -> b -> c) -> f a -> f b -> f c
liftA3 :: Suitable f d => (a -> b -> c -> d) -> f a -> f b -> f c -> f d
liftA4 :: Suitable f e => (a -> b -> c -> d -> e) -> f a -> f b -> f c -> f d -> f e
```

There’s a clear pattern, but no obvious way to abstract over it. Type-level shenanigans to the rescue!

The pattern might be expressed like this:

`liftA :: Func args -> Func lifted args`

We can store these types as heterogeneous lists:

```
infixr 5 :-
data Vect xs where
Nil :: Vect '[]
(:-) :: x -> Vect xs -> Vect (x ': xs)
infixr 5 :*
data AppVect f xs where
NilA :: AppVect f '[]
(:*) :: f x -> AppVect f xs -> AppVect f (x ': xs)
```

And `liftA`

can be represented like this:

```
liftA
:: Suitable f b
=> (Vect xs -> b) -> AppVect f xs -> f b
liftA2
:: Suitable f c
=> (a -> b -> c) -> f a -> f b -> f c
liftA2 f xs ys =
liftA
(\(x :- y :- Nil) ->
f x y)
(xs :* ys :* NilA)
liftA3
:: Suitable f d
=> (a -> b -> c -> d) -> f a -> f b -> f c -> f d
liftA3 f xs ys zs =
liftA
(\(x :- y :- z :- Nil) ->
f x y z)
(xs :* ys :* zs :* NilA)
```

Cool! For unrestricted applicatives, we can define `liftA`

in terms of `<*>`

:

```
liftAP :: (Prelude.Applicative f)
=> (Vect xs -> b) -> (AppVect f xs -> f b)
liftAP f NilA = Prelude.pure (f Nil)
liftAP f (x :* NilA)
= Prelude.fmap (f . (:-Nil)) x
liftAP f (x :* xs)
= ((f .) . (:-)) Prelude.<$> x Prelude.<*> liftAP id xs
```

And for types with a monad instance, we can define it in terms of `>>=`

:

```
liftAM :: (Monad f, Suitable f b)
=> (Vect xs -> b) -> (AppVect f xs -> f b)
liftAM f NilA = pure (f Nil)
liftAM f (x :* NilA) = fmap (f . (:-Nil)) x
liftAM f (x :* xs) = x >>= \y -> liftAM (f . (y:-)) xs
```

This approach is *really* slow. Every function wraps up its arguments in a `Vect`

, and it’s just generally awful.

What about *not* wrapping up the function? Type families can help here:

```
type family FunType (xs :: [*]) (y :: *) :: * where
FunType '[] y = y
FunType (x ': xs) y = x -> FunType xs y
```

It gets really difficult to define `liftA`

using `<*>`

now, though. `liftAM`

, on the other hand, is a breeze:

```
liftAM :: Monad f => FunType xs a -> AppVect f xs -> f a
liftAM f Nil = pure f
liftAM f (x :< xs) = x >>= \y -> liftAM (f y) xs
```

And no vector constructors on the right of the bind!

Still, no decent definition using `<*>`

. The problem is that we’re using a cons-list to represent a function’s arguments, but `<*>`

is left-associative, so it builds up arguments as a snoc list. Lets try using a snoc-list as the type family:

```
infixl 5 :>
data AppVect f xs where
Nil :: AppVect f '[]
(:>) :: AppVect f xs -> f x -> AppVect f (x ': xs)
type family FunType (xs :: [*]) (y :: *) :: * where
FunType '[] y = y
FunType (x ': xs) y = FunType xs (x -> y)
liftA
:: Suitable f a
=> FunType xs a -> AppVect f xs -> f a
```

`liftAP`

now gets a natural definition:

```
liftAP :: Prelude.Applicative f => FunType xs a -> AppVect f xs -> f a
liftAP f Nil = Prelude.pure f
liftAP f (Nil :> xs) = Prelude.fmap f xs
liftAP f (ys :> xs) = liftAP f ys Prelude.<*> xs
```

But what about `liftAM`

? It’s much more difficult, fundamentally because `>>=`

builds up arguments as a cons-list. To convert between the two efficiently, we need to use the trick for reversing lists efficiently: build up the reversed list as you go.

```
liftAM :: (Monad f, Suitable f a) => FunType xs a -> AppVect f xs -> f a
liftAM = go pure where
go :: (Suitable f b, Monad f)
=> (a -> f b) -> FunType xs a -> AppVect f xs -> f b
go f g Nil = f g
go f g (xs :> x) = go (\c -> x >>= f . c) g xs
```

Using these definitions, we can make `Set`

, `Text`

, and all the rest of them applicatives, while preserving the applicative operations. Also, from my preliminary testing, there seems to be *no* overhead in using these new definitions for `<*>`

.

In Sculthorpe et al. (2013), there’s discussion of this type:

```
data NM :: (* -> Constraint) -> (* -> *) -> * -> * where
Return :: a -> NM c t a
Bind :: c x => t x -> (x -> NM c t a) -> NM c t a
```

This type allows constrained monads to become normal monads. It can be used for the same purpose as the `FreeT`

type from above. In the paper, the free type is called `RCodT`

.

One way to look at the type is as a concrete representation of the monad class, with each method being a constructor.

You might wonder if there are similar constructs for functor and applicative. Functor is simple:

```
data NF :: (* -> Constraint) -> (* -> *) -> * -> * where
FMap :: c x => (x -> a) -> t x -> NF c t a
```

Again, this can conform to functor (and *only* functor), and can be interpreted when the final type is `Suitable`

.

Like above, it has a continuation version, Yoneda.

For applicatives, though, the situation is different. In the paper, they weren’t able to define a transformer for applicatives that could be interpreted in some restricted applicative. I needed one because I wanted to use `-XApplicativeDo`

notation: the desugaring uses `<*>`

, not the `liftAn`

functions, so I wanted to construct a free applicative using `<*>`

, and run it using the lift functions. What I managed to cobble to gether doesn’t *really* solve the problem, but it works for `-XApplicativeDo`

!

The key with a lot of this was realizing that `<*>`

is *snoc*, not cons. Using a free applicative:

```
data Free f a where
Pure :: a -> Free f a
Ap :: Free f (a -> b) -> f a -> Free f b
instance Prelude.Functor (Free f) where
fmap f (Pure a) = Pure (f a)
fmap f (Ap x y) = Ap ((f .) Prelude.<$> x) y
instance Prelude.Applicative (Free f) where
pure = Pure
Pure f <*> y = Prelude.fmap f y
Ap x y <*> z = Ap (flip Prelude.<$> x Prelude.<*> z) y
```

This type can conform to `Applicative`

and `Functor`

no problem. And all it needs to turn back into a constrained applicative is for the outer type to be suitable:

```
lift :: f a -> Free f a
lift = Ap (Pure id)
lower
:: forall f a c.
Free f a
-> (forall xs. FunType xs a -> AppVect f xs -> f c)
-> f c
lower (Pure x) f = f x Nil
lower (Ap fs x :: Free f a) f =
lower fs (\ft av -> f ft (av :> x))
lowerConstrained
:: (Constrained.Applicative f, Suitable f a)
=> Free f a -> f a
lowerConstrained x = lower x liftA
```

There’s probably a more efficient way to encode it, though.

Sculthorpe, Neil, Jan Bracker, George Giorgidze, and Andy Gill. 2013. “The Constrained-monad Problem.” In *Proceedings of the 18th ACM SIGPLAN International Conference on Functional Programming*, 287–298. ICFP ’13. New York, NY, USA: ACM. doi:10.1145/2500365.2500602. http://ku-fpg.github.io/files/Sculthorpe-13-ConstrainedMonad.pdf.

I’ve been playing around a lot with semirings recently. A semiring is anything with addition, multiplication, zero and one. You can represent that in Haskell as:

```
class Semiring a where
zero :: a
one :: a
infixl 7 <.>
(<.>) :: a -> a -> a
infixl 6 <+>
(<+>) :: a -> a -> a
```

It’s kind of like a combination of two monoids. It has the normal monoid laws:

```
x <+> (y <+> z) = (x <+> y) <+> z
x <.> (y <.> z) = (x <.> y) <.> z
x <+> zero = zero <+> x = x
x <.> one = one <.> x = x
```

And a few extra:

```
x <+> y = y <+> x
x <.> (y <+> z) = (x <.> y) <+> (x <.> z)
(x <+> y) <.> z = (x <.> z) <+> (y <.> z)
zero <.> a = a <.> zero = zero
```

I should note that what I’m calling a semiring here is often called a rig. I actually prefer the name “rig”: a rig is a ring without **n**egatives (cute!); whereas a *semi*ring is a rig without neutral elements, which mirrors the definition of a semigroup. The nomenclature in this area is a bit of a mess, though, so I went with the more commonly-used name for the sake of googleability.

At first glance, it looks quite numeric. Indeed, PureScript uses it as the basis for its numeric hierarchy. (In my experience so far, it’s nicer to use than Haskell’s `Num`

)

```
instance Semiring Integer where
zero = 0
one = 1
(<+>) = (+)
(<.>) = (*)
instance Semiring Double where
zero = 0
one = 1
(<+>) = (+)
(<.>) = (*)
```

However, there are far more types which can form a valid `Semiring`

instance than can form a valid `Num`

instance: the `negate`

method, for example, excludes types representing the natural numbers:

```
newtype ChurchNat = ChurchNat
{ runNat :: forall a. (a -> a) -> a -> a}
data Nat = Zero | Succ Nat
```

These form perfectly sensible semirings, though:

```
instance Semiring ChurchNat where
zero = ChurchNat (const id)
one = ChurchNat ($)
ChurchNat n <+> ChurchNat m = ChurchNat (\f -> n f . m f)
ChurchNat n <.> ChurchNat m = ChurchNat (n . m)
instance Semiring Nat where
zero = Zero
one = Succ Zero
Zero <+> x = x
Succ x <+> y = Succ (x <+> y)
Zero <.> _ = Zero
Succ Zero <.> x =x
Succ x <.> y = y <+> (x <.> y)
```

The other missing method is `fromInteger`

, which means decidedly non-numeric types are allowed:

```
instance Semiring Bool where
zero = False
one = True
(<+>) = (||)
(<.>) = (&&)
```

We can provide a more general definition of the `Sum`

and `Product`

newtypes from Data.Monoid:

```
newtype Add a = Add
{ getAdd :: a
} deriving (Eq, Ord, Read, Show, Semiring)
newtype Mul a = Mul
{ getMul :: a
} deriving (Eq, Ord, Read, Show, Semiring)
instance Functor Add where
fmap f (Add x) = Add (f x)
instance Applicative Add where
pure = Add
Add f <*> Add x = Add (f x)
```

I’m using `Add`

and `Mul`

here to avoid name clashing.

```
instance Semiring a => Monoid (Add a) where
mempty = Add zero
Add x `mappend` Add y = Add (x <+> y)
instance Semiring a => Monoid (Mul a) where
mempty = Mul one
Mul x `mappend` Mul y = Mul (x <.> y)
add :: (Semiring a, Foldable f) => f a -> a
add = getAdd . foldMap Add
mul :: (Semiring a, Foldable f) => f a -> a
mul = getMul . foldMap Mul
```

`add`

and `mul`

are equivalent to `sum`

and `product`

:

`add xs == sum (xs :: [Integer])`

`mul xs == product (xs :: [Integer])`

But they now work with a wider array of types: non-negative numbers, as we’ve seen, but specialised to `Bool`

we get the familiar `Any`

and `All`

newtypes (and their corresponding folds).

`add xs == or (xs :: [Bool])`

`mul xs == and (xs :: [Bool])`

So far, nothing amazing. We avoid a little bit of code duplication, that’s all.

In older versions of Python, there was no native set type. In its place, dictionaries were used, where the values would be booleans. In a similar fashion, before the Counter type was added in 2.7, the traditional way of representing a multiset was using a dictionary where the values were integers.

Using semirings, both of these data structures can have the same type:

```
newtype GeneralMap a b = GeneralMap
{ getMap :: Map a b
} deriving (Functor, Foldable, Show, Eq, Ord)
```

If operations are defined in terms of the `Semiring`

class, the same code will work on a set *and* a multiset:

```
insert :: (Ord a, Semiring b) => a -> GeneralMap a b -> GeneralMap a b
insert x = GeneralMap . Map.insertWith (<+>) x one . getMap
delete :: Ord a => a -> GeneralMap a b -> GeneralMap a b
delete x = GeneralMap . Map.delete x . getMap
```

How to get back the dictionary-like behaviour, then? Well, operations like `lookup`

and `assoc`

are better suited to a `Monoid`

constraint, rather than `Semiring`

:

```
lookup :: (Ord a, Monoid b) => a -> GeneralMap a b -> b
lookup x = fold . Map.lookup x . getMap
assoc :: (Ord a, Applicative f, Monoid (f b))
=> a -> b -> GeneralMap a (f b) -> GeneralMap a (f b)
assoc k v = GeneralMap . Map.insertWith mappend k (pure v) . getMap
```

`lookup`

is a function which should work on sets and multisets: however `Bool`

and `Integer`

don’t have `Monoid`

instances. To fix this, we can use the `Add`

newtype from earlier. The interface for each of these data structures can now be expressed like this:

```
type Set a = GeneralMap a (Add Bool)
type MultiSet a = GeneralMap a (Add Integer)
type Map a b = GeneralMap a (First b)
type MultiMap a b = GeneralMap a [b]
```

And each of the functions on the `GeneralMap`

specialises like this:

```
-- Set
insert :: Ord a => a -> Set a -> Set a
lookup :: Ord a => a -> Set a -> Add Bool
delete :: Ord a => a -> Set a -> Set a
-- MultiSet
insert :: Ord a => a -> MultiSet a -> MultiSet a
lookup :: Ord a => a -> MultiSet a -> Add Integer
delete :: Ord a => a -> MultiSet a -> MultiSet a
-- Map
assoc :: Ord a => a -> b -> Map a b -> Map a b
lookup :: Ord a => a -> Map a b -> First b
delete :: Ord a => a -> Map a b -> Map a b
-- MultiMap
assoc :: Ord a => a -> b -> MultiMap a b -> MultiMap a b
lookup :: Ord a => a -> MultiMap a b -> [b]
delete :: Ord a => a -> MultiMap a b -> MultiMap a b
```

This was actually where I first came across semirings: I was trying to avoid code duplication for a trie implementation. I wanted to get the Boom Hierarchy (1981) (plus maps) from the same underlying implementation.

It works *okay*. On the one hand, it’s nice that you don’t have to wrap the map type itself to get the different behaviour. There’s only one `delete`

function, which works on sets, maps, multisets, etc. I don’t need to import the `TrieSet`

module qualified, to differentiate between the *four* `delete`

functions I’ve written.

On the other hand, the `Add`

wrapper is a pain: having `lookup`

return the wrapped values is ugly, and the `Applicative`

constraint is unwieldy (we only use it for `pure`

). Both of those problems could be solved by using something like the `Newtype`

or `Wrapped`

class, which provide facilities for wrapping and unwrapping, but that might be overkill.

While `Monoid`

and `Semiring`

can take you pretty far, even to a `Monoid`

instance:

```
fromList :: (Ord a, Semiring b, Foldable f) => f a -> GeneralMap a b
fromList = foldr insert (GeneralMap Map.empty)
fromAssocs :: (Ord a, Applicative f, Monoid (f b), Foldable t)
=> t (a, b) -> GeneralMap a (f b)
fromAssocs = foldr (uncurry assoc) (GeneralMap Map.empty)
instance (Ord a, Monoid b) => Monoid (GeneralMap a b) where
mempty = GeneralMap Map.empty
mappend (GeneralMap x) (GeneralMap y) =
GeneralMap (Map.unionWith mappend x y)
singleton :: Semiring b => a -> GeneralMap a b
singleton x = GeneralMap (Map.singleton x one)
```

They seem to fall down around functions like `intersection`

:

```
intersection :: (Ord a, Semiring b)
=> GeneralMap a b -> GeneralMap a b -> GeneralMap a b
intersection (GeneralMap x) (GeneralMap y) =
GeneralMap (Map.intersectionWith (<.>) x y)
```

It works for sets, but it doesn’t make sense for multisets, and it doesn’t work for maps.

I couldn’t find a semiring for the map-like types which would give me a sensible intersection. I’m probably after a different algebraic structure.

While looking for a semiring to represent a valid intersection, I came across the probability semiring. It’s just the normal semiring over the rationals, with a lower bound of 0, and an upper of 1.

It’s useful in some cool ways: you can combine it with a list to get the probability monad (Erwig and Kollmansberger 2006). There’s an example in PureScript’s Distributions package.

`newtype Prob s a = Prob { runProb :: [(a,s)] }`

There are some drawbacks to this representation, performance-wise. In particular, there’s a combinatorial explosion on every monadic bind. One of the strategies to reduce this explosion is to use a map:

`newtype Prob s a = Prob { runProb :: Map a s }`

Because this doesn’t allow duplicate keys, it will flatten the association list on every bind. Unfortunately, the performance gain doesn’t always materialize, and in some cases there’s a performance *loss* (Larsen 2011). Also, the `Ord`

constraint on the keys prevents it from conforming to `Monad`

(at least not without difficulty).

Interestingly, this type is exactly the same as the `GeneralMap`

from before. This is a theme I kept running into, actually: the `GeneralMap`

type represents not just maps, multimaps, sets, multisets, but also a whole host of other data structures.

Edward Kmett had an interesting blog post about “Free Modules and Functional Linear Functionals” (2011b). In it, he talked about this type:

```
infixr 0 $*
newtype Linear r a = Linear { ($*) :: (a -> r) -> r }
```

Also known as `Cont`

, the continuation monad. It can encode the probability monad:

```
fromProbs :: (Semiring s, Applicative m) => [(a,s)] -> ContT s m a
fromProbs xs = ContT $ \k ->
foldr (\(x,s) a -> liftA2 (<+>) (fmap (s<.>) (k x)) a) (pure zero) xs
probOfT :: (Semiring r, Applicative m) => (a -> Bool) -> ContT r m a -> m r
probOfT e c = runContT c (\x -> if e x then pure one else pure zero)
probOf :: Semiring r => (a -> Bool) -> Cont r a -> r
probOf e = runIdentity . probOfT e
uniform :: Applicative m => [a] -> ContT Double m a
uniform xs =
let s = 1.0 / fromIntegral (length xs)
in fromProbs (map (flip (,) s) xs)
```

Multiplication isn’t paid for on every bind, making this (potentially) a more efficient implementation than both the map and the association list.

You can actually make the whole thing a semiring:

```
instance (Semiring r, Applicative m) => Semiring (ContT r m a) where
one = ContT (const (pure one))
zero = ContT (const (pure zero))
f <+> g = ContT (\k -> liftA2 (<+>) (runContT f k) (runContT g k))
f <.> g = ContT (\k -> liftA2 (<.>) (runContT f k) (runContT g k))
```

Which gives you a lovely `Alternative`

instance:

```
instance (Semiring r, Applicative m) => Alternative (ContT r m) where
(<|>) = (<+>)
empty = zero
```

This sheds some light on what was going on with the unsatisfactory `intersection`

function on `GeneralMap`

: it’s actually *multiplication*. If you wanted to stretch the analogy and make `GeneralMap`

conform to `Semiring`

, you could use the empty map for `zero`

, `mappend`

for `<+>`

, but you’d run into trouble for `one`

. `one`

is the map where every possible key has a value of one. In other words, you’d have to enumerate over every possible value for the keys. Interestingly, there’s kind of the inverse problem for Cont: while it has an easy `Semiring`

instance, in order to *inspect* the values you have to enumerate over all the possible keys.

I now have a name for the probability monad / general map / Cont thing: a *covector*.

I think that the transformer version of Cont has a valid interpretation, also. If I ever understand Hirschowitz and Maggesi (2010) I’ll put it into a later follow-up post.

As a short digression, you can beef up the `<|>`

operator a little, with something like the conditional choice operator:

```
data BiWeighted s = s :|: s
infixl 8 :|:
(|>) :: (Applicative m, Semiring s)
=> BiWeighted s
-> ContT s m a
-> ContT s m a
-> ContT s m a
((lp :|: rp) |> r) l =
(mapContT.fmap.(<.>)) lp l <|> (mapContT.fmap.(<.>)) rp r
--
(<|) :: ContT s m a
-> (ContT s m a -> ContT s m a)
-> ContT s m a
l <| r = r l
infixr 0 <|
infixr 0 |>
```

```
probOf ('a'==) (uniform "a" <| 0.4 :|: 0.6 |> uniform "b")
0.4
```

If you fiddle around with the probability monad, you can break it apart in interesting ways. For instance, extracting the `WriterT`

monad transformer gives you:

`WriterT (Product Double) []`

Eric Kidd describes it as `PerhapsT`

: a `Maybe`

with attached probability in his excellent blog post (and his paper in 2007).

Straight away, we can optimise this representation by transforming the leaky `WriterT`

into a state monad:

```
newtype WeightedT s m a = WeightedT
{ getWeightedT :: s -> m (a, s)
} deriving Functor
instance Monad m => Applicative (WeightedT s m) where
pure x = WeightedT $ \s -> pure (x,s)
WeightedT fs <*> WeightedT xs = WeightedT $ \s -> do
(f, p) <- fs s
(x, t) <- xs p
pure (f x, t)
instance Monad m => Monad (WeightedT s m) where
WeightedT x >>= f = WeightedT $ \s -> do
(x, p) <- x s
getWeightedT (f x) p
```

I’m not sure yet, but I think this might have something to do with the isomorphism between `Cont ((->) s)`

and `State s`

(Kmett 2011a).

You can even make it look like a normal (non-transformer) writer with some pattern synonyms:

```
type Weighted s = WeightedT s Identity
pattern Weighted w <- (runIdentity . flip getWeightedT zero -> w) where
Weighted (x,w) = WeightedT (\s -> Identity (x, s <.> w) )
```

And you can pretend that you’ve just got a normal tuple:

```
half :: a -> Weighted Double a
half x = Weighted (x, 0.5)
runWeighted :: Semiring s => Weighted s a -> (a, s)
runWeighted (Weighted w) = w
evalWeighted :: Semiring s => Weighted s a -> a
evalWeighted (Weighted (x,_)) = x
execWeighted :: Semiring s => Weighted s a -> s
execWeighted (Weighted (_,s)) = s
```

Looking back at Cont, it is reminiscent of a particular encoding of the free monoid from Doel (2015):

```
newtype FreeMonoid a = FreeMonoid
{ forall m. Monoid m => (a -> m) -> m }
```

So possibly covectors represent the free semiring, in some way.

Another encoding which looks free-ish is one of the efficient implementations of the probability monad from Larsen (2011):

```
data Dist a where
Certainly :: a -> Dist a -- only possible value
Choice :: Probability -> Dist a -> Dist a -> Dist a
Fmap :: (a -> b) -> Dist a -> Dist b
Join :: Dist (Dist a) -> Dist a
```

This looks an awful lot like a weighted free alternative. Is it a free semiring, then?

Maybe. There’s a parallel between the relationship between monoids and semirings and applicatives and `Alternative`

s (Rivas, Jaskelioff, and Schrijvers 2015). In a way, where monads are monoids in the category of endofunctors, alternatives are *semirings* in the category of endofunctors.

This parallel probably isn’t what I first thought it was. First of all, the above paper uses near-semirings, not semirings. A near-semiring is a semiring where the requirements for left distribution of multiplication over addition and commutative addition are dropped. Secondly, the class which most mirrors near-semirings is `MonadPlus`

, not alternative. (alternative doesn’t have annihilation) Thirdly, right distribution of multiplication over addition *isn’t* required `MonadPlus`

: it’s a further law required on top of the existing laws. Fourthly, most types in the Haskell ecosystem today which conform to `MonadPlus`

*don’t* conform to this extra law: in fact, those that do seem to be lists of some kind or another.

A further class is probably needed on top of the two already there, with the extra laws (called `Nondet`

in Fischer 2009).

An actual free near-semiring looks like this:

```
data Free f x = Free { unFree :: [FFree f x] }
data FFree f x = Pure x | Con (f (Free f x))
```

Specialised to the `Identity`

monad, that becomes:

```
data Forest a = Forest { unForest :: [Tree x] }
data Tree x = Leaf x | Branch (Forest x)
```

De-specialised to the free monad transformer, it becomes:

```
newtype FreeT f m a = FreeT
{ runFreeT :: m (FreeF f a (FreeT f m a)) }
data FreeF f a b
= Pure a
| Free (f b)
type FreeNearSemiring f = FreeT f []
```

These definitions all lend themselves to combinatorial search (Spivey 2009, Fischer (2009), Piponi (2009)), with one extra operation needed: `wrap`

.

Does the odds monad fit in to any of this?

While `WriterT (Product Rational) []`

is a valid definition of the traditional probability monad, it’s *not* the same as the odds monad. If you take the odds monad, and parameterize it over the weight of the tail, you get this:

`data Odds m a = Certain a | Choice (m (a, Odds a))`

Which looks remarkably like `ListT`

done right:

```
newtype ListT m a = ListT { next :: m (Step m a) }
data Step m a = Cons a (ListT m a) | Nil
```

That suggests a relationship between probability and odds:

```
WriterT (Product Rational) [] = Probability
ListT (Weighted Rational) = Odds
```

`ListT`

isn’t a perfect match, though: it allows empty lists. To correct this, you could use the Cofree Comonad:

`data Cofree f a = a :< (f (Cofree f a))`

Subbing in `Maybe`

for `f`

, you get a non-empty list. A *weighted* `Maybe`

is basically `PerhapsT`

, as was mentioned earlier.

Types in haskell also form a semiring.

```
(<.>) = (,)
one = ()
(<+>) = Either
zero = Void
```

There’s a subset of semirings which are star semirings. They have an operation $*$ such that:

$a* = 1 + aa* = 1 + a*a$

Or, as a class:

```
class Semiring a => StarSemiring a where
star :: a -> a
star x = one <+> plus x
plus :: a -> a
plus x = x <.> star x
```

Using this on types, you get:

`star a = Either () (a, star a)`

Which is just a standard list! Some pseudo-haskell on alternatives will give you:

```
star :: (Alternative f, Monoid a) => f a -> f a
star x = (x <.> star x) <+> pure mempty where
(<.>) = liftA2 mappend
(<+>) = <|>
```

Also known as `many`

. (although note that this breaks all the laws)

The $*$ for rationals is defined as (Droste and Kuich 2009, p8):

$a* = \begin{cases} \frac{1}{1 - a} & \quad \text{if } & 0 \leq a \lt 1, \\ \infty & \quad \text{if } & a \geq 1. \end{cases}$

So, combining the probability with the type-level business, the star of `Writer s a`

is:

`Either (1, a) (a, s / (1 - s), star (Writer s a))`

Or, to put it another way: the odds monad!

An endomorphism is a morphism from an object to itself. A less general definition (and the one most often used in Haskell) is a function of the type `a -> a`

:

`newtype Endo a = Endo { appEndo :: a -> a }`

It forms a monoid under composition:

```
instance Monoid (Endo a) where
mempty = Endo id
mappend (Endo f) (Endo g) = Endo (f . g)
```

If the underlying type is itself a commutative monoid, it also forms near-semiring:

```
instance Monoid a => Semiring (Endo a) where
Endo f <+> Endo g = Endo (\x -> f x <> g x)
zero = Endo (const mempty)
one = Endo id
Endo f <.> Endo g = Endo (f . g)
instance (Monoid a, Eq a) => StarSemiring (Endo a) where
star (Endo f) = Endo converge where
converge x = x <> (if y == mempty then y else converge y) where
y = f x
```

Here’s something interesting: there’s a similarity here to the semiring for church numerals. In fact, as far as I can tell, the functions are *exactly* the same when applied to endomorphisms of endomorphisms. To the extent that you could define church numerals with something as simple as this:

`type ChurchEndoNat = forall a. Endo (Endo a)`

And it works!

```
two, three :: ChurchEndoNat
two = one <+> one
three = one <+> two
unChurch :: Num a => ChurchEndoNat -> a
unChurch f = appEndo (appEndo f (Endo (1+))) 0
```

```
unChurch (two <.> three)
6
```

One of the most important applications (and a source of much of the notation) is regular expressions. In fact, the free semiring looks like a haskell datatype for regular expressions:

```
data FreeStar a
= Gen a
| Zer
| One
| FreeStar a :<+> FreeStar a
| FreeStar a :<.> FreeStar a
| Star (FreeStar a)
instance Semiring (FreeStar a) where
(<+>) = (:<+>)
(<.>) = (:<.>)
zero = Zer
one = One
instance StarSemiring (FreeStar a) where
star = Star
interpret :: StarSemiring s => (a -> s) -> FreeStar a -> s
interpret f = \case
Gen x -> f x
Zer -> zero
One -> one
l :<+> r -> interpret f l <+> interpret f r
l :<.> r -> interpret f l <.> interpret f r
Star x -> star (interpret f x)
```

Then, interpreting the regex is as simple as writing an interpreter (with some help from `Endo`

):

```
asRegex :: Eq a => FreeStar (a -> Bool) -> [a] -> Bool
asRegex fs = any null . appEndo (interpret f fs) . pure where
f p = Endo . mapMaybe $ \case
(x:xs) | p x -> Just xs
_ -> Nothing
char' :: Eq a => a -> FreeStar (a -> Bool)
char' c = Gen (c==)
```

Actually, you don’t need the free version at all!

```
runRegex :: Eq a => Endo [[a]] -> [a] -> Bool
runRegex fs = any null . appEndo fs . pure
char :: Eq a => a -> Endo [[a]]
char c = Endo . mapMaybe $ \case
(x:xs) | c == x -> Just xs
_ -> Nothing
```

With some `-XOverloadedStrings`

magic, you get a pretty nice interface:

```
instance IsString (Endo [String]) where
fromString = mul . map char . reverse
(<^>) :: Semiring s => s -> s -> s
(<^>) = flip (<.>)
greet :: Endo [String]
greet = "H" <^> ("a" <+> "e") <^> "llo"
```

```
runRegex greet "Hello"
True
```

```
runRegex greet "Hallo"
True
```

```
runRegex greet "Halo"
False
```

Of course, that’s about as slow as it gets when it comes to regexes. A faster representation is a nondeterministic finite automaton. One such implementation in haskell is Gabriel Gonzalez’s.

The regex type in that example can be immediately made to conform to `Semiring`

and `StarSemiring`

. However, it might be more interesting to translate the *implementation* into using semirings. The type of a regex looks like this:

```
type State = Int
{ _startingStates :: Set State
, _transitionFunction :: Char -> State -> Set State
, _acceptingStates :: Set State }
```

The set data structure jumps out as an opportunity to sub in arbitrary semirings.Swapping in the `GeneralMap`

is reasonably easy:

```
type State = Int
data Regex i s = Regex
{ _numberOfStates :: Int
, _startingStates :: GeneralMap State s
, _transitionFunction :: i -> State -> GeneralMap State s
, _acceptingStates :: GeneralMap State s }
isEnd :: Semiring s => Regex i s -> s
isEnd (Regex _ as _ bs) = add (intersection as bs)
match :: Regex Char (Add Bool) -> String -> Bool
match r = getAdd . isEnd . foldl' run r where
run (Regex n (GeneralMap as) f bs) i = Regex n as' f bs
where as' = mconcat [ fmap (v<.>) (f i k) | (k,v) <- Map.assocs as ]
satisfy :: Semiring s => (i -> s) -> Regex i (Add s)
satisfy predicate = Regex 2 as f bs
where
as = singleton 0
bs = singleton 1
f i 0 = assoc 1 (predicate i) mempty
f _ _ = mempty
once :: Eq i => i -> Regex i (Add Bool)
once x = satisfy (== x)
shift :: Int -> GeneralMap State s -> GeneralMap State s
shift n = GeneralMap . Map.fromAscList . (map.first) (+ n) . Map.toAscList . getMap
instance (Semiring s, Monoid s) => Semiring (Regex i s) where
one = Regex 1 (singleton 0) (\_ _ -> mempty) (singleton 0)
zero = Regex 0 mempty (\_ _ -> mempty) mempty
Regex nL asL fL bsL <+> Regex nR asR fR bsR = Regex n as f bs
where
n = nL + nR
as = mappend asL (shift nL asR)
bs = mappend bsL (shift nL bsR)
f i s | s < nL = fL i s
| otherwise = shift nL (fR i (s - nL))
Regex nL asL fL bsL <.> Regex nR asR fR bsR = Regex n as f bs where
n = nL + nR
as = let ss = add (intersection asL bsL)
in mappend asL (fmap (ss<.>) (shift nL asR))
f i s =
if s < nL
then let ss = add (intersection r bsL)
in mappend r (fmap (ss<.>) (shift nL asR))
else shift nL (fR i (s - nL))
where
r = fL i s
bs = shift nL bsR
instance (StarSemiring s, Monoid s) => StarSemiring (Regex i s) where
star (Regex n as f bs) = Regex n as f' as
where
f' i s =
let r = f i s
ss = add (intersection r bs)
in mappend r (fmap (ss<.>) as)
plus (Regex n as f bs) = Regex n as f' bs
where
f' i s =
let r = f i s
ss = add (intersection r bs)
in mappend r (fmap (ss<.>) as)
instance IsString (Regex Char (Add Bool)) where
fromString = mul . map once
```

This begins to show some of the real power of using semirings and covectors. We have a normal regular expression implementation when we use the covector over bools. Use the probability semiring, and you’ve got probabilistic parsing.

Swap in the tropical semiring: a semiring over the reals where addition is the max function, and multiplication is addition of reals. Now you’ve got a depth-first parser.

That’s how you might swap in different interpretations. How about swapping in different *implementations*? Well, there might be some use to swapping in the CYK algorithm, or the Gauss-Jordan-Floyd-Warshall-McNaughton-Yamada algorithm (O’Connor 2011).

Alternatively, you can swap in the underlying data structure. Instead of a map, if you use an integer (each bit being a value, the keys being the bit position), you have a super-fast implementation (and the final implementation used in the original example). Finally, you could use a different representation of the state transfer function: a matrix.

A square matrix can be understood as a map from pairs of indices to values. This lets us use it to represent the state transfer function. Take, for instance, a regular expression with three possible states. Its state transfer function might look like this:

$transfer = \begin{cases} 1 \quad & \{ 2, 3 \} \\ 2 \quad & \{ 1 \} \\ 3 \quad & \emptyset \end{cases}$

It has the type of:

`State -> Set State`

Where `State`

is an integer. You can represent the set as a vector, where each position is a key, and each value is whether or not that key is present:

$transfer = \begin{cases} 1 \quad & \begin{array} ( 0 & 1 & 1 ) \end{array} \\ 2 \quad & \begin{array} ( 1 & 0 & 0 ) \end{array} \\ 3 \quad & \begin{array} ( 0 & 0 & 0 ) \end{array} \end{cases}$

Then, the matrix representation is obvious:

$transfer = \left( \begin{array}{ccc} 0 & 1 & 1 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{array} \right)$

This is the semiring of square matrices. It is, of course, yet *another* covector. The “keys” are the transfers: `1 -> 2`

or `2 -> 3`

, represented by the indices of the matrix. The “values” are whether or not that transfer is permitted.

The algorithms for the usual semiring operations on matrices like this are well-known and well-optimized. I haven’t yet benchmarked them in Haskell using the matrix libraries, so I don’t know how they compare to the other approaches. In the meantime, there’s an elegant list-based implementation in Dolan (2013):

```
data Matrix a = Scalar a
| Matrix [[a]]
mjoin :: (Matrix a, Matrix a, Matrix a, Matrix a) -> Matrix a
mjoin (Matrix ws, Matrix xs, Matrix ys, Matrix zs) =
Matrix ((zipWith (++) ws xs) ++ (zipWith (++) ys zs))
msplit :: Matrix a -> (Matrix a, Matrix a, Matrix a, Matrix a)
msplit (Matrix (row:rows)) =
(Matrix [[first]], Matrix [top]
,Matrix left, Matrix rest )
where
(first:top) = row
(left,rest) = unzip (map (\(x:xs) -> ([x],xs)) rows)
instance Semiring a => Semiring (Matrix a) where
zero = Scalar zero
one = Scalar one
Scalar x <+> Scalar y = Scalar (x <+> y)
Matrix x <+> Matrix y =
Matrix (zipWith (zipWith (<+>)) x y)
Scalar x <+> m = m <+> Scalar x
Matrix [[x]] <+> Scalar y = Matrix [[x <+> y]]
x <+> y = mjoin (first <+> y, top, left, rest <+> y)
where (first, top, left, rest) = msplit x
Scalar x <.> Scalar y = Scalar (x <.> y)
Scalar x <.> Matrix y = Matrix ((map.map) (x<.>) y)
Matrix x <.> Scalar y = Matrix ((map.map) (<.>y) x)
Matrix x <.> Matrix y =
Matrix [ [ foldl1 (<+>) (zipWith (<.>) row col) | col <- cols ]
| row <- x ] where cols = transpose y
instance StarSemiring a => StarSemiring (Matrix a) where
star (Matrix [[x]]) = Matrix [[star x]]
star m = mjoin (first' <+> top' <.> rest' <.> left'
,top' <.> rest', rest' <.> left', rest')
where
(first, top, left, rest) = msplit m
first' = star first
top' = first' <.> top
left' = left <.> first'
rest' = star (rest <+> left' <.> top)
```

A lot of the use from semirings comes from “attaching” them to other values. Attaching a semiring to effects (in the form of an applicative) can give you *repetition* of those effects. The excellent ReplicateEffects library explores this concept in depth.

It’s based on this type:

```
data Replicate a b
= Nil
| Cons (Maybe b) (Replicate a (a -> b))
```

This type can be made to conform to `Semiring`

(and `Starsemiring`

, etc) trivially.

In the simplest case, it has the same behaviour as `replicateM`

. Even the more complex combinators, like `atLeast`

, can be built on `Alternative`

:

```
atLeast :: Alternative f => Int -> f a -> f [a]
atLeast m f = go (max 0 m) where
go 0 = many f
go n = liftA2 (:) f (go (n-1))
atMost :: Alternative f => Int -> f a -> f [a]
atMost m f = go (max 0 m) where
go 0 = pure []
go n = liftA2 (:) f (go (n-1)) <|> pure []
```

There are two main benefits over using the standard alternative implementation. First, you can choose greedy or lazy evaluation of the effects *after* the replication is built.

Secondly, the *order* of the effects doesn’t have to be specified. This allows you to execute permutations of the effects, in a permutation parser, for instance. The permutation is totally decoupled from the declaration of the repetition (it’s in a totally separate library, in fact: PermuteEffects). Its construction is reminiscent of the free alternative.

Having the replicate type conform to `Semiring`

is all well and good: what I’m interested in is seeing if its implementation is another semiring-based object in disguise. I’ll revisit this in a later post.

List comprehension notation is one of my all-time favourite bits of syntactic sugar. It seems almost *too* declarative to have a reasonable implementation strategy. The vast majority of the time, it actually works in a sensible way. There are exceptions, though. Take a reasonable definition of a list of Pythagorean triples:

`[ (x,y,z) | x <- [1..], y <- [1..], z <- [1..], x*x + y*y == z*z ]`

This expression will diverge without yielding a single triple. It will search through every possible value for `z`

before incrementing either `x`

or `y`

. Since there are infinite values for `z`

, it will never find a triple. In other words, vanilla list comprehensions in Haskell perform depth-first search.

In order to express other kinds of search (either breadth-first or depth-bounded), different monads are needed. These monads are explored in Fischer (2009) and Spivey (2009).

You can actually use the *exact* same notation as above with arbitrary alternative monads using `-XMonadComprehensions`

and `-XOverloadedLists`

.

```
trips :: ( Alternative m
, Monad m
, IsList (m Integer)
, Enum (Item (m Integer))
, Num (Item (m Integer)))
=> m (Integer,Integer,Integer)
trips = [ (x,y,z) | x <- [1..], y <- [1..], z <- [1..], x*x + y*y == z*z ]
```

So then, here’s the challenge: swap in different `m`

s via a type annotation, and prevent `trips`

from diverging before getting any triples.

As one example, here’s some code adapted from Fischer (2009):

```
instance (Monoid r, Applicative m) => Monoid (ContT r m a) where
mempty = ContT (const (pure mempty))
mappend (ContT f) (ContT g) = ContT (\x -> liftA2 mappend (f x) (g x))
newtype List a = List
{ runList :: forall m. Monoid m => Cont m a } deriving Functor
instance Foldable List where foldMap = flip (runCont.runList)
instance Show a => Show (List a) where show = show . foldr (:) []
instance Monoid (List a) where
mappend (List x) (List y) = List (mappend x y)
mempty = List mempty
instance Monoid a => Semiring (List a) where
zero = mempty
(<+>) = mappend
(<.>) = liftA2 mappend
one = pure mempty
bfs :: List a -> [a]
bfs = toList . fold . levels . anyOf
newtype Levels a = Levels { levels :: [List a] } deriving Functor
instance Applicative Levels where
pure x = Levels [pure x]
Levels fs <*> Levels xs = Levels [ f <*> x | f <- fs, x <- xs ]
instance Alternative Levels where
empty = Levels []
Levels x <|> Levels y = Levels (mempty : merge x y)
instance IsList (List a) where
type Item (List a) = a
fromList = anyOf
toList = foldr (:) []
instance Applicative List where
pure x = List (pure x)
(<*>) = ap
instance Alternative List where
empty = mempty
(<|>) = mappend
instance Monad List where
x >>= f = foldMap f x
anyOf :: (Alternative m, Foldable f) => f a -> m a
anyOf = getAlt . foldMap (Alt . pure)
merge :: [List a] -> [List a] -> [List a]
merge [] ys = ys
merge xs [] = xs
merge (x:xs) (y:ys) = mappend x y : merge xs ys
```

```
take 3 (bfs trips)
[(3,4,5),(4,3,5),(6,8,10)]
```

The only relevance to semirings is the merge function. The semiring over lists is the semiring over polynomials:

```
instance Semiring a => Semiring [a] where
one = [one]
zero = []
[] <+> ys = ys
xs <+> [] = xs
(x:xs) <+> (y:ys) = (x <+> y) : (xs <+> ys)
[] <.> _ = []
_ <.> [] = []
(x:xs) <.> (y:ys) =
(x <.> y) : (map (x <.>) ys <+> map (<.> y) xs <+> (xs <.> ys))
```

The `<+>`

is the same as the `merge`

function. I think the `<.>`

might be a more valid definition of the `<*>`

function, also.

```
instance Applicative Levels where
pure x = Levels [pure x]
Levels [] <*> _ = Levels []
_ <*> Levels [] = Levels []
Levels (f:fs) <*> Levels (x:xs) = Levels $
(f <*> x) : levels (Levels (fmap (f <*>) xs)
<|> Levels (fmap (<*> x) fs)
<|> (Levels fs <*> Levels xs))
```

I’ve only scratched the surface of this abstraction. There are several other interesting semirings: polynomials, logs, Viterbi, Łukasiewicz, languages, multisets, bidirectional parsers, etc. Hopefully I’ll eventually be able to put this stuff into a library or something. In the meantime, I definitely will write some posts on the application to context-free parsing, bidirectional parsing (I just read Breitner (2016)) and search.

Boom, H. J. 1981. “Further thoughts on Abstracto.” *Working Paper ELC-9, IFIP WG 2.1*. http://www.kestrel.edu/home/people/meertens/publications/papers/Abstracto_reader.pdf.

Breitner, Joachim. 2016. “Showcasing Applicative.” *Joachim Breitner’s Blog*. http://www.joachim-breitner.de/blog/710-Showcasing_Applicative.

Doel, Dan. 2015. “Free Monoids in Haskell.” *The Comonad.Reader*. http://comonad.com/reader/2015/free-monoids-in-haskell/.

Droste, Manfred, and Werner Kuich. 2009. “Semirings and Formal Power Series.” In *Handbook of Weighted Automata*, ed by. Manfred Droste, Werner Kuich, and Heiko Vogler, 1:3–28. Monographs in Theoretical Computer Science. An EATCS Series. Berlin, Heidelberg: Springer Berlin Heidelberg. http://staff.mmcs.sfedu.ru/~ulysses/Edu/Marktoberdorf_2009/working_material/Esparsa/Kuich.%20Semirings%20and%20FPS.pdf.

Erwig, Martin, and Steve Kollmansberger. 2006. “Functional pearls: Probabilistic functional programming in Haskell.” *Journal of Functional Programming* 16 (1): 21–34. doi:10.1017/S0956796805005721. http://web.engr.oregonstate.edu/~erwig/papers/abstracts.html#JFP06a.

Fischer, Sebastian. 2009. “Reinventing Haskell Backtracking.” In *Informatik 2009, Im Fokus das Leben (ATPS’09)*. GI Edition. http://www-ps.informatik.uni-kiel.de/~sebf/data/pub/atps09.pdf.

Hirschowitz, André, and Marco Maggesi. 2010. “Modules over monads and initial semantics.” *Information and Computation* 208 (5). Special Issue: 14th Workshop on Logic, Language, Information and Computation (WoLLIC 2007) (May): 545–564. doi:10.1016/j.ic.2009.07.003. https://pdfs.semanticscholar.org/3e0c/c79e8cda9246cb954da6fd8aaaa394fecdc3.pdf.

Kidd, Eric. 2007. “Build your own probability monads.” http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.129.9502&rep=rep1&type=pdf.

Kmett, Edward. 2011a. “Free Monads for Less (Part 2 of 3): Yoneda.” *The Comonad.Reader*. http://comonad.com/reader/2011/free-monads-for-less-2/.

———. 2011b. “Modules and Functional Linear Functionals.” *The Comonad.Reader*. http://comonad.com/reader/2011/free-modules-and-functional-linear-functionals/.

Larsen, Ken Friis. 2011. “Memory Efficient Implementation of Probability Monads.” http://www.diku.dk/~kflarsen/t/ProbMonad-unpublished.pdf.

O’Connor, Russell. 2011. “A Very General Method of Computing Shortest Paths.” *Russell O’Connor’s Blog*. http://r6.ca/blog/20110808T035622Z.html.

Piponi, Dan. 2009. “A Monad for Combinatorial Search with Heuristics.” *A Neighborhood of Infinity*. http://blog.sigfpe.com/2009/07/monad-for-combinatorial-search-with.html.

*Proceedings of the 17th International Symposium on Principles and Practice of Declarative Programming*, 196–207. ACM. doi:10.1145/2790449.2790514. http://www.fceia.unr.edu.ar/~mauro/pubs/FromMonoidstoNearsemirings.pdf.

*Journal of Functional Programming* 19 (3-4) (July): 469–487. doi:10.1017/S0956796809007321. https://pdfs.semanticscholar.org/db3e/373bb6e7e7837ebc524da0a25903958554ed.pdf.

Part 3 of a 3-part series on probability

Tags: Haskell

Previously, I tried to figure out how to make the probability monad more “listy”. I read a little more about the topic (especially Erwig and Kollmansberger 2006; and Kidd 2007).

I then thought about what a probability monad would look like if it was based on other data structures. I feel like the standard version really wants to be:

```
newtype ProperProb a = ProperProb
{ yes :: Map a (Product Rational) }
```

But of course a monad instance isn’t allowed.

Similar to a map, though, is a binary tree:

```
data BinaryTree a = Leaf
| Node (BinaryTree a) a (BinaryTree a)
```

And it feels better for probability - *flatter*, somehow. Transmuting it into a probability-thing:

```
data Odds a = Certain a
| Choice (Odds a) Rational (Odds a)
deriving (Eq, Functor, Foldable, Show)
```

That looks good to me. A choice between two different branches feels more natural than a choice between a head and a tail.

The fold is similar to before, with an unfold for good measure:

```
foldOdds :: (b -> Rational -> b -> b) -> (a -> b) -> Odds a -> b
foldOdds f b = r where
r (Certain x) = b x
r (Choice xs p ys) = f (r xs) p (r ys)
unfoldOdds :: (b -> Either a (b,Rational,b)) -> b -> Odds a
unfoldOdds f = r where
r b = case f b of
Left a -> Certain a
Right (x,p,y) -> Choice (r x) p (r y)
fi :: Bool -> a -> a -> a
fi True t _ = t
fi False _ f = f
```

I changed the pattern synonym a little:

```
unRatio :: Num a => Rational -> (a,a)
unRatio = numerator &&& denominator
>>> fromInteger *** fromInteger
pattern n :% d <- (unRatio -> (n,d))
```

Then, the `probOf`

function:

```
probOf :: Eq a => a -> Odds a -> Rational
probOf e = foldOdds f b where
b x = fi (e == x) 1 0
f x (n:%d) y = (x * n + y * d) / (n + d)
```

This version doesn’t have the option for short-circuiting on the first value it finds.

For generating from lists, you can try to evenly divide the list among each branch.

```
fromListOdds :: (([b], Int) -> Integer) -> (b -> a) -> [b] -> Maybe (Odds a)
fromListOdds fr e = r where
r [] = Nothing
r xs = Just (unfoldOdds f (xs, length xs))
f ([x],_) = Left (e x)
f (xs ,n) = Right ((ys,l), fr (ys,l) % fr (zs,r), (zs,r)) where
l = n `div` 2
r = n - l
(ys,zs) = splitAt l xs
equalOdds :: [a] -> Maybe (Odds a)
equalOdds = fromListOdds (fromIntegral . snd) id
fromDistrib :: [(a,Integer)] -> Maybe (Odds a)
fromDistrib = fromListOdds (sum . map snd . fst) fst
```

What’s really nice about this version is the fact that the old `append`

is just the `Choice`

constructor, leaving the instances to be really nice:

```
flatten :: Odds (Odds a) -> Odds a
flatten = foldOdds Choice id
instance Applicative Odds where
pure = Certain
fs <*> xs = flatten (fmap (<$> xs) fs)
instance Monad Odds where
x >>= f = flatten (f <$> x)
```

Finally, as a bonus, to remove duplicates:

```
lcd :: Foldable f => f Rational -> Integer
lcd = foldl' (\a e -> lcm a (denominator e)) 1
toDistrib :: Odds a -> [(a,Integer)]
toDistrib = factorOut . foldOdds f b where
b x = [(x,1)]
f l p r = (map.fmap) (n%t*) l ++ (map.fmap) (d%t*) r where
n = numerator p
d = denominator p
t = n + d
factorOut xs = (map.fmap) (numerator . (lcd'*)) xs where
lcd' = fromIntegral . lcd . map snd $ xs
counts :: (Ord a, Num n) => [(a,n)] -> [(a,n)]
counts =
Map.assocs .
Map.fromListWith (+)
compress :: Ord a => Odds a -> Odds a
compress xs = let Just ys = (fromDistrib . counts . toDistrib) xs in ys
```

After reading yet more on this, I found that the main issue with the monad is its performance. Two articles in particular: Larsen (2011), and Scibior, Ghahramani, and Gordon (2015), refer to a GADT implementation of the monad which maximises laziness.

Erwig, Martin, and Steve Kollmansberger. 2006. “Functional pearls: Probabilistic functional programming in haskell.” *Journal of Functional Programming* 16 (1): 21–34. http://icerote.net/doc/library/programming/fp/Probabilistic%20functional%20programming%20in%20Haskell.pdf.

Larsen, Ken Friis. 2011. “Memory efficient implementation of probability monads.” http://www.diku.dk/~kflarsen/t/ProbMonad-unpublished.pdf.

Scibior, A., Z. Ghahramani, and A. D. Gordon. 2015. “Practical probabilistic programming with monads.” In *2015 ACM SIGPLAN symposium on haskell*, 50:165–176. ACM. http://mlg.eng.cam.ac.uk/pub/pdf/SciGhaGor15.pdf.

Part 2 of a 3-part series on probability

Tags: Haskell

One of the more unusual monads is the “probability monad”:

```
newtype Probability a = Probability
{ runProb :: [(a,Rational)] }
data Coin = Heads | Tails
toss :: Probability Coin
toss = Probability [(Heads, 1 % 2), (Tails, 1 % 2)]
```

Although it’s a little inefficient, it’s an elegant representation. I’ve written about it before here.

It has some notable deficiencies, though. For instance: the user has to constantly check that all the probabilities add up to one. Its list can be empty, which doesn’t make sense. Also, individual outcomes can appear more than once in the same list.

A first go a fixing the problem might look something like this:

```
newtype Distrib a = Distrib
{ runDist :: [(a,Rational)] }
tossProb :: Distrib Coin
tossProb = Distrib [(Heads, 1), (Tails, 1)]
```

The type is the same as before: it’s the semantics which have changed. The second field of the tuples no longer have to add up to one. The list can still be empty, though, and now finding the probability of, say, the head, looks like this:

```
probHead :: Distrib a -> Rational
probHead (Distrib xs@((_,p):_)) = p / sum [ q | (_,q) <- xs ]
```

Infinite lists aren’t possible, either.

One other way to look at the problem is to mimic the structure of cons-lists. Something like this:

```
data Odds a = Certainly a
| Odds a Rational (Odds a)
deriving (Eq, Functor, Foldable, Show)
```

Here, the `Odds`

constructor (analogous to `(:)`

) contains the betting-style odds of the head element vs. *the rest of the list*. The coin from before is represented by:

```
tossOdds :: Odds Coin
tossOdds = Odds Heads (1 % 1) (Certainly Tails)
```

This representation has tons of nice properties. First, let’s use some pattern-synonym magic for rationals:

```
pattern (:%) :: Integer -> Integer -> Rational
pattern n :% d <- (numerator &&& denominator -> (n,d)) where
n :% d = n % d
```

Then, finding the probability of the head element is this:

```
probHeadOdds :: Odds a -> Rational
probHeadOdds (Certainly _) = 1
probHeadOdds (Odds _ (n :% d) _) = n :% (n + d)
```

The representation can handle infinite lists no problem:

```
probHeadOdds (Odds 'a' (1 :% 1) undefined)
1 % 2
```

Taking the tail preserves semantics, also. To do some more involved manipulation, a fold helper is handy:

```
foldOdds :: (a -> Rational -> b -> b) -> (a -> b) -> Odds a -> b
foldOdds f b = r where
r (Certainly x) = b x
r (Odds x p xs) = f x p (r xs)
```

You can use this function to find the probability of a given item:

```
probOfEvent :: Eq a => a -> Odds a -> Rational
probOfEvent e = foldOdds f b where
b x = if e == x then 1 else 0
f x n r = (if e == x then n else r) / (n + 1)
```

This assumes that each item only occurs once. A function which combines multiple events might look like this:

```
probOf :: (a -> Bool) -> Odds a -> Rational
probOf p = foldOdds f b where
b x = if p x then 1 else 0
f x n r = (if p x then r + n else r) / (n + 1)
```

Some utility functions to create `Odds`

:

```
equalOdds :: Foldable f => f a -> Maybe (Odds a)
equalOdds xs = case length xs of
0 -> Nothing
n -> Just (foldr f undefined xs (n - 1)) where
f y a 0 = Certainly y
f y a n = Odds y (1 % fromIntegral n) (a (n - 1))
fromDistrib :: [(a,Integer)] -> Maybe (Odds a)
fromDistrib [] = Nothing
fromDistrib xs = Just $ f (tot*lst) xs where
(tot,lst) = foldl' (\(!t,_) e -> (t+e,e)) (0,undefined) (map snd xs)
f _ [(x,_)] = Certainly x
f n ((x,p):xs) = Odds x (mp % np) (f np xs) where
mp = p * lst
np = n - mp
probOfEach :: Eq a => a -> Odds a -> Rational
probOfEach x xs = probOf (x==) xs
propOf :: Eq a => a -> [a] -> Maybe Rational
propOf _ [] = Nothing
propOf x xs = Just . uncurry (%) $
foldl' (\(!n,!m) e -> (if x == e then n+1 else n, m+1)) (0,0) xs
```

`propOf x xs == fmap (probOfEach x) (equalOdds xs)`

And finally, the instances:

```
append :: Odds a -> Rational -> Odds a -> Odds a
append = foldOdds f Odds where
f e r a p ys = Odds e ip (a op ys) where
ip = p * r / (p + r + 1)
op = p / (r + 1)
flatten :: Odds (Odds a) -> Odds a
flatten = foldOdds append id
instance Applicative Odds where
pure = Certainly
fs <*> xs = flatten (fmap (<$> xs) fs)
instance Monad Odds where
x >>= f = flatten (f <$> x)
```