Tags: Haskell, Dependent Types

A while ago I read this post on reddit (by David Feuer), about sorting traversables (which was a follow-up on this post by Will Fancher), and I was inspired to write some pseudo-dependently-typed Haskell. The post (and subsequent library) detailed how to use size-indexed heaps to perform fast, total sorting on any traversable. I ended up with a library which has five size-indexed heaps (Braun, pairing, binomial, skew, and leftist), each verified for structural correctness. I also included the non-indexed implementations of each for comparison (as well as benchmarks, tests, and all that good stuff).

The purpose of this post is to go through some of the tricks I used and problems I encountered writing a lot of type-level code in modern Haskell.

In order to index things by their size, we’ll need a type-level representation of size. We’ll use Peano numbers for now:

`data Peano = Z | S Peano`

`Z` stands for zero, and `S` for successor. The terseness is pretty necessary here, unfortunately: arithmetic becomes unreadable otherwise. The simplicity of this definition is useful for proofs and manipulation; however, any runtime representation of these numbers is going to be woefully slow.

With the `DataKinds` extension, the above is automatically promoted to the type-level, so we can write type-level functions (type families) on the `Peano` type:

```
type family Plus (n :: Peano) (m :: Peano) :: Peano where
  Plus Z m = m
  Plus (S n) m = S (Plus n m)
```

Here the `TypeFamilies` extension is needed. I’ll try to mention every extension I’m using as we go, but I might forget a few, so check the repository for all of the examples (quick aside: I *did* manage to avoid using `UndecidableInstances`, but more on that later). One pragma that’s worth mentioning is:

`{-# OPTIONS_GHC -fno-warn-unticked-promoted-constructors #-}`

This suppresses warnings on the definition of `Plus` above. Without it, GHC would want us to write:

```
type family Plus (n :: Peano) (m :: Peano) :: Peano where
  Plus 'Z m = m
  Plus ('S n) m = 'S (Plus n m)
```

I think that looks pretty ugly, and it can get much worse with more involved arithmetic. The only thing I have found the warnings useful for is `[]`: the type-level empty list gives an error in its unticked form.
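A quick way to sanity-check a type family like `Plus` is to state a small instance of it as a type equality: the `Refl` below only compiles if GHC reduces `Plus 1 2` to `3` at the type level. This is a standalone sketch, with the definitions restated so it runs on its own:

```haskell
{-# LANGUAGE DataKinds, TypeFamilies, TypeOperators #-}
module PlusCheck where

import Data.Type.Equality ((:~:) (Refl))

data Peano = Z | S Peano

type family Plus (n :: Peano) (m :: Peano) :: Peano where
  Plus 'Z m = m
  Plus ('S n) m = 'S (Plus n m)

-- This definition only type-checks if Plus 1 2 reduces to 3.
onePlusTwo :: Plus ('S 'Z) ('S ('S 'Z)) :~: 'S ('S ('S 'Z))
onePlusTwo = Refl
```

The same check can be done interactively with `:kind!` in GHCi, but stating it as a `Refl` keeps the check in the test suite.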

In the original post, a pairing heap (Fredman et al. 1986) was used, for its simplicity and performance. The implementation looked like this:

```
data Heap n a where
  E :: Heap Z a
  T :: a -> HVec n a -> Heap (S n) a

data HVec n a where
  HNil  :: HVec Z a
  HCons :: Heap m a -> HVec n a -> HVec (Plus m n) a
```

You immediately run into trouble when you try to define merge:

```
merge :: Ord a => Heap m a -> Heap n a -> Heap (Plus m n) a
merge E ys = ys
merge xs E = xs
merge h1@(T x xs) h2@(T y ys)
  | x <= y    = T x (HCons h2 xs)
  | otherwise = T y (HCons h1 ys)
```

Three errors show up here, but we’ll look at the first one:

`Could not deduce (m ~ (Plus m Z))`

GHC doesn’t know that $x = x + 0$. Somehow, we’ll have to *prove* that it does.

In a language with true dependent types, proving the proposition above is as simple as:

```
plusZeroNeutral : (n : Nat) -> n + 0 = n
plusZeroNeutral Z = Refl
plusZeroNeutral (S k) = cong (plusZeroNeutral k)
```

(this example is in Idris)

In Haskell, on the other hand, we can’t do the same: functions on the value-level `Peano` have no relationship with functions on the type-level `Peano`. There’s no way to automatically link or promote one to the other.

This is where singletons come in (Eisenberg and Weirich 2012). A singleton is a datatype which mirrors a type-level value exactly, except that it has a type parameter which matches the equivalent value on the type-level. In this way, we can write functions on the value-level which are linked to the type-level. Here’s a potential singleton for `Peano`:

```
data Natty n where
  Zy :: Natty Z
  Sy :: Natty n -> Natty (S n)
```

(we need `GADTs` for this example)

Now, when we pattern-match on `Natty`, we get a proof of whatever its type parameter was. Here’s a trivial example:

```
isZero :: Natty n -> Maybe (n :~: Z)
isZero Zy = Just Refl
isZero (Sy _) = Nothing
```

When we match on `Zy`, the *only value* which `n` could have been is `Z`, because the only way to construct `Zy` is if the type parameter is `Z`.

Using this technique, the `plusZeroNeutral` proof looks reasonably similar to the Idris version:

```
plusZeroNeutral :: Natty n -> Plus n Z :~: n
plusZeroNeutral Zy = Refl
plusZeroNeutral (Sy n) = case plusZeroNeutral n of
  Refl -> Refl
```
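To see the proof in action, here’s a self-contained sketch (definitions restated, with a length-indexed list as the indexed structure) where `gcastWith` uses the `Refl` to rewrite an index of `Plus n 'Z` into plain `n` — exactly the deduction GHC refused to make on its own:

```haskell
{-# LANGUAGE DataKinds, GADTs, TypeFamilies, TypeOperators #-}
module ProofUse where

import Data.Type.Equality ((:~:) (Refl), gcastWith)

data Peano = Z | S Peano

type family Plus (n :: Peano) (m :: Peano) :: Peano where
  Plus 'Z m = m
  Plus ('S n) m = 'S (Plus n m)

data Natty n where
  Zy :: Natty 'Z
  Sy :: Natty n -> Natty ('S n)

infixr 5 :-
data List n a where
  Nil  :: List 'Z a
  (:-) :: a -> List n a -> List ('S n) a

plusZeroNeutral :: Natty n -> Plus n 'Z :~: n
plusZeroNeutral Zy = Refl
plusZeroNeutral (Sy n) = case plusZeroNeutral n of
  Refl -> Refl

-- Rewrite the awkward index using the proof; without gcastWith,
-- GHC cannot deduce Plus n 'Z ~ n by itself.
fixIndex :: Natty n -> List (Plus n 'Z) a -> List n a
fixIndex n xs = gcastWith (plusZeroNeutral n) xs

demo :: Int
demo = case fixIndex (Sy Zy) ((7 :: Int) :- Nil) of
  x :- Nil -> x
```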

To generalize the singletons a little, we could probably use the singletons library, or we could roll our own:

```
data family The k :: k -> Type

data instance The Peano n where
  Zy :: The Peano Z
  Sy :: The Peano n -> The Peano (S n)

plusZeroNeutral :: The Peano n -> Plus n Z :~: n
plusZeroNeutral Zy = Refl
plusZeroNeutral (Sy n) = case plusZeroNeutral n of
  Refl -> Refl
```

The `The` naming is kind of cute, I think. It makes the signature look *almost* like the Idris version (`the` is a function from the Idris standard library). The `The` data family requires the `TypeInType` extension, which I’ll talk a little more about later.

There’s an issue with these kinds of proofs: the proof code runs *every time* it is needed. Since the same value is coming out the other end each time (`Refl`), this seems wasteful.

In a language like Idris, this problem is avoided by noticing that you’re only using the proof for its type information, and then erasing it at runtime. In Haskell, we can accomplish the same with a rule:

```
{-# NOINLINE plusZeroNeutral #-}
{-# RULES
"plusZeroNeutral" forall x. plusZeroNeutral x
    = unsafeCoerce (Refl :: 'Z :~: 'Z)
#-}
```

This basically says “if this type-checks, then the proof must exist, and therefore the proof must be valid. So don’t bother running it”. Unfortunately, that’s a *little bit* of a lie. It’s pretty easy to write a proof which type-checks but *isn’t* valid:

```
falseIsTrue :: False :~: True
falseIsTrue = falseIsTrue
```

We won’t be able to perform computations which rely on this proof in Haskell, though: because the computation will never terminate, the proof will never provide an answer. This means that, while the proof isn’t valid, it *is* type safe. That is, of course, unless we use our manual proof-erasure technique. The `RULES` pragma will happily replace it with the `unsafeCoerce` version, effectively introducing unsoundness into our proofs. The reason that this doesn’t cause a problem for a language like Idris is that Idris has a totality checker: you *can’t* write the above definition (with the totality checker turned on) in Idris.

So what’s the solution? Do we have to suffer through the slower proof code to maintain correctness? In reality, it’s usually OK to assume termination. It’s pretty easy to see that a proof like `plusZeroNeutral` is total. It’s worth bearing in mind, though, that until Haskell gets a totality checker (likely never, apparently) these proofs aren’t “proper”.

One extra thing: while you’re proving things in one area of your code, you might not have the relevant singleton handy. To generate them on-demand, you’ll need a typeclass:

```
class KnownSing (x :: k) where
  sing :: The k x

instance KnownSing Z where
  sing = Zy

instance KnownSing n => KnownSing (S n) where
  sing = Sy sing
```
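As a sketch of the class in use (definitions restated; `toInt` is a made-up helper, not from the library), `sing` conjures up the singleton for whatever index the surrounding type demands:

```haskell
{-# LANGUAGE DataKinds, GADTs, PolyKinds, TypeFamilies #-}
module SingDemo where

import Data.Kind (Type)

data Peano = Z | S Peano

data family The k :: k -> Type

data instance The Peano n where
  Zy :: The Peano 'Z
  Sy :: The Peano n -> The Peano ('S n)

class KnownSing (x :: k) where
  sing :: The k x

instance KnownSing 'Z where
  sing = Zy

instance KnownSing n => KnownSing ('S n) where
  sing = Sy sing

-- Demote a singleton back to a runtime Int, one Sy at a time.
toInt :: The Peano n -> Int
toInt Zy     = 0
toInt (Sy n) = 1 + toInt n

two :: Int
two = toInt (sing :: The Peano ('S ('S 'Z)))
```

Note that the annotation at the use site is all that picks the instance: the structure of the value is rebuilt, constructor by constructor, from the type alone.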

This kind of drives home the inefficiency of singleton-based proofs, and why it’s important to erase them aggressively.

One other way to solve these problems is to try to find a data structure which runs the proof code anyway, as part of its normal operation. As an example, consider a length-indexed list:

```
infixr 5 :-
data List n a where
  Nil  :: List Z a
  (:-) :: a -> List n a -> List (S n) a
```

You might worry that concatenation of two lists requires some expensive proof code, like `merge` for the pairing heap. Maybe surprisingly, the default implementation just works:

```
infixr 5 ++
(++) :: List n a -> List m a -> List (Plus n m) a
(++) Nil ys = ys
(++) (x :- xs) ys = x :- xs ++ ys
```

Why? Well, if you look back to the definition of `Plus`, it’s almost exactly the same as the definition of `(++)`. In effect, we’re using *lists* as the singleton for `Peano` here.

The question is, then: is there a heap which performs these proofs automatically for functions like merge? As far as I can tell: *almost*. First, though, a quick digression.

The standard definition of `++` on normal lists can be cleaned up a little with `foldr`:

```
(++) :: [a] -> [a] -> [a]
(++) = flip (foldr (:))
```

Can we get a similar definition for our length-indexed lists? Turns out we can, but the type of `foldr` needs to be a little different:

```
foldrList :: (forall x. a -> b x -> b (S x))
          -> b m -> List n a -> b (n + m)
foldrList f b Nil = b
foldrList f b (x :- xs) = f x (foldrList f b xs)

newtype Flip (f :: t -> u -> Type) (a :: u) (b :: t)
  = Flip { unFlip :: f b a }

foldrList1 :: (forall x. a -> b x c -> b (S x) c)
           -> b m c -> List n a -> b (n + m) c
foldrList1 f b
  = unFlip . foldrList (\e -> Flip . f e . unFlip) (Flip b)

infixr 5 ++
(++) :: List n a -> List m a -> List (n + m) a
(++) = flip (foldrList1 (:-))
```

So what’s the point of this more complicated version? Well, if this were normal Haskell, we might get some foldr-fusion or something (in reality we would probably use `augment` if that were the purpose).

With this type-level business, though, there’s a similar application: loop unrolling. Consider the natural-number type again: we can write a typeclass which will perform induction over it:

```
class KnownPeano (n :: Peano) where
  unrollRepeat :: Proxy n -> (a -> a) -> a -> a

instance KnownPeano Z where
  unrollRepeat _ = const id
  {-# INLINE unrollRepeat #-}

instance KnownPeano n => KnownPeano (S n) where
  unrollRepeat (_ :: Proxy (S n)) f x =
    f (unrollRepeat (Proxy :: Proxy n) f x)
  {-# INLINE unrollRepeat #-}
```

Because the recursion here calls a different `unrollRepeat` function in the “recursive” call, we get around the usual hurdle of not being able to inline recursive calls. That means that the whole loop will be unrolled, at compile-time. We can do the same for foldr:

```
class HasFoldr (n :: Peano) where
  unrollFoldr
    :: (forall x. a -> b x -> b (S x))
    -> b m
    -> List n a
    -> b (n + m)

instance HasFoldr Z where
  unrollFoldr _ b _ = b
  {-# INLINE unrollFoldr #-}

instance HasFoldr n => HasFoldr (S n) where
  unrollFoldr f b (x :- xs) = f x (unrollFoldr f b xs)
  {-# INLINE unrollFoldr #-}
```
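A tiny usage sketch of `unrollRepeat`, restated so it runs standalone: the type-level number says how many times to apply the function, and each application is a call to a *different* instance’s method, so the inliner can flatten the whole thing:

```haskell
{-# LANGUAGE DataKinds, KindSignatures, ScopedTypeVariables #-}
module Unroll where

import Data.Proxy (Proxy (Proxy))

data Peano = Z | S Peano

class KnownPeano (n :: Peano) where
  unrollRepeat :: Proxy n -> (a -> a) -> a -> a

instance KnownPeano 'Z where
  unrollRepeat _ = const id
  {-# INLINE unrollRepeat #-}

instance KnownPeano n => KnownPeano ('S n) where
  unrollRepeat (_ :: Proxy ('S n)) f x =
    f (unrollRepeat (Proxy :: Proxy n) f x)
  {-# INLINE unrollRepeat #-}

-- Four applications of (*2), i.e. 2^4.
sixteen :: Int
sixteen = unrollRepeat (Proxy :: Proxy ('S ('S ('S ('S 'Z))))) (* 2) 1
```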

I can’t think of many uses for this technique, but one that comes to mind is an n-ary uncurry (like Lisp’s apply):

```
infixr 5 :-
data List (xs :: [*]) where
  Nil  :: List '[]
  (:-) :: a -> List xs -> List (a ': xs)

class KnownList (xs :: [*]) where
  foldrT
    :: (forall y ys. y -> result ys -> result (y ': ys))
    -> result '[]
    -> List xs
    -> result xs

instance KnownList ('[] :: [*]) where
  foldrT _ = const
  {-# INLINE foldrT #-}

instance KnownList xs => KnownList (x ': xs) where
  foldrT f b (x :- xs) = f x (foldrT f b xs)
  {-# INLINE foldrT #-}

type family Func (xs :: [*]) (y :: *) where
  Func '[] y = y
  Func (x ': xs) y = x -> Func xs y

newtype FunType y xs = FunType
  { runFun :: Func xs y -> y
  }

uncurry
  :: KnownList xs
  => Func xs y -> List xs -> y
uncurry f l =
    runFun
      (foldrT
         (c (\x g h -> g (h x)))
         (FunType id)
         l)
      f
  where
    c :: (a -> ((Func xs y -> y) -> (Func zs z -> z)))
      -> (a -> (FunType y xs -> FunType z zs))
    c = coerce
    {-# INLINE c #-}
{-# INLINE uncurry #-}
```

I *think* that you can be guaranteed the above is inlined at compile-time, making it essentially equivalent to a handwritten `uncurry`.

Anyway, back to the size-indexed heaps. The reason that `(++)` worked so easily on lists is that a list can be thought of as the data-structure equivalent of the Peano numbers. Another numeric-system-based data structure is the binomial heap, which is based on binary numbering (I’m going mainly off of the description from Hinze 1999).

So, to work with binary numbers, let’s get some preliminaries on the type-level out of the way:

```
data instance The Bool x where
  Falsy :: The Bool False
  Truey :: The Bool True

data instance The [k] xs where
  Nily :: The [k] '[]
  Cony :: The k x -> The [k] xs -> The [k] (x : xs)

instance KnownSing True where
  sing = Truey

instance KnownSing False where
  sing = Falsy

instance KnownSing '[] where
  sing = Nily

instance (KnownSing xs, KnownSing x) => KnownSing (x : xs) where
  sing = Cony sing sing
```

We’ll represent a binary number as a list of Booleans:

```
type family Sum (x :: Bool) (y :: Bool) (cin :: Bool) :: Bool where
  Sum False False False = False
  Sum False False True  = True
  Sum False True  False = True
  Sum False True  True  = False
  Sum True  False False = True
  Sum True  False True  = False
  Sum True  True  False = False
  Sum True  True  True  = True

type family Carry (x :: Bool) (y :: Bool) (cin :: Bool)
                  (xs :: [Bool]) (ys :: [Bool]) :: [Bool] where
  Carry False False False xs ys = Add False xs ys
  Carry False False True  xs ys = Add False xs ys
  Carry False True  False xs ys = Add False xs ys
  Carry False True  True  xs ys = Add True  xs ys
  Carry True  False False xs ys = Add False xs ys
  Carry True  False True  xs ys = Add True  xs ys
  Carry True  True  False xs ys = Add True  xs ys
  Carry True  True  True  xs ys = Add True  xs ys

type family Add (cin :: Bool) (xs :: [Bool]) (ys :: [Bool]) :: [Bool] where
  Add c (x : xs) (y : ys) = Sum x y c : Carry x y c xs ys
  Add False '[] ys = ys
  Add False xs '[] = xs
  Add True  '[] ys = CarryOne ys
  Add True  xs '[] = CarryOne xs

type family CarryOne (xs :: [Bool]) :: [Bool] where
  CarryOne '[] = True : '[]
  CarryOne (False : xs) = True : xs
  CarryOne (True : xs) = False : CarryOne xs
```
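As before, we can spot-check the families with a type equality. With the bits stored least-significant first, three is `[True, True]` and one is `[True]`; their sum should be four, `[False, False, True]`. A standalone check (definitions restated, with ticked constructors):

```haskell
{-# LANGUAGE DataKinds, TypeFamilies, TypeOperators #-}
module AddCheck where

import Data.Type.Equality ((:~:) (Refl))

type family Sum (x :: Bool) (y :: Bool) (cin :: Bool) :: Bool where
  Sum 'False 'False 'False = 'False
  Sum 'False 'False 'True  = 'True
  Sum 'False 'True  'False = 'True
  Sum 'False 'True  'True  = 'False
  Sum 'True  'False 'False = 'True
  Sum 'True  'False 'True  = 'False
  Sum 'True  'True  'False = 'False
  Sum 'True  'True  'True  = 'True

type family Carry (x :: Bool) (y :: Bool) (cin :: Bool)
                  (xs :: [Bool]) (ys :: [Bool]) :: [Bool] where
  Carry 'False 'False 'False xs ys = Add 'False xs ys
  Carry 'False 'False 'True  xs ys = Add 'False xs ys
  Carry 'False 'True  'False xs ys = Add 'False xs ys
  Carry 'False 'True  'True  xs ys = Add 'True  xs ys
  Carry 'True  'False 'False xs ys = Add 'False xs ys
  Carry 'True  'False 'True  xs ys = Add 'True  xs ys
  Carry 'True  'True  'False xs ys = Add 'True  xs ys
  Carry 'True  'True  'True  xs ys = Add 'True  xs ys

type family Add (cin :: Bool) (xs :: [Bool]) (ys :: [Bool]) :: [Bool] where
  Add c (x ': xs) (y ': ys) = Sum x y c ': Carry x y c xs ys
  Add 'False '[] ys = ys
  Add 'False xs '[] = xs
  Add 'True  '[] ys = CarryOne ys
  Add 'True  xs '[] = CarryOne xs

type family CarryOne (xs :: [Bool]) :: [Bool] where
  CarryOne '[] = 'True ': '[]
  CarryOne ('False ': xs) = 'True ': xs
  CarryOne ('True ': xs) = 'False ': CarryOne xs

-- 3 + 1 = 4, least-significant bit first.
threePlusOne :: Add 'False '[ 'True, 'True] '[ 'True]
                :~: '[ 'False, 'False, 'True]
threePlusOne = Refl
```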

The odd definition of `Carry` is to avoid `UndecidableInstances`: if we had written, instead:

```
type family Carry (x :: Bool) (y :: Bool) (cin :: Bool) :: Bool where
  Carry False False False = False
  Carry False False True  = False
  Carry False True  False = False
  Carry False True  True  = True
  Carry True  False False = False
  Carry True  False True  = True
  Carry True  True  False = True
  Carry True  True  True  = True

type family Add (cin :: Bool) (xs :: [Bool]) (ys :: [Bool]) :: [Bool] where
  Add c (x : xs) (y : ys) = Sum x y c : Add (Carry x y c) xs ys
  Add False '[] ys = ys
  Add False xs '[] = xs
  Add True  '[] ys = CarryOne ys
  Add True  xs '[] = CarryOne xs
```

We would have been warned about nested type-family application.

Now we can base the merge function very closely on these type families. First, though, we’ll have to implement the heap.

There are different potential properties you can verify in a data structure. In the sort-traversable post, the property of interest was that the number of elements in the structure would stay the same after adding and removing some number $n$ of elements. For this post, we’ll also verify structural invariants. I won’t, however, verify the heap property. Maybe in a later post.

When indexing a data structure by its size, you encode an awful lot of information into the type signature: the type becomes very *specific* to the structure in question. It is possible, though, to encode a fair few structural invariants *without* getting so specific. Here’s a signature for a “perfect leaf tree”:

`data BalTree a = Leaf a | Node (BalTree (a,a))`

With that signature, it’s *impossible* to create a tree with more elements in its left branch than its right; the size of the tree, however, remains unspecified. You can use a similar trick to implement matrices which must be square (from Okasaki 1999): the usual trick (`type Matrix n a = List n (List n a)`) is too specific, providing size information at compile-time. If you’re interested in this approach, there are several more examples in Hinze (2001).
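To make the invariant concrete, here’s a small sketch: every `BalTree` holds exactly $2^d$ elements for some depth $d$, and the `size` helper below (my addition, not from the post) can compute that count without ever inspecting a leaf:

```haskell
module BalTreeDemo where

-- Each Node doubles the element type, so the element
-- count doubles with every level of nesting.
data BalTree a = Leaf a | Node (BalTree (a, a))

size :: BalTree a -> Int
size = go 1
  where
    -- The explicit signature is needed: go is polymorphically
    -- recursive (it calls itself at BalTree (a, a)).
    go :: Int -> BalTree a -> Int
    go n (Leaf _) = n
    go n (Node t) = go (2 * n) t

four :: BalTree Int
four = Node (Node (Leaf ((1, 2), (3, 4))))
```

Note the polymorphic recursion: the type of the elements changes at every level, which is exactly what pins the shape down.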

It is possible to go from the size-indexed version back to the non-indexed version, with an existential (`RankNTypes` for this example):

```
data ErasedSize f a = forall (n :: Peano). ErasedSize
  { runErasedSize :: f n a
  }
```

This will let you prove invariants in your implementation using an index, while keeping the user-facing type signature general and non-indexed.
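For example (a sketch, with the length-indexed list standing in for a heap): converting from an ordinary list has to hide the computed length behind the existential, because the length isn’t known at compile time:

```haskell
{-# LANGUAGE DataKinds, ExistentialQuantification, GADTs, KindSignatures #-}
module Erased where

data Peano = Z | S Peano

infixr 5 :-
data List (n :: Peano) a where
  Nil  :: List 'Z a
  (:-) :: a -> List n a -> List ('S n) a

data ErasedSize f a = forall (n :: Peano). ErasedSize
  { runErasedSize :: f n a
  }

-- The index is computed from the input but never
-- appears in the result type.
fromList :: [a] -> ErasedSize List a
fromList []       = ErasedSize Nil
fromList (x : xs) = case fromList xs of
  ErasedSize ys -> ErasedSize (x :- ys)

toList :: ErasedSize List a -> [a]
toList (ErasedSize xs) = go xs
  where
    go :: List n a -> [a]
    go Nil       = []
    go (y :- ys) = y : go ys
```

Inside `fromList` the index is still tracked (so the implementation is checked), but callers only ever see the erased type.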

Wasserman (2010) was able to encode all of the structural invariants of the binomial heap *without* indexing by its size (well, all invariants except truncation, which turned out to be important a little later). I’ll be using a similar approach, except I’ll leverage some of the newer bells and whistles in GHC. Where Wasserman’s version used types like this for the numbering:

```
data Zero a = Zero
data Succ rk a = BinomTree rk a :< rk a
data BinomTree rk a = BinomTree a (rk a)
```

We can reuse the type-level Peano numbers with a GADT:

```
infixr 5 :-
data Binomial xs rk a where
  Nil  :: Binomial '[] n a
  Skip :: Binomial xs (S rk) a -> Binomial (False : xs) rk a
  (:-) :: Tree rk a
       -> Binomial xs (S rk) a
       -> Binomial (True : xs) rk a

data Tree rk a = Root a (Node rk a)

infixr 5 :<
data Node n a where
  NilN :: Node Z a
  (:<) :: Tree n a -> Node n a -> Node (S n) a
```

The definition of `Tree` here ensures that any tree of rank $n$ has $2^n$ elements. The binomial heap, then, is a list of trees, in ascending order of size, with a `True` at every point in its type-level list where a tree is present, and a `False` wherever one is absent. In other words, the type-level list is a binary encoding of the number of elements the heap contains.

And here are the merge functions:

```
mergeTree :: Ord a => Tree rk a -> Tree rk a -> Tree (S rk) a
mergeTree xr@(Root x xs) yr@(Root y ys)
  | x <= y    = Root x (yr :< xs)
  | otherwise = Root y (xr :< ys)

merge
  :: Ord a
  => Binomial xs z a
  -> Binomial ys z a
  -> Binomial (Add False xs ys) z a
merge Nil ys = ys
merge xs Nil = xs
merge (Skip xs) (Skip ys) = Skip (merge xs ys)
merge (Skip xs) (y :- ys) = y :- merge xs ys
merge (x :- xs) (Skip ys) = x :- merge xs ys
merge (x :- xs) (y :- ys) = Skip (mergeCarry (mergeTree x y) xs ys)

mergeCarry
  :: Ord a
  => Tree rk a
  -> Binomial xs rk a
  -> Binomial ys rk a
  -> Binomial (Add True xs ys) rk a
mergeCarry t Nil ys = carryOne t ys
mergeCarry t xs Nil = carryOne t xs
mergeCarry t (Skip xs) (Skip ys) = t :- merge xs ys
mergeCarry t (Skip xs) (y :- ys) = Skip (mergeCarry (mergeTree t y) xs ys)
mergeCarry t (x :- xs) (Skip ys) = Skip (mergeCarry (mergeTree t x) xs ys)
mergeCarry t (x :- xs) (y :- ys) = t :- mergeCarry (mergeTree x y) xs ys

carryOne
  :: Ord a
  => Tree rk a -> Binomial xs rk a -> Binomial (CarryOne xs) rk a
carryOne t Nil = t :- Nil
carryOne t (Skip xs) = t :- xs
carryOne t (x :- xs) = Skip (carryOne (mergeTree t x) xs)
```

You’ll notice that no proofs are needed: that’s because the merge function itself mirrors the type family exactly, just as `++` for lists mirrored the `Plus` type family.

Of course, this structure is only verified insofar as you believe the type families. It does provide a degree of double-entry, though: any mistake in the type family will have to be mirrored in the merge function to type-check. On top of that, we can write some proofs of properties we might expect:

```
addCommutes
  :: The [Bool] xs
  -> The [Bool] ys
  -> Add False xs ys :~: Add False ys xs
addCommutes Nily _ = Refl
addCommutes _ Nily = Refl
addCommutes (Cony Falsy xs) (Cony Falsy ys) =
  gcastWith (addCommutes xs ys) Refl
addCommutes (Cony Truey xs) (Cony Falsy ys) =
  gcastWith (addCommutes xs ys) Refl
addCommutes (Cony Falsy xs) (Cony Truey ys) =
  gcastWith (addCommutes xs ys) Refl
addCommutes (Cony Truey xs) (Cony Truey ys) =
  gcastWith (addCommutesCarry xs ys) Refl

addCommutesCarry
  :: The [Bool] xs
  -> The [Bool] ys
  -> Add True xs ys :~: Add True ys xs
addCommutesCarry Nily _ = Refl
addCommutesCarry _ Nily = Refl
addCommutesCarry (Cony Falsy xs) (Cony Falsy ys) =
  gcastWith (addCommutes xs ys) Refl
addCommutesCarry (Cony Truey xs) (Cony Falsy ys) =
  gcastWith (addCommutesCarry xs ys) Refl
addCommutesCarry (Cony Falsy xs) (Cony Truey ys) =
  gcastWith (addCommutesCarry xs ys) Refl
addCommutesCarry (Cony Truey xs) (Cony Truey ys) =
  gcastWith (addCommutesCarry xs ys) Refl
```

Unfortunately, though, this method *does* require proofs (ugly proofs) for the delete-min operation. One of the issues is truncation: since the binary digits are stored least-significant-bit first, the same number can be represented with any number of trailing zeroes. This kept causing problems for me when it came to subtraction, and adding the requirement of no trailing zeroes (truncation) to the constructors for the heap was a pain, requiring extra proofs on merge to show that it preserves truncation.

Since some of these properties are much easier to verify on the type-level Peano numbers, one approach might be to convert back and forth between Peano numbers and binary, and use the proofs on Peano numbers instead.

```
type family BintoPeano (xs :: [Bool]) :: Peano where
  BintoPeano '[] = Z
  BintoPeano (False : xs) = BintoPeano xs + BintoPeano xs
  BintoPeano (True : xs) = S (BintoPeano xs + BintoPeano xs)
```

First problem: this requires `UndecidableInstances`. I’d *really* rather not have that turned on, to be honest. In Idris (and Agda), you can *prove* decidability using a number of different methods, but this isn’t available in Haskell yet.
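With the extension switched on, though, the family does compute. A standalone spot-check (definitions restated, with a Peano `+` family for the addition): binary `[True, True]` — three, least-significant bit first — converts to Peano three:

```haskell
{-# LANGUAGE DataKinds, TypeFamilies, TypeOperators, UndecidableInstances #-}
module BinToPeano where

import Data.Type.Equality ((:~:) (Refl))

data Peano = Z | S Peano

type family (n :: Peano) + (m :: Peano) :: Peano where
  'Z   + m = m
  'S n + m = 'S (n + m)

-- UndecidableInstances is needed because the arguments to (+)
-- are themselves type-family applications.
type family BintoPeano (xs :: [Bool]) :: Peano where
  BintoPeano '[] = 'Z
  BintoPeano ('False ': xs) = BintoPeano xs + BintoPeano xs
  BintoPeano ('True ': xs) = 'S (BintoPeano xs + BintoPeano xs)

three :: BintoPeano '[ 'True, 'True] :~: 'S ('S ('S 'Z))
three = Refl
```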

Regardless, we can push on.

To go in the other direction, we’ll need to calculate the parity of natural numbers. Taken from the Idris tutorial:

```
data Parity (n :: Peano) where
  Even :: The Peano n -> Parity (n + n)
  Odd  :: The Peano n -> Parity (S (n + n))

parity :: The Peano n -> Parity n
parity Zy = Even Zy
parity (Sy Zy) = Odd Zy
parity (Sy (Sy n)) = case parity n of
  Even m -> gcastWith (plusSuccDistrib m m) (Even (Sy m))
  Odd  m -> gcastWith (plusSuccDistrib m m) (Odd  (Sy m))

plusSuccDistrib :: The Peano n -> proxy m -> n + S m :~: S (n + m)
plusSuccDistrib Zy _ = Refl
plusSuccDistrib (Sy n) p = gcastWith (plusSuccDistrib n p) Refl
```

We need this function on the type-level, though, not the value-level: here, again, we run into trouble. What does `gcastWith` look like on the type-level? As far as I can tell, it doesn’t exist yet (although I haven’t looked deeply into the singletons library).

This idea of doing dependently-typed stuff on the type-level *started* to be possible with `TypeInType`. For instance, we could have defined our binary type as:

```
data Binary :: Peano -> Type where
  O :: Binary n -> Binary (n + n)
  I :: Binary n -> Binary (S (n + n))
  E :: Binary Z
```

And then the binomial heap as:

```
data Binomial (xs :: Binary n) (rk :: Peano) (a :: Type) where
  Nil  :: Binomial E n a
  Skip :: Binomial xs (S rk) a -> Binomial (O xs) rk a
  (:-) :: Tree rk a
       -> Binomial xs (S rk) a
       -> Binomial (I xs) rk a
```

What we’re doing here is indexing a type *by an indexed type*. This wasn’t possible in Haskell a few years ago. It still doesn’t get us a nice definition of subtraction, though.

It’s pretty clear that this approach gets tedious almost immediately. What’s more, if we want the proofs to be erased, we introduce potential for errors.

The solution? Beef up GHC’s typechecker with a plugin. I first came across this approach in Kenneth Foner’s talk at Compose. He used a plugin that called out to the Z3 theorem prover (from Diatchki 2015); I’ll use a simpler plugin which just normalizes type-literals.

From what I’ve used of these plugins so far, they seem to work really well. They’re very unobtrusive, only requiring a pragma at the top of your file:

`{-# OPTIONS_GHC -fplugin GHC.TypeLits.Normalise #-}`

The plugin is only called when GHC can’t unify two types: this means you don’t get odd-looking error messages in unrelated code (in fact, the error messages I’ve seen so far have been excellent—a real improvement on the standard error messages for type-level arithmetic). Another benefit is that we get to use type-level literals (`Nat`, imported from GHC.TypeLits), rather than the noisy-looking type-level Peano numbers.

```
data Tree n a = Root a (Node n a)

data Node :: Nat -> Type -> Type where
  NilN :: Node 0 a
  (:<) :: {-# UNPACK #-} !(Tree n a)
       -> Node n a
       -> Node (1 + n) a

mergeTree :: Ord a => Tree n a -> Tree n a -> Tree (1 + n) a
mergeTree xr@(Root x xs) yr@(Root y ys)
  | x <= y    = Root x (yr :< xs)
  | otherwise = Root y (xr :< ys)

infixr 5 :-
data Binomial :: Nat -> Nat -> Type -> Type where
  Nil  :: Binomial n 0 a
  (:-) :: {-# UNPACK #-} !(Tree z a)
       -> Binomial (1 + z) xs a
       -> Binomial z (1 + xs + xs) a
  Skip :: Binomial (1 + z) (1 + xs) a
       -> Binomial z (2 + xs + xs) a
```

This definition also ensures that the binomial heap has no trailing zeroes in its binary representation: the `Skip` constructor can only be applied to a heap bigger than zero.

Since we’re going to be looking at several different heaps, we’ll need a class to represent all of them:

```
class IndexedQueue h a where
  {-# MINIMAL insert, empty, minViewMay, minView #-}

  empty :: h 0 a

  minView :: h (1 + n) a -> (a, h n a)

  singleton :: a -> h 1 a
  singleton = flip insert empty

  insert :: a -> h n a -> h (1 + n) a

  minViewMay
    :: h n a
    -> (n ~ 0 => b)
    -> (forall m. (1 + m) ~ n => a -> h m a -> b)
    -> b

class IndexedQueue h a => MeldableIndexedQueue h a where
  merge :: h n a -> h m a -> h (n + m) a
```

You’ll need `MultiParamTypeClasses` for this one.

```
mergeB
  :: Ord a
  => Binomial z xs a -> Binomial z ys a -> Binomial z (xs + ys) a
mergeB Nil ys = ys
mergeB xs Nil = xs
mergeB (Skip xs) (Skip ys) = Skip (mergeB xs ys)
mergeB (Skip xs) (y :- ys) = y :- mergeB xs ys
mergeB (x :- xs) (Skip ys) = x :- mergeB xs ys
mergeB (x :- xs) (y :- ys) = Skip (mergeCarry (mergeTree x y) xs ys)

mergeCarry
  :: Ord a
  => Tree z a
  -> Binomial z xs a
  -> Binomial z ys a
  -> Binomial z (1 + xs + ys) a
mergeCarry !t Nil ys = carryOne t ys
mergeCarry !t xs Nil = carryOne t xs
mergeCarry !t (Skip xs) (Skip ys) = t :- mergeB xs ys
mergeCarry !t (Skip xs) (y :- ys) = Skip (mergeCarry (mergeTree t y) xs ys)
mergeCarry !t (x :- xs) (Skip ys) = Skip (mergeCarry (mergeTree t x) xs ys)
mergeCarry !t (x :- xs) (y :- ys) = t :- mergeCarry (mergeTree x y) xs ys

carryOne :: Ord a => Tree z a -> Binomial z xs a -> Binomial z (1 + xs) a
carryOne !t Nil = t :- Nil
carryOne !t (Skip xs) = t :- xs
carryOne !t (x :- xs) = Skip (carryOne (mergeTree t x) xs)

instance Ord a => MeldableIndexedQueue (Binomial 0) a where
  merge = mergeB
  {-# INLINE merge #-}

instance Ord a => IndexedQueue (Binomial 0) a where
  empty = Nil
  singleton x = Root x NilN :- Nil
  insert = merge . singleton
```

(`BangPatterns` for this example)

On top of that, it’s very easy to define delete-min:

```
minView xs = case minViewZip xs of
  Zipper x _ ys -> (x, ys)

minViewMay q b f = case q of
  Nil    -> b
  _ :- _ -> uncurry f (minView q)
  Skip _ -> uncurry f (minView q)

data Zipper a n rk = Zipper !a (Node rk a) (Binomial rk n a)

skip :: Binomial (1 + z) xs a -> Binomial z (xs + xs) a
skip x = case x of
  Nil    -> Nil
  Skip _ -> Skip x
  _ :- _ -> Skip x

data MinViewZipper a n rk where
  Infty :: MinViewZipper a 0 rk
  Min   :: {-# UNPACK #-} !(Zipper a n rk) -> MinViewZipper a (n + 1) rk

slideLeft :: Zipper a n (1 + rk) -> Zipper a (1 + n + n) rk
slideLeft (Zipper m (t :< ts) hs) = Zipper m ts (t :- hs)

pushLeft
  :: Ord a
  => Tree rk a
  -> Zipper a n (1 + rk)
  -> Zipper a (2 + n + n) rk
pushLeft c (Zipper m (t :< ts) hs) =
  Zipper m ts (Skip (carryOne (mergeTree c t) hs))

minViewZip :: Ord a => Binomial rk (1 + n) a -> Zipper a n rk
minViewZip (Skip xs) = slideLeft (minViewZip xs)
minViewZip (t@(Root x ts) :- f) = case minViewZipMay f of
  Min ex@(Zipper minKey _ _) | minKey < x -> pushLeft t ex
  _ -> Zipper x ts (skip f)

minViewZipMay :: Ord a => Binomial rk n a -> MinViewZipper a n rk
minViewZipMay (Skip xs) = Min (slideLeft (minViewZip xs))
minViewZipMay Nil = Infty
minViewZipMay (t@(Root x ts) :- f) = Min $ case minViewZipMay f of
  Min ex@(Zipper minKey _ _) | minKey < x -> pushLeft t ex
  _ -> Zipper x ts (skip f)
```

Similarly, compare the version of the pairing heap with the plugin:

```
data Heap n a where
  E :: Heap 0 a
  T :: a -> HVec n a -> Heap (1 + n) a

data HVec n a where
  HNil  :: HVec 0 a
  HCons :: Heap m a -> HVec n a -> HVec (m + n) a

insert :: Ord a => a -> Heap n a -> Heap (1 + n) a
insert x xs = merge (T x HNil) xs

merge :: Ord a => Heap m a -> Heap n a -> Heap (m + n) a
merge E ys = ys
merge xs E = xs
merge h1@(T x xs) h2@(T y ys)
  | x <= y    = T x (HCons h2 xs)
  | otherwise = T y (HCons h1 ys)

minView :: Ord a => Heap (1 + n) a -> (a, Heap n a)
minView (T x hs) = (x, mergePairs hs)

mergePairs :: Ord a => HVec n a -> Heap n a
mergePairs HNil = E
mergePairs (HCons h HNil) = h
mergePairs (HCons h1 (HCons h2 hs)) =
  merge (merge h1 h2) (mergePairs hs)
```

To the version without the plugin:

```
data Heap n a where
  E :: Heap Z a
  T :: a -> HVec n a -> Heap (S n) a

data HVec n a where
  HNil  :: HVec Z a
  HCons :: Heap m a -> HVec n a -> HVec (m + n) a

class Sized h where
  size :: h n a -> The Peano n

instance Sized Heap where
  size E = Zy
  size (T _ xs) = Sy (size xs)

plus :: The Peano n -> The Peano m -> The Peano (n + m)
plus Zy m = m
plus (Sy n) m = Sy (plus n m)

instance Sized HVec where
  size HNil = Zy
  size (HCons h hs) = size h `plus` size hs

insert :: Ord a => a -> Heap n a -> Heap (S n) a
insert x xs = merge (T x HNil) xs

merge :: Ord a => Heap m a -> Heap n a -> Heap (m + n) a
merge E ys = ys
merge xs E = case plusZeroNeutral (size xs) of Refl -> xs
merge h1@(T x xs) h2@(T y ys)
  | x <= y = case plusCommutative (size h2) (size xs) of
      Refl -> T x (HCons h2 xs)
  | otherwise = case plusSuccDistrib (size xs) (size ys) of
      Refl -> T y (HCons h1 ys)

minView :: Ord a => Heap (S n) a -> (a, Heap n a)
minView (T x hs) = (x, mergePairs hs)

mergePairs :: Ord a => HVec n a -> Heap n a
mergePairs HNil = E
mergePairs (HCons h HNil) = case plusZeroNeutral (size h) of Refl -> h
mergePairs (HCons h1 (HCons h2 hs)) =
  case plusAssoc (size h1) (size h2) (size hs) of
    Refl -> merge (merge h1 h2) (mergePairs hs)
```

The typechecker plugin makes it relatively easy to implement several other heaps: skew, Braun, etc. You’ll need one extra trick to implement a leftist heap, though. Let’s take a look at the unverified version:

```
data Leftist a
  = Leaf
  | Node {-# UNPACK #-} !Int
         a
         (Leftist a)
         (Leftist a)

rank :: Leftist s -> Int
rank Leaf = 0
rank (Node r _ _ _) = r
{-# INLINE rank #-}

mergeL :: Ord a => Leftist a -> Leftist a -> Leftist a
mergeL Leaf h2 = h2
mergeL h1 Leaf = h1
mergeL h1@(Node w1 p1 l1 r1) h2@(Node w2 p2 l2 r2)
  | p1 < p2 =
      if ll <= lr
        then Node (w1 + w2) p1 l1 (mergeL r1 h2)
        else Node (w1 + w2) p1 (mergeL r1 h2) l1
  | otherwise =
      if rl <= rr
        then Node (w1 + w2) p2 l2 (mergeL r2 h1)
        else Node (w1 + w2) p2 (mergeL r2 h1) l2
  where
    ll = rank r1 + w2
    lr = rank l1
    rl = rank r2 + w1
    rr = rank l2
```

In a weight-biased leftist heap, the left branch in any tree must have at least as many elements as the right branch. Ideally, we would encode that in the representation of size-indexed leftist heap:

```
data Leftist n a where
  Leaf :: Leftist 0 a
  Node :: !(The Nat (n + m + 1))
       -> a
       -> Leftist n a
       -> Leftist m a
       -> !(m <= n)
       -> Leftist (n + m + 1) a

rank :: Leftist n s -> The Nat n
rank Leaf = sing
rank (Node r _ _ _ _) = r
{-# INLINE rank #-}
```

Two problems, though: first of all, we need to be able to *compare* the sizes of two heaps in the merge function. If we were using the type-level Peano numbers, this would be too slow. More importantly, though, we need the comparison to provide a *proof* of the ordering, so that we can use it in the resulting `Node` constructor.

In Agda, the Peano type is actually backed by Haskell’s `Integer` at runtime. This allows compile-time proofs to be written about values which are calculated efficiently. We can mimic the same thing in Haskell with a newtype wrapper *around* `Integer`, with a phantom `Nat` parameter, if we promise to never put in an integer which has a different value to its phantom parameter. We can make this promise a little more trustworthy if we don’t export the newtype constructor.

```
newtype instance The Nat n where
  NatSing :: Integer -> The Nat n

instance KnownNat n => KnownSing n where
  sing = NatSing $ Prelude.fromInteger $ natVal (Proxy :: Proxy n)
```

`FlexibleInstances`

is needed for the instance. We can also encode all the necessary arithmetic:

```
infixl 6 +.
(+.) :: The Nat n -> The Nat m -> The Nat (n + m)
(+.) =
    (coerce :: (Integer -> Integer -> Integer)
            -> The Nat n -> The Nat m -> The Nat (n + m))
        (+)
{-# INLINE (+.) #-}
```

Finally, the compare function (`ScopedTypeVariables`

for this):

```
infix 4 <=.
(<=.) :: The Nat n -> The Nat m -> The Bool (n <=? m)
(<=.) (NatSing x :: The Nat n) (NatSing y :: The Nat m)
  | x <= y =
      case (unsafeCoerce (Refl :: True :~: True) :: (n <=? m) :~: True) of
        Refl -> Truey
  | otherwise =
      case (unsafeCoerce (Refl :: True :~: True) :: (n <=? m) :~: False) of
        Refl -> Falsy
{-# INLINE (<=.) #-}

totalOrder :: p n -> q m -> (n <=? m) :~: False -> (m <=? n) :~: True
totalOrder (_ :: p n) (_ :: q m) Refl =
    unsafeCoerce Refl :: (m <=? n) :~: True

type x <= y = (x <=? y) :~: True
```

It’s worth mentioning that all of these functions are somewhat axiomatic: there’s no checking of these definitions going on, and any later proofs are only correct in terms of these functions.

If we want our merge function to *really* look like the non-verified version, though, we’ll have to mess around with the syntax a little.

When matching on a singleton, *within* the case-match, proof of the singleton’s type is provided. For instance:

```
type family IfThenElse (c :: Bool) (true :: k) (false :: k) :: k where
  IfThenElse True true false = true
  IfThenElse False true false = false

intOrString :: The Bool cond -> IfThenElse cond Int String
intOrString Truey = 1
intOrString Falsy = "abc"
```
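This trick compiles and runs; here is a self-contained sketch, with `SBool` standing in for the post's `The Bool` singleton (the name `SBool` is mine, not from the library):

```haskell
{-# LANGUAGE DataKinds, GADTs, TypeFamilies #-}

-- SBool is a hypothetical stand-in for the post's `The Bool` singleton.
data SBool (b :: Bool) where
  Truey :: SBool 'True
  Falsy :: SBool 'False

type family IfThenElse (c :: Bool) (true :: k) (false :: k) :: k where
  IfThenElse 'True  true false = true
  IfThenElse 'False true false = false

-- The return type depends on which constructor we match on.
intOrString :: SBool cond -> IfThenElse cond Int String
intOrString Truey = 1
intOrString Falsy = "abc"
```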

In Haskell, since we can overload the if-then-else construct (with `RebindableSyntax`

), we can provide the same syntax, while hiding the dependent nature:

```
ifThenElse :: The Bool c -> (c :~: True -> a) -> (c :~: False -> a) -> a
ifThenElse Truey t _ = t Refl
ifThenElse Falsy _ f = f Refl
```

Finally, then, we can write the implementation for merge, which looks almost *exactly* the same as the non-verified merge:

```
instance Ord a => IndexedQueue Leftist a where
  minView (Node _ x l r _) = (x, merge l r)
  {-# INLINE minView #-}
  singleton x = Node sing x Leaf Leaf Refl
  {-# INLINE singleton #-}
  empty = Leaf
  {-# INLINE empty #-}
  insert = merge . singleton
  {-# INLINE insert #-}
  minViewMay Leaf b _ = b
  minViewMay (Node _ x l r _) _ f = f x (merge l r)

instance Ord a => MeldableIndexedQueue Leftist a where
  merge Leaf h2 = h2
  merge h1 Leaf = h1
  merge h1@(Node w1 p1 l1 r1 _) h2@(Node w2 p2 l2 r2 _)
    | p1 < p2 =
        if ll <=. lr
          then Node (w1 +. w2) p1 l1 (merge r1 h2)
          else Node (w1 +. w2) p1 (merge r1 h2) l1 . totalOrder ll lr
    | otherwise =
        if rl <=. rr
          then Node (w1 +. w2) p2 l2 (merge r2 h1)
          else Node (w1 +. w2) p2 (merge r2 h1) l2 . totalOrder rl rr
    where
      ll = rank r1 +. w2
      lr = rank l1
      rl = rank r2 +. w1
      rr = rank l2
  {-# INLINE merge #-}
```

What’s cool about this implementation is that it has the same performance as the non-verified version (if `Integer`

is swapped out for `Int`

, that is), and it *looks* pretty much the same. This is very close to static verification for free.

The `Sort`

type used in the original blog post can be generalized to *any* indexed container.

```
data Parts f g a b r where
  Parts :: (forall n. g (m + n) b -> (g n b, r))
        -> !(f m a)
        -> Parts f g a b r

instance Functor (Parts f g a b) where
  fmap f (Parts g h) =
    Parts (\h' -> case g h' of (remn, r) -> (remn, f r)) h
  {-# INLINE fmap #-}

instance (IndexedQueue f x, MeldableIndexedQueue f x) =>
         Applicative (Parts f g x y) where
  pure x = Parts (\h -> (h, x)) empty
  {-# INLINE pure #-}
  (Parts f (xs :: f m x) :: Parts f g x y (a -> b)) <*> Parts g (ys :: f n x) =
      Parts h (merge xs ys)
    where
      h :: forall o. g ((m + n) + o) y -> (g o y, b)
      h v = case f v of { (v', a) ->
            case g v' of { (v'', b) ->
            (v'', a b)}}
  {-# INLINABLE (<*>) #-}
```

This version doesn’t insist that you order the elements of the heap in any particular way: we could use indexed difference lists to reverse a container, or indexed lists to calculate permutations of a container, for instance.

I’d be very interested to see any other uses of these indexed heaps, if anyone has any ideas. Potentially they could be used anywhere there is a need for a heap which is known to be of a certain size (a true prime sieve, for instance).

I’ve explored all of these ideas here. It has implementations of all the heaps I mentioned, as well as the index-erasing type, and a size-indexed list, for reversing traversables. In the future, I might add things like a Fibonacci heap, or the optimal Brodal/Okasaki heap (Brodal and Okasaki 1996).

Brodal, Gerth Stølting, and Chris Okasaki. 1996. “Optimal Purely Functional Priority Queues.” *Journal of Functional Programming* 6 (6) (November): 839–857. doi:10.1017/S095679680000201X. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.48.973.

Diatchki, Iavor S. 2015. “Improving Haskell Types with SMT.” In *Proceedings of the 2015 ACM SIGPLAN Symposium on Haskell*, 1–10. Haskell ’15. New York, NY, USA: ACM. doi:10.1145/2804302.2804307. http://yav.github.io/publications/improving-smt-types.pdf.

Eisenberg, Richard A., and Stephanie Weirich. 2012. “Dependently Typed Programming with Singletons.” In *Proceedings of the 2012 Haskell Symposium*, 117–130. Haskell ’12. New York, NY, USA: ACM. doi:10.1145/2364506.2364522. http://cs.brynmawr.edu/~rae/papers/2012/singletons/paper.pdf.

Fredman, Michael L., Robert Sedgewick, Daniel D. Sleator, and Robert E. Tarjan. 1986. “The pairing heap: A new form of self-adjusting heap.” *Algorithmica* 1 (1-4) (January): 111–129. doi:10.1007/BF01840439. http://www.cs.princeton.edu/courses/archive/fall09/cos521/Handouts/pairingheaps.pdf.

Hinze, Ralf. 1999. “Functional Pearls: Explaining Binomial Heaps.” *Journal of Functional Programming* 9 (1) (January): 93–104. doi:10.1017/S0956796899003317. http://www.cs.ox.ac.uk/ralf.hinze/publications/#J1.

———. 2001. “Manufacturing datatypes.” *Journal of Functional Programming* 11 (5) (September): 493–524. doi:10.1017/S095679680100404X. http://www.cs.ox.ac.uk/ralf.hinze/publications/#J6.

Okasaki, Chris. 1999. “From Fast Exponentiation to Square Matrices: An Adventure in Types.” In *Proceedings of the ACM SIGPLAN International Conference on Functional Programming (ICFP’99), Paris, France, September 27-29, 1999*, 34:28. ACM. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.456.357&rep=rep1&type=pdf.

Wasserman, Louis. 2010. “Playing with Priority Queues.” *The Monad.Reader* 16 (16) (May): 37. https://themonadreader.files.wordpress.com/2010/05/issue16.pdf.

Tags: Haskell

I saw this post on reddit recently, and it got me thinking about recursion schemes. One of the primary motivations behind them is the reduction of boilerplate. The classic example is evaluation of arithmetic expressions:

```
data ExprF a
  = LitF Integer
  | (:+:) a a
  | (:*:) a a
  deriving Functor

type Expr = Fix ExprF

eval :: Expr -> Integer
eval = unfix >>> \case
  LitF n -> n
  x :+: y -> eval x + eval y
  x :*: y -> eval x * eval y
```

The calls to `eval`

are the boilerplate: this is where the main recursion scheme, `cata`

can help.

```
evalF :: Expr -> Integer
evalF = cata $ \case
  LitF n -> n
  x :+: y -> x + y
  x :*: y -> x * y
```
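For reference, the `Fix` and `cata` assumed throughout can be given minimal stand-alone definitions (the recursion-schemes versions are more general, but these suffice for the example):

```haskell
{-# LANGUAGE DeriveFunctor, LambdaCase #-}

-- Minimal stand-ins for recursion-schemes' Fix and cata.
newtype Fix f = Fix { unfix :: f (Fix f) }

-- A catamorphism: tear down a Fix-ed structure one layer at a time.
cata :: Functor f => (f a -> a) -> Fix f -> a
cata alg = alg . fmap (cata alg) . unfix

data ExprF a
  = LitF Integer
  | (:+:) a a
  | (:*:) a a
  deriving Functor

type Expr = Fix ExprF

evalF :: Expr -> Integer
evalF = cata $ \case
  LitF n -> n
  x :+: y -> x + y
  x :*: y -> x * y
```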

I still feel like there’s boilerplate, though. Ideally I’d like to write this:

```
evalF :: Expr -> Integer
evalF = cata $ ??? $ \case
  Lit -> id
  Add -> (+)
  Mul -> (*)
```

The `???`

needs to be filled in. It’s a little tricky, though: the type of the algebra changes depending on what expression it’s given. GADTs will allow us to attach types to cases:

```
data ExprI a r f where
  Lit :: ExprI a b (Integer -> b)
  Add :: ExprI a b (a -> a -> b)
  Mul :: ExprI a b (a -> a -> b)
```

The first type parameter is the same as the first type parameter to `ExprF`

. The second is the output type of the algebra, and the third is the type of the fold required to produce that output type. The third type parameter *depends* on the case matched in the GADT. Using this, we can write a function which converts a fold/pattern match to a standard algebra:

```
foldAlg :: (forall f. ExprI a r f -> f) -> (ExprF a -> r)
foldAlg f (LitF i) = f Lit i
foldAlg f (x :+: y) = f Add x y
foldAlg f (x :*: y) = f Mul x y
```

And finally, we can write the nice evaluation algebra:

```
evalF :: Expr -> Integer
evalF = cata $ foldAlg $ \case
  Lit -> id
  Add -> (+)
  Mul -> (*)
```
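Putting the GADT, `foldAlg`, and the algebra together gives a self-contained sketch (again using minimal `Fix`/`cata` stand-ins in place of recursion-schemes):

```haskell
{-# LANGUAGE DeriveFunctor, GADTs, LambdaCase, RankNTypes #-}

newtype Fix f = Fix { unfix :: f (Fix f) }

cata :: Functor f => (f a -> a) -> Fix f -> a
cata alg = alg . fmap (cata alg) . unfix

data ExprF a
  = LitF Integer
  | (:+:) a a
  | (:*:) a a
  deriving Functor

type Expr = Fix ExprF

-- Each constructor records, in its last type parameter, the type of the
-- function needed to handle that case.
data ExprI a r f where
  Lit :: ExprI a b (Integer -> b)
  Add :: ExprI a b (a -> a -> b)
  Mul :: ExprI a b (a -> a -> b)

-- Convert a pattern-match on ExprI into an ordinary algebra.
foldAlg :: (forall f. ExprI a r f -> f) -> ExprF a -> r
foldAlg f (LitF i)  = f Lit i
foldAlg f (x :+: y) = f Add x y
foldAlg f (x :*: y) = f Mul x y

evalF :: Expr -> Integer
evalF = cata $ foldAlg $ \case
  Lit -> id
  Add -> (+)
  Mul -> (*)
```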

I hacked together some quick template Haskell to generate the matchers over here. It uses a class `AsPatternFold`

:

```
class AsPatternFold x f | x -> f where
  foldMatch :: (forall a. f r a -> a) -> (x -> r)
```

And you generate the extra data type, with an instance, by doing this:

`makePatternFolds ''ExprF`

The code it generates can be used like this:

```
evalF :: Expr -> Integer
evalF = cata $ foldMatch $ \case
  LitI -> id
  (:+|) -> (+)
  (:*|) -> (*)
```

It’s terribly hacky at the moment; I may clean it up later.

There’s another approach to the same idea that is slightly more sensible, using record wildcards. You define a handler for your datatype (an algebra):

```
data ExprAlg a r
  = ExprAlg
  { litF :: Integer -> r
  , (+:) :: a -> a -> r
  , (*:) :: a -> a -> r }
```

Then, to use it, you define how to interact between the handler and the datatype, like before. The benefit is that record wildcard syntax allows you to piggyback on the function definition syntax, like so:

```
data ExprF a
  = LitF Integer
  | (:+:) a a
  | (:*:) a a

makeHandler ''ExprF

exprAlg :: ExprF Integer -> Integer
exprAlg = index ExprFAlg {..} where
  litF = id
  (+:) = (+)
  (*:) = (*)
```

This approach is much more principled: the `index`

function, for example, comes from the adjunctions package, from the `Representable`

class. That’s because those algebras are actually representable functors, with their representation being the thing they match. They also conform to a whole bunch of things automatically, letting you combine them in interesting ways.

Properly printing expressions, with minimal parentheses, is a surprisingly difficult problem. Ramsey (1998) provides a solution of the form:

```
isParens side (Assoc ao po) (Assoc ai pi) =
  pi <= po && (pi /= po || ai /= ao || ao /= side)
```
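Transcribed directly (with an ad-hoc `Assoc` type bundling associativity and precedence, and `pi` renamed to avoid the Prelude), the rule can be sanity-checked on familiar cases:

```haskell
data Side = L | R deriving Eq

-- Associativity and precedence of an operator.
data Assoc = Assoc Side Int

-- Does the inner expression, printed on the given side of the outer
-- operator, need parentheses?
isParens :: Side -> Assoc -> Assoc -> Bool
isParens side (Assoc ao po) (Assoc ai pin) =
  pin <= po && (pin /= po || ai /= ao || ao /= side)
```

For example, with `+` at precedence 6 and `*` at 7 (both left-associative), the left child of `*` needs parentheses when it is a `+`, and a right-associative `^` only needs them when it appears as its own *left* child.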

Using this, we can write an algebra for printing expressions. It should work in the general case, not just on the expression type defined above, so we need to make another unfixed functor to describe the printing of an expression:

```
data Side = L | R deriving Eq
data ShowExpr t e
  = ShowLit { _repr :: t }
  | Prefix  { _repr :: t, _assoc :: (Int,Side), _child :: e }
  | Postfix { _repr :: t, _assoc :: (Int,Side), _child :: e }
  | Binary  { _repr :: t, _assoc :: (Int,Side), _lchild :: e, _rchild :: e }
  deriving Functor

makeLenses ''ShowExpr
```

The lenses are probably overkill. For printing, we need not only the precedence of the current level, but also the precedence one level below. Seems like the perfect case for a zygomorphism:

```
showExprAlg :: Semigroup t
            => (t -> t)
            -> ShowExpr t (Maybe (Int,Side), t)
            -> t
showExprAlg prns = \case
    ShowLit t -> t
    Prefix t s (q,y) -> t <> ifPrns R s q y
    Postfix t s (p,x) -> ifPrns L s p x <> t
    Binary t s (p,x) (q,y) -> ifPrns L s p x <> t <> ifPrns R s q y
  where
    ifPrns sid (op,oa) (Just (ip,ia))
      | ip < op || ip == op && (ia /= oa || sid /= oa) = prns
    ifPrns _ _ _ = id
```

The first argument to this algebra is the parenthesizing function. This algebra works fine for when the `ShowExpr`

type is already constructed:

```
showExpr' :: Semigroup t => (t -> t) -> Fix (ShowExpr t) -> t
showExpr' = zygo (preview assoc) . showExprAlg
```

But we still need to construct the `ShowExpr`

from something else first. `hylo`

might be a good fit:

`hylo :: Functor f => (f b -> b) -> (a -> f a) -> a -> b`

But that performs a catamorphism after an anamorphism, and we want a zygomorphism after an anamorphism. Luckily, the recursion-schemes library is constructed in such a way that different schemes can be stuck together relatively easily:

```
hylozygo
    :: Functor f
    => (f a -> a) -> (f (a, b) -> b) -> (c -> f c) -> c -> b
hylozygo x y z = ghylo (distZygo x) distAna y (fmap Identity . z)

showExpr :: Semigroup t
         => (t -> t)
         -> (e -> ShowExpr t e)
         -> e -> t
showExpr = hylozygo (preview assoc) . showExprAlg
```

Let’s try it out, with a right-associative operator this time to make things more difficult:

```
data ExprF a
  = LitF Integer
  | (:+:) a a
  | (:*:) a a
  | (:^:) a a
  deriving Functor

makeHandler ''ExprF

newtype Expr = Expr { runExpr :: ExprF Expr }

instance Num Expr where
  fromInteger = Expr . LitF
  x + y = Expr (x :+: y)
  x * y = Expr (x :*: y)

infixr 8 ^*
(^*) :: Expr -> Expr -> Expr
x ^* y = Expr (x :^: y)

instance Show Expr where
  show =
    showExpr
      (\x -> "(" ++ x ++ ")")
      (index ExprFAlg {..} . runExpr)
    where
      litF = ShowLit . show
      (+:) = Binary " + " (6,L)
      (*:) = Binary " * " (7,L)
      (^:) = Binary " ^ " (8,R)
```

Since we only specified `Semigroup`

in the definition of `showExpr`

, we can use the more efficient difference-list definition of `Show`

:

```
instance Show Expr where
  showsPrec _ =
    appEndo . showExpr
      (Endo . showParen True . appEndo)
      (index ExprFAlg {..} . runExpr)
    where
      litF = ShowLit . Endo . shows
      (+:) = Binary (Endo (" + " ++)) (6,L)
      (*:) = Binary (Endo (" * " ++)) (7,L)
      (^:) = Binary (Endo (" ^ " ++)) (8,R)

1 ^* 2 ^* 3 -- 1 ^ 2 ^ 3
(1 ^* 2) ^* 3 -- (1 ^ 2) ^ 3
1 * 2 + 3 :: Expr -- 1 * 2 + 3
1 * (2 + 3) :: Expr -- 1 * (2 + 3)
```

Ramsey, Norman. 1998. “Unparsing Expressions With Prefix and Postfix Operators.” *Software—Practice & Experience* 28 (12): 1327–1356. http://www.cs.tufts.edu/%7Enr/pubs/unparse-abstract.html.

Tags: Haskell

In Haskell restricted monads are monads which can’t contain every type. `Set`

is a good example. If you look in the documentation for Data.Set you’ll see several functions which correspond to functions in the Functor/Applicative/Monad typeclass hierarchy:

```
map :: Ord b => (a -> b) -> Set a -> Set b
singleton :: a -> Set a
foldMap :: Ord b => (a -> Set b) -> Set a -> Set b -- specialized
```

Unfortunately, though, `Set`

can’t conform to `Functor`

, because the signature of `fmap`

looks like this:

`fmap :: Functor f => (a -> b) -> f a -> f b`

It doesn’t have an `Ord`

constraint.

This is annoying: when using `Set`

, lots of things have to be imported qualified, and you have to remember the slightly different names of extra functions like `map`

. More importantly, you’ve lost the ability to write generic code over `Functor`

or `Monad`

which will work on `Set`

.

There are a number of ways to get around this problem. Here, an approach using reflection-reification is explored. These are the types involved:

```
newtype SetC a =
  SetC { unSetC :: forall r. Ord r => (a -> Set r) -> Set r }

reifySet :: Ord r => SetC r -> Set r
reifySet m = unSetC m singleton

reflectSet :: Ord r => Set r -> SetC r
reflectSet s = SetC $ \k -> S.foldr (\x r -> k x `union` r) S.empty s
```
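Because `SetC` is `Cont`-shaped, it picks up ordinary `Functor`, `Applicative`, and `Monad` instances with no `Ord` constraint anywhere; `Ord` only reappears at the reify/reflect boundary. A runnable sketch (the instances are the standard continuation ones, written out by hand):

```haskell
{-# LANGUAGE RankNTypes #-}

import           Data.Set (Set, singleton, union)
import qualified Data.Set as S

newtype SetC a =
  SetC { unSetC :: forall r. Ord r => (a -> Set r) -> Set r }

reifySet :: Ord r => SetC r -> Set r
reifySet m = unSetC m singleton

reflectSet :: Ord r => Set r -> SetC r
reflectSet s = SetC $ \k -> S.foldr (\x r -> k x `union` r) S.empty s

-- The usual continuation instances: no Ord needed.
instance Functor SetC where
  fmap f (SetC m) = SetC (\k -> m (k . f))

instance Applicative SetC where
  pure x = SetC ($ x)
  fs <*> xs = SetC (\k -> unSetC fs (\f -> unSetC xs (k . f)))

instance Monad SetC where
  SetC m >>= f = SetC (\k -> m (\x -> unSetC (f x) k))
```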

`SetC`

is just `Cont`

in disguise. In fact, we can generalize this pattern, using Constraint Kinds:

```
newtype FreeT c m a =
  FreeT { runFreeT :: forall r. c r => (a -> m r) -> m r }

reifySet :: Ord a => FreeT Ord Set a -> Set a
reifySet m = runFreeT m singleton

reflectSet :: Set r -> FreeT Ord Set r
reflectSet s = FreeT $ \k -> S.foldr (\x r -> k x `union` r) S.empty s
```

`FreeT`

looks an *awful lot* like `ContT`

by now. The type has some other interesting applications, though. For instance, this type:

`type FM = FreeT Monoid Identity`

Is the free monoid. If we use a transformers-style type synonym, the naming becomes even nicer:

```
type Free c = FreeT c Identity

runFree :: c r => Free c a -> (a -> r) -> r
runFree xs f = runIdentity (runFreeT xs (pure . f))

instance Foldable (Free Monoid) where
  foldMap = flip runFree
```
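To see the free-monoid behaviour concretely, we can embed elements and interpret into any monoid we like. `single` and `append` are names I've made up for this sketch:

```haskell
{-# LANGUAGE ConstraintKinds, KindSignatures, RankNTypes #-}

import Data.Functor.Identity (Identity (..))
import Data.Kind (Constraint, Type)

newtype FreeT (c :: Type -> Constraint) m a =
  FreeT { runFreeT :: forall r. c r => (a -> m r) -> m r }

type Free c = FreeT c Identity

runFree :: c r => Free c a -> (a -> r) -> r
runFree xs f = runIdentity (runFreeT xs (pure . f))

-- Embed a single element into the free monoid.
single :: a -> Free Monoid a
single x = FreeT (\k -> k x)

-- The free monoid's own mappend: interpret both sides, then combine.
append :: Free Monoid a -> Free Monoid a -> Free Monoid a
append xs ys = FreeT (\k -> runFreeT xs k <> runFreeT ys k)
```

Interpreting with `\x -> [x]` recovers a list; interpreting with `id` at a `String` concatenates, and so on for any monoid.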

Check out this package for an implementation of the non-transformer `Free`

.

This is still unsatisfying, though. Putting annotations around your code feels inelegant. The next solution is to replace the monad class altogether with our own, and turn on `-XRebindableSyntax`

. There are a few ways to design this new class. One option is to use multi-parameter type classes. Another solution is with an associated type:

```
class Functor f where
  type Suitable f a :: Constraint
  fmap :: Suitable f b => (a -> b) -> f a -> f b
```

This is similar to the approach taken in the rmonad library, except that library doesn’t use constraint kinds (they weren’t available when the library was made), so it has to make do with a `Suitable`

class. Also, the signature for `fmap`

in rmonad is:

`fmap :: (Suitable f a, Suitable f b) => (a -> b) -> f a -> f b`

I don’t want to constrain `a`

: I figure if you can get something *into* your monad, it *must* be suitable. And I really want to reduce the syntactic overhead of writing extra types next to your functions.

There’s also the supermonad library out there which is much more general than any of these examples: it supports indexed monads as well as constrained ones.

Anyway, `Monad`

is defined similarly to `Functor`

:

```
class Functor m => Monad m where
  return :: Suitable m a => a -> m a
  (>>=) :: Suitable m b => m a -> (a -> m b) -> m b
```

Again, I want to minimize the use of `Suitable`

, so for `>>=`

there’s only a constraint on `b`

.

Finally, here’s the `Set`

instance:

```
instance Functor Set where
  type Suitable Set a = Ord a
  fmap = Set.map
```
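Here's the same thing as a self-contained sketch; the class and method are renamed to `CFunctor`/`cfmap` so they don't clash with the Prelude (the post shadows the Prelude instead, via `-XRebindableSyntax`):

```haskell
{-# LANGUAGE ConstraintKinds, TypeFamilies #-}

import           Data.Kind (Constraint)
import           Data.Set (Set)
import qualified Data.Set as Set

-- A constrained Functor: each instance chooses which element types it
-- can contain, via the associated Suitable constraint.
class CFunctor f where
  type Suitable f a :: Constraint
  cfmap :: Suitable f b => (a -> b) -> f a -> f b

instance CFunctor Set where
  type Suitable Set a = Ord a
  cfmap = Set.map
```

Note that mapping a non-injective function collapses elements, which is exactly why `Set.map` needs the `Ord` constraint in the first place.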

With equality constraints, you can actually make *monomorphic* containers conform to these classes (or, at least, wrappers around them).

```
import qualified Data.Text as Text

data Text a where
  Text :: Text.Text -> Text Char

instance Functor Text where
  type Suitable Text a = a ~ Char
  fmap f (Text xs) = Text (Text.map f xs)
```

This pattern can be generalized with some more GADT magic:

```
data Monomorphic xs a b where
  Monomorphic :: (a ~ b) => xs -> Monomorphic xs a b

instance (MonoFunctor xs, a ~ Element xs) => Functor (Monomorphic xs a) where
  type Suitable (Monomorphic xs a) b = a ~ b
  fmap f (Monomorphic xs) = Monomorphic (omap f xs)
```

Where `omap`

comes from the mono-traversable package. You could go a little further, to `Foldable`

:

```
instance (MonoFoldable xs, element ~ Element xs) =>
         Foldable (Monomorphic xs element) where
  foldr f b (Monomorphic xs) = ofoldr f b xs
  foldMap f (Monomorphic xs) = ofoldMap f xs
  foldl' f b (Monomorphic xs) = ofoldl' f b xs
  toList (Monomorphic xs) = otoList xs
  null (Monomorphic xs) = onull xs
  length (Monomorphic xs) = olength xs
  foldr1 f (Monomorphic xs) = ofoldr1Ex f xs
  elem x (Monomorphic xs) = oelem x xs
  maximum (Monomorphic xs) = maximumEx xs
  minimum (Monomorphic xs) = minimumEx xs
  sum (Monomorphic xs) = osum xs
  product (Monomorphic xs) = oproduct xs
```

Changing the `FreeT`

type above a little, we can go back to normal functors and monads, and write more general reify and reflect functions:

```
newtype FreeT m a =
  FreeT { runFreeT :: forall r. Suitable m r => (a -> m r) -> m r }

reify :: (Monad m, Suitable m a) => FreeT m a -> m a
reify = flip runFreeT return

reflect :: Monad m => m a -> FreeT m a
reflect x = FreeT (x >>=)
```

So now our types, when wrapped, can conform to the Prelude’s `Functor`

. It would be nice if this type could be written like so:

```
reify :: Monad m => FreeT (Suitable m) m a -> m a
reify = flip runFreeT return
reflect :: Monad m => m a -> FreeT (Suitable m) m a
reflect x = FreeT (x >>=)
```

But unfortunately type families cannot be partially applied.

The classes above aren’t very modern: they’re missing applicative. This one is tricky:

```
class Functor f => Applicative f where
  pure :: Suitable f a => a -> f a
  (<*>) :: Suitable f b => f (a -> b) -> f a -> f b
```

The issue is `f (a -> b)`

. There’s no *way* you’re getting some type like that into `Set`

. This means that `<*>`

is effectively useless. No problem, you think: define `liftA2`

instead:

```
class Functor f => Applicative f where
  pure :: Suitable f a => a -> f a
  liftA2 :: Suitable f c => (a -> b -> c) -> f a -> f b -> f c
  (<*>) :: Suitable f b => f (a -> b) -> f a -> f b
  (<*>) = liftA2 ($)
```

Great! Now we can use it with set. However, there’s no way (that I can see) to define the other lift functions: `liftA3`

, etc. Of course, if `>>=`

is available, it’s as simple as:

```
liftA3 f xs ys zs = do
  x <- xs
  y <- ys
  z <- zs
  pure (f x y z)
```

But now we can’t define it for non-monadic applicatives (square matrices, ZipLists, etc.). This also forces us to use `>>=`

when `<*>`

may have been more efficient.

The functions we’re interested in defining look like this:

```
liftA2 :: Suitable f c => (a -> b -> c) -> f a -> f b -> f c
liftA3 :: Suitable f d => (a -> b -> c -> d) -> f a -> f b -> f c -> f d
liftA4 :: Suitable f e => (a -> b -> c -> d -> e) -> f a -> f b -> f c -> f d -> f e
```

There’s a clear pattern, but no obvious way to abstract over it. Type-level shenanigans to the rescue!

The pattern might be expressed like this:

`liftA :: Func args -> Func lifted args`

We can store these types as heterogeneous lists:

```
infixr 5 :-
data Vect xs where
  Nil :: Vect '[]
  (:-) :: x -> Vect xs -> Vect (x ': xs)

infixr 5 :*
data AppVect f xs where
  NilA :: AppVect f '[]
  (:*) :: f x -> AppVect f xs -> AppVect f (x ': xs)
```

And `liftA`

can be represented like this:

```
liftA
  :: Suitable f b
  => (Vect xs -> b) -> AppVect f xs -> f b

liftA2
  :: Suitable f c
  => (a -> b -> c) -> f a -> f b -> f c
liftA2 f xs ys =
  liftA (\(x :- y :- Nil) -> f x y) (xs :* ys :* NilA)

liftA3
  :: Suitable f d
  => (a -> b -> c -> d) -> f a -> f b -> f c -> f d
liftA3 f xs ys zs =
  liftA (\(x :- y :- z :- Nil) -> f x y z) (xs :* ys :* zs :* NilA)
```

Cool! For unrestricted applicatives, we can define `liftA`

in terms of `<*>`

:

```
liftAP :: Prelude.Applicative f
       => (Vect xs -> b) -> (AppVect f xs -> f b)
liftAP f NilA = Prelude.pure (f Nil)
liftAP f (x :* NilA) = Prelude.fmap (f . (:- Nil)) x
liftAP f (x :* xs) = ((f .) . (:-)) Prelude.<$> x Prelude.<*> liftAP id xs
```

And for types with a monad instance, we can define it in terms of `>>=`

:

```
liftAM :: (Monad f, Suitable f b)
       => (Vect xs -> b) -> (AppVect f xs -> f b)
liftAM f NilA = pure (f Nil)
liftAM f (x :* NilA) = fmap (f . (:- Nil)) x
liftAM f (x :* xs) = x >>= \y -> liftAM (f . (y :-)) xs
```

This approach is *really* slow. Every function wraps up its arguments in a `Vect`

, and it’s just generally awful.

What about *not* wrapping up the function? Type families can help here:

```
type family FunType (xs :: [*]) (y :: *) :: * where
  FunType '[] y = y
  FunType (x ': xs) y = x -> FunType xs y
```

It gets really difficult to define `liftA`

using `<*>`

now, though. `liftAM`

, on the other hand, is a breeze:

```
liftAM :: (Monad f, Suitable f a) => FunType xs a -> AppVect f xs -> f a
liftAM f NilA = pure f
liftAM f (x :* xs) = x >>= \y -> liftAM (f y) xs
```

And no vector constructors on the right of the bind!

Still, no decent definition using `<*>`

. The problem is that we’re using a cons-list to represent a function’s arguments, but `<*>`

is left-associative, so it builds up arguments as a snoc list. Let’s try using a snoc-list in the type family:

```
infixl 5 :>
data AppVect f xs where
  Nil :: AppVect f '[]
  (:>) :: AppVect f xs -> f x -> AppVect f (x ': xs)

type family FunType (xs :: [*]) (y :: *) :: * where
  FunType '[] y = y
  FunType (x ': xs) y = FunType xs (x -> y)

liftA
  :: Suitable f a
  => FunType xs a -> AppVect f xs -> f a
```

`liftAP`

now gets a natural definition:

```
liftAP :: Prelude.Applicative f => FunType xs a -> AppVect f xs -> f a
liftAP f Nil = Prelude.pure f
liftAP f (Nil :> xs) = Prelude.fmap f xs
liftAP f (ys :> xs) = liftAP f ys Prelude.<*> xs
```
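This version typechecks and runs; here it is as a self-contained sketch, checked against `Maybe` with the ordinary `Applicative` class (so no `Suitable` machinery is needed):

```haskell
{-# LANGUAGE DataKinds, GADTs, TypeFamilies, TypeOperators #-}

import Data.Kind (Type)

infixl 5 :>
-- A snoc-list of effectful arguments: new arguments go on the right,
-- matching <*>'s left-associativity.
data AppVect f xs where
  Nil  :: AppVect f '[]
  (:>) :: AppVect f xs -> f x -> AppVect f (x ': xs)

type family FunType (xs :: [Type]) (y :: Type) :: Type where
  FunType '[] y = y
  FunType (x ': xs) y = FunType xs (x -> y)

liftAP :: Applicative f => FunType xs a -> AppVect f xs -> f a
liftAP f Nil        = pure f
liftAP f (Nil :> x) = fmap f x
liftAP f (ys :> x)  = liftAP f ys <*> x
```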

But what about `liftAM`

? It’s much more difficult, fundamentally because `>>=`

builds up arguments as a cons-list. To convert between the two efficiently, we need to use the trick for reversing lists efficiently: build up the reversed list as you go.

```
liftAM :: (Monad f, Suitable f a) => FunType xs a -> AppVect f xs -> f a
liftAM = go pure
  where
    go :: (Suitable f b, Monad f)
       => (a -> f b) -> FunType xs a -> AppVect f xs -> f b
    go f g Nil = f g
    go f g (xs :> x) = go (\c -> x >>= f . c) g xs
```
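As a sanity check, the same accumulator trick can be sketched against the ordinary `Monad` class (so it runs without the constrained classes from earlier); note the recursion is polymorphic, which is why `go` needs its own signature:

```haskell
{-# LANGUAGE DataKinds, GADTs, TypeFamilies, TypeOperators, RankNTypes #-}

import Data.Kind (Type)

infixl 5 :>
data AppVect f xs where
  Nil  :: AppVect f '[]
  (:>) :: AppVect f xs -> f x -> AppVect f (x ': xs)

type family FunType (xs :: [Type]) (y :: Type) :: Type where
  FunType '[] y = y
  FunType (x ': xs) y = FunType xs (x -> y)

-- Build up the continuation as we walk the snoc-list, so the arguments
-- come out in the right order -- the same trick as efficient list reversal.
liftAM :: Monad f => FunType xs a -> AppVect f xs -> f a
liftAM = go pure
  where
    go :: Monad f => (a -> f b) -> FunType xs a -> AppVect f xs -> f b
    go f g Nil       = f g
    go f g (xs :> x) = go (\c -> x >>= f . c) g xs
```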

Using these definitions, we can make `Set`

, `Text`

, and all the rest of them applicatives, while preserving the applicative operations. Also, from my preliminary testing, there seems to be *no* overhead in using these new definitions for `<*>`

.

In Sculthorpe et al. (2013), there’s discussion of this type:

```
data NM :: (* -> Constraint) -> (* -> *) -> * -> * where
  Return :: a -> NM c t a
  Bind :: c x => t x -> (x -> NM c t a) -> NM c t a
```

This type allows constrained monads to become normal monads. It can be used for the same purpose as the `FreeT`

type from above. In the paper, the free type is called `RCodT`

.

One way to look at the type is as a concrete representation of the monad class, with each method being a constructor.

You might wonder if there are similar constructs for functor and applicative. Functor is simple:

```
data NF :: (* -> Constraint) -> (* -> *) -> * -> * where
  FMap :: c x => (x -> a) -> t x -> NF c t a
```

Again, this can conform to functor (and *only* functor), and can be interpreted when the final type is `Suitable`

.

Like above, it has a continuation version, Yoneda.

For applicatives, though, the situation is different. In the paper, they weren’t able to define a transformer for applicatives that could be interpreted in some restricted applicative. I needed one because I wanted to use `-XApplicativeDo`

notation: the desugaring uses `<*>`

, not the `liftAn`

functions, so I wanted to construct a free applicative using `<*>`

, and run it using the lift functions. What I managed to cobble together doesn’t *really* solve the problem, but it works for `-XApplicativeDo`

!

The key with a lot of this was realizing that `<*>`

is *snoc*, not cons. Using a free applicative:

```
data Free f a where
  Pure :: a -> Free f a
  Ap :: Free f (a -> b) -> f a -> Free f b

instance Prelude.Functor (Free f) where
  fmap f (Pure a) = Pure (f a)
  fmap f (Ap x y) = Ap ((f .) Prelude.<$> x) y

instance Prelude.Applicative (Free f) where
  pure = Pure
  Pure f <*> y = Prelude.fmap f y
  Ap x y <*> z = Ap (flip Prelude.<$> x Prelude.<*> z) y
```

This type can conform to `Applicative`

and `Functor`

no problem. And all it needs to turn back into a constrained applicative is for the outer type to be suitable:

```
lift :: f a -> Free f a
lift = Ap (Pure id)

lower
  :: forall f a c.
     Free f a
  -> (forall xs. FunType xs a -> AppVect f xs -> f c)
  -> f c
lower (Pure x) f = f x Nil
lower (Ap fs x :: Free f a) f =
  lower fs (\ft av -> f ft (av :> x))

lowerConstrained
  :: (Constrained.Applicative f, Suitable f a)
  => Free f a -> f a
lowerConstrained x = lower x liftA
```

There’s probably a more efficient way to encode it, though.

Sculthorpe, Neil, Jan Bracker, George Giorgidze, and Andy Gill. 2013. “The Constrained-monad Problem.” In *Proceedings of the 18th ACM SIGPLAN International Conference on Functional Programming*, 287–298. ICFP ’13. New York, NY, USA: ACM. doi:10.1145/2500365.2500602. http://ku-fpg.github.io/files/Sculthorpe-13-ConstrainedMonad.pdf.

Tags: Haskell

I’ve been playing around a lot with semirings recently. A semiring is anything with addition, multiplication, zero and one. You can represent that in Haskell as:

```
class Semiring a where
  zero :: a
  one :: a
  infixl 7 <.>
  (<.>) :: a -> a -> a
  infixl 6 <+>
  (<+>) :: a -> a -> a
```

It’s kind of like a combination of two monoids. It has the normal monoid laws:

```
x <+> (y <+> z) = (x <+> y) <+> z
x <.> (y <.> z) = (x <.> y) <.> z
x <+> zero = zero <+> x = x
x <.> one = one <.> x = x
```

And a few extra:

```
x <+> y = y <+> x
x <.> (y <+> z) = (x <.> y) <+> (x <.> z)
(x <+> y) <.> z = (x <.> z) <+> (y <.> z)
zero <.> a = a <.> zero = zero
```

I should note that what I’m calling a semiring here is often called a rig. I actually prefer the name “rig”: a rig is a ring without **n**egatives (cute!); whereas a *semi*ring is a rig without neutral elements, which mirrors the definition of a semigroup. The nomenclature in this area is a bit of a mess, though, so I went with the more commonly-used name for the sake of googleability.

At first glance, it looks quite numeric. Indeed, PureScript uses it as the basis for its numeric hierarchy. (In my experience so far, it’s nicer to use than Haskell’s `Num`

)

```
instance Semiring Integer where
  zero = 0
  one = 1
  (<+>) = (+)
  (<.>) = (*)

instance Semiring Double where
  zero = 0
  one = 1
  (<+>) = (+)
  (<.>) = (*)
```

However, there are far more types which can form a valid `Semiring`

instance than can form a valid `Num`

instance: the `negate`

method, for example, excludes types representing the natural numbers:

```
newtype ChurchNat = ChurchNat
  { runNat :: forall a. (a -> a) -> a -> a }

data Nat = Zero | Succ Nat
```

These form perfectly sensible semirings, though:

```
instance Semiring ChurchNat where
  zero = ChurchNat (const id)
  one = ChurchNat ($)
  ChurchNat n <+> ChurchNat m = ChurchNat (\f -> n f . m f)
  ChurchNat n <.> ChurchNat m = ChurchNat (n . m)

instance Semiring Nat where
  zero = Zero
  one = Succ Zero
  Zero <+> x = x
  Succ x <+> y = Succ (x <+> y)
  Zero <.> _ = Zero
  Succ Zero <.> x = x
  Succ x <.> y = y <+> (x <.> y)
```
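A quick check that the `Nat` instance computes what it should; `fromPeano` and `toPeano` are ad-hoc helpers for the test, not part of the post's API:

```haskell
class Semiring a where
  zero :: a
  one :: a
  infixl 7 <.>
  (<.>) :: a -> a -> a
  infixl 6 <+>
  (<+>) :: a -> a -> a

data Nat = Zero | Succ Nat

instance Semiring Nat where
  zero = Zero
  one = Succ Zero
  Zero <+> x = x
  Succ x <+> y = Succ (x <+> y)
  Zero <.> _ = Zero
  Succ x <.> y = y <+> (x <.> y)

-- Ad-hoc conversions for testing.
fromPeano :: Nat -> Integer
fromPeano Zero = 0
fromPeano (Succ n) = 1 + fromPeano n

toPeano :: Integer -> Nat
toPeano 0 = Zero
toPeano n = Succ (toPeano (n - 1))
```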

The other missing method is `fromInteger`

, which means decidedly non-numeric types are allowed:

```
instance Semiring Bool where
  zero = False
  one = True
  (<+>) = (||)
  (<.>) = (&&)
```

We can provide a more general definition of the `Sum`

and `Product`

newtypes from Data.Monoid:

```
newtype Add a = Add
  { getAdd :: a
  } deriving (Eq, Ord, Read, Show, Semiring)

newtype Mul a = Mul
  { getMul :: a
  } deriving (Eq, Ord, Read, Show, Semiring)

instance Functor Add where
  fmap f (Add x) = Add (f x)

instance Applicative Add where
  pure = Add
  Add f <*> Add x = Add (f x)
```

I’m using `Add`

and `Mul`

here to avoid name clashing.

```
instance Semiring a => Monoid (Add a) where
  mempty = Add zero
  Add x `mappend` Add y = Add (x <+> y)

instance Semiring a => Monoid (Mul a) where
  mempty = Mul one
  Mul x `mappend` Mul y = Mul (x <.> y)

add :: (Semiring a, Foldable f) => f a -> a
add = getAdd . foldMap Add

mul :: (Semiring a, Foldable f) => f a -> a
mul = getMul . foldMap Mul
```

`add`

and `mul`

are equivalent to `sum`

and `product`

:

`add xs == sum (xs :: [Integer])`

`mul xs == product (xs :: [Integer])`

But they now work with a wider array of types: non-negative numbers, as we’ve seen, but specialised to `Bool`

we get the familiar `Any`

and `All`

newtypes (and their corresponding folds).

`add xs == or (xs :: [Bool])`

`mul xs == and (xs :: [Bool])`

So far, nothing amazing. We avoid a little bit of code duplication, that’s all.

In older versions of Python, there was no native set type. In its place, dictionaries were used, where the values would be booleans. In a similar fashion, before the Counter type was added in 2.7, the traditional way of representing a multiset was using a dictionary where the values were integers.

Using semirings, both of these data structures can have the same type:

```
newtype GeneralMap a b = GeneralMap
  { getMap :: Map a b
  } deriving (Functor, Foldable, Show, Eq, Ord)
```

If operations are defined in terms of the `Semiring`

class, the same code will work on a set *and* a multiset:

```
insert :: (Ord a, Semiring b) => a -> GeneralMap a b -> GeneralMap a b
insert x = GeneralMap . Map.insertWith (<+>) x one . getMap
delete :: Ord a => a -> GeneralMap a b -> GeneralMap a b
delete x = GeneralMap . Map.delete x . getMap
```
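Here's the trick in a runnable form: the *same* `insert` produces set-like behaviour at one value type and multiset-like behaviour at another. The `Semiring`/`Add` definitions repeat the ones above so the sketch stands alone, and `emptyGM` is an ad-hoc helper:

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

import           Data.Map (Map)
import qualified Data.Map as Map

class Semiring a where
  zero :: a
  one :: a
  (<+>) :: a -> a -> a
  (<.>) :: a -> a -> a

instance Semiring Bool where
  zero = False
  one = True
  (<+>) = (||)
  (<.>) = (&&)

instance Semiring Integer where
  zero = 0
  one = 1
  (<+>) = (+)
  (<.>) = (*)

newtype Add a = Add { getAdd :: a }
  deriving (Eq, Show, Semiring)

newtype GeneralMap a b = GeneralMap { getMap :: Map a b }
  deriving (Eq, Show)

-- Inserting combines with <+>: Bool values saturate (set membership),
-- Integer values count occurrences (multiset).
insert :: (Ord a, Semiring b) => a -> GeneralMap a b -> GeneralMap a b
insert x = GeneralMap . Map.insertWith (<+>) x one . getMap

type Set a = GeneralMap a (Add Bool)
type MultiSet a = GeneralMap a (Add Integer)

emptyGM :: GeneralMap a b
emptyGM = GeneralMap Map.empty
```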

How to get back the dictionary-like behaviour, then? Well, operations like `lookup`

and `assoc`

are better suited to a `Monoid`

constraint, rather than `Semiring`

:

```
lookup :: (Ord a, Monoid b) => a -> GeneralMap a b -> b
lookup x = fold . Map.lookup x . getMap

assoc :: (Ord a, Applicative f, Monoid (f b))
      => a -> b -> GeneralMap a (f b) -> GeneralMap a (f b)
assoc k v = GeneralMap . Map.insertWith mappend k (pure v) . getMap
```

`lookup` is a function which should work on sets and multisets; however, `Bool` and `Integer` don’t have `Monoid` instances. To fix this, we can use the `Add` newtype from earlier. The interface for each of these data structures can now be expressed like this:

```
type Set a = GeneralMap a (Add Bool)
type MultiSet a = GeneralMap a (Add Integer)
type Map a b = GeneralMap a (First b)
type MultiMap a b = GeneralMap a [b]
```

And each of the functions on the `GeneralMap` specialises like this:

```
-- Set
insert :: Ord a => a -> Set a -> Set a
lookup :: Ord a => a -> Set a -> Add Bool
delete :: Ord a => a -> Set a -> Set a
-- MultiSet
insert :: Ord a => a -> MultiSet a -> MultiSet a
lookup :: Ord a => a -> MultiSet a -> Add Integer
delete :: Ord a => a -> MultiSet a -> MultiSet a
-- Map
assoc :: Ord a => a -> b -> Map a b -> Map a b
lookup :: Ord a => a -> Map a b -> First b
delete :: Ord a => a -> Map a b -> Map a b
-- MultiMap
assoc :: Ord a => a -> b -> MultiMap a b -> MultiMap a b
lookup :: Ord a => a -> MultiMap a b -> [b]
delete :: Ord a => a -> MultiMap a b -> MultiMap a b
```
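To see those specialisations in action, here is a self-contained sketch: the `Semiring` class and `GeneralMap` are re-declared, and `lookup` is given a `Semiring` constraint (returning `zero` for missing keys) instead of the post’s `Monoid`-plus-`Add` arrangement, just to keep the example short.

```haskell
import qualified Data.Map.Strict as Map
import           Data.Map.Strict (Map)

class Semiring a where
  zero, one :: a
  (<+>), (<.>) :: a -> a -> a

instance Semiring Bool where
  zero = False; one = True
  (<+>) = (||); (<.>) = (&&)

instance Semiring Integer where
  zero = 0; one = 1
  (<+>) = (+); (<.>) = (*)

newtype GeneralMap a b = GeneralMap { getMap :: Map a b }

empty :: GeneralMap a b
empty = GeneralMap Map.empty

insert :: (Ord a, Semiring b) => a -> GeneralMap a b -> GeneralMap a b
insert x = GeneralMap . Map.insertWith (<+>) x one . getMap

-- Missing keys default to zero: False for sets, 0 for multisets.
lookup' :: (Ord a, Semiring b) => a -> GeneralMap a b -> b
lookup' x = maybe zero id . Map.lookup x . getMap
```

The same `insert` builds a set or a multiset depending only on the value semiring: after `foldr insert empty "abca"`, looking up `'a'` gives `True` at type `GeneralMap Char Bool` and `2` at type `GeneralMap Char Integer`.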

This was actually where I first came across semirings: I was trying to avoid code duplication for a trie implementation. I wanted to get the Boom Hierarchy (1981) (plus maps) from the same underlying implementation.

It works *okay*. On the one hand, it’s nice that you don’t have to wrap the map type itself to get the different behaviour. There’s only one `delete` function, which works on sets, maps, multisets, etc. I don’t need to import the `TrieSet` module qualified to differentiate between the *four* `delete` functions I’ve written.

On the other hand, the `Add` wrapper is a pain: having `lookup` return the wrapped values is ugly, and the `Applicative` constraint is unwieldy (we only use it for `pure`). Both of those problems could be solved by something like the `Newtype` or `Wrapped` class, which provide facilities for wrapping and unwrapping, but that might be overkill.

`Monoid` and `Semiring` can take you pretty far, even to a `Monoid` instance:

```
fromList :: (Ord a, Semiring b, Foldable f) => f a -> GeneralMap a b
fromList = foldr insert (GeneralMap Map.empty)

fromAssocs :: (Ord a, Applicative f, Monoid (f b), Foldable t)
           => t (a, b) -> GeneralMap a (f b)
fromAssocs = foldr (uncurry assoc) (GeneralMap Map.empty)

instance (Ord a, Monoid b) => Monoid (GeneralMap a b) where
  mempty = GeneralMap Map.empty
  mappend (GeneralMap x) (GeneralMap y) =
    GeneralMap (Map.unionWith mappend x y)

singleton :: Semiring b => a -> GeneralMap a b
singleton x = GeneralMap (Map.singleton x one)
```

They seem to fall down around functions like `intersection`:

```
intersection :: (Ord a, Semiring b)
             => GeneralMap a b -> GeneralMap a b -> GeneralMap a b
intersection (GeneralMap x) (GeneralMap y) =
  GeneralMap (Map.intersectionWith (<.>) x y)
```

It works for sets, but it doesn’t make sense for multisets, and it doesn’t work for maps.

I couldn’t find a semiring for the map-like types which would give me a sensible intersection. I’m probably after a different algebraic structure.

While looking for a semiring to represent a valid intersection, I came across the probability semiring. It’s just the normal semiring over the rationals, with a lower bound of 0, and an upper of 1.

It’s useful in some cool ways: you can combine it with a list to get the probability monad (Erwig and Kollmansberger 2006). There’s an example in PureScript’s Distributions package.

`newtype Prob s a = Prob { runProb :: [(a,s)] }`

There are some drawbacks to this representation, performance-wise. In particular, there’s a combinatorial explosion on every monadic bind. One of the strategies to reduce this explosion is to use a map:

`newtype Prob s a = Prob { runProb :: Map a s }`

Because this doesn’t allow duplicate keys, it will flatten the association list on every bind. Unfortunately, the performance gain doesn’t always materialize, and in some cases there’s a performance *loss* (Larsen 2011). Also, the `Ord` constraint on the keys prevents it from conforming to `Monad` (at least, not without difficulty).
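To make the combinatorial explosion concrete, here is a sketch of the instances for the association-list representation (using a `Num` constraint on the weights for brevity, where the post would use `Semiring`):

```haskell
newtype Prob s a = Prob { runProb :: [(a, s)] }

instance Functor (Prob s) where
  fmap f (Prob xs) = Prob [ (f x, s) | (x, s) <- xs ]

instance Num s => Applicative (Prob s) where
  pure x = Prob [(x, 1)]
  Prob fs <*> Prob xs = Prob [ (f x, s * t) | (f, s) <- fs, (x, t) <- xs ]

-- Each outcome is paired with every outcome of its continuation,
-- multiplying weights: n outcomes bound to m outcomes yields n * m pairs.
instance Num s => Monad (Prob s) where
  Prob xs >>= f = Prob [ (y, s * t) | (x, s) <- xs, (y, t) <- runProb (f x) ]

-- A fair coin: two outcomes, weight 1/2 each.
coin :: Prob Double Bool
coin = Prob [(False, 0.5), (True, 0.5)]
```

Binding `coin` to itself yields four entries, duplicates and all, and each further bind doubles the list again: exactly the growth the map-based version tries to flatten away.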

Interestingly, this type is exactly the same as the `GeneralMap` from before. This is a theme I kept running into, actually: the `GeneralMap` type represents not just maps, multimaps, sets, and multisets, but also a whole host of other data structures.

Edward Kmett had an interesting blog post about “Free Modules and Functional Linear Functionals” (2011b). In it, he talked about this type:

```
infixr 0 $*
newtype Linear r a = Linear { ($*) :: (a -> r) -> r }
```

Also known as `Cont`, the continuation monad. It can encode the probability monad:

```
fromProbs :: (Semiring s, Applicative m) => [(a,s)] -> ContT s m a
fromProbs xs = ContT $ \k ->
  foldr (\(x,s) a -> liftA2 (<+>) (fmap (s<.>) (k x)) a) (pure zero) xs

probOfT :: (Semiring r, Applicative m) => (a -> Bool) -> ContT r m a -> m r
probOfT e c = runContT c (\x -> if e x then pure one else pure zero)

probOf :: Semiring r => (a -> Bool) -> Cont r a -> r
probOf e = runIdentity . probOfT e

uniform :: Applicative m => [a] -> ContT Double m a
uniform xs =
  let s = 1.0 / fromIntegral (length xs)
  in fromProbs (map (flip (,) s) xs)
```

Multiplication isn’t paid for on every bind, making this (potentially) a more efficient implementation than both the map and the association list.

You can actually make the whole thing a semiring:

```
instance (Semiring r, Applicative m) => Semiring (ContT r m a) where
  one = ContT (const (pure one))
  zero = ContT (const (pure zero))
  f <+> g = ContT (\k -> liftA2 (<+>) (runContT f k) (runContT g k))
  f <.> g = ContT (\k -> liftA2 (<.>) (runContT f k) (runContT g k))
```

Which gives you a lovely `Alternative` instance:

```
instance (Semiring r, Applicative m) => Alternative (ContT r m) where
  (<|>) = (<+>)
  empty = zero
```

This sheds some light on what was going on with the unsatisfactory `intersection` function on `GeneralMap`: it’s actually *multiplication*. If you wanted to stretch the analogy and make `GeneralMap` conform to `Semiring`, you could use the empty map for `zero` and `mappend` for `<+>`, but you’d run into trouble with `one`: it’s the map where every possible key has a value of one. In other words, you’d have to enumerate every possible value of the key type. Interestingly, `Cont` has kind of the inverse problem: while it has an easy `Semiring` instance, in order to *inspect* the values you have to enumerate all the possible keys.

I now have a name for the probability monad / general map / Cont thing: a *covector*.

I think that the transformer version of Cont has a valid interpretation, also. If I ever understand Hirschowitz and Maggesi (2010) I’ll put it into a later follow-up post.

As a short digression, you can beef up the `<|>` operator a little, with something like the conditional choice operator:

```
data BiWeighted s = s :|: s
infixl 8 :|:

(|>) :: (Applicative m, Semiring s)
     => BiWeighted s
     -> ContT s m a
     -> ContT s m a
     -> ContT s m a
((lp :|: rp) |> r) l =
  (mapContT.fmap.(<.>)) lp l <|> (mapContT.fmap.(<.>)) rp r

(<|) :: ContT s m a
     -> (ContT s m a -> ContT s m a)
     -> ContT s m a
l <| r = r l

infixr 0 <|
infixr 0 |>
```

```
probOf ('a'==) (uniform "a" <| 0.4 :|: 0.6 |> uniform "b")
0.4
```

If you fiddle around with the probability monad, you can break it apart in interesting ways. For instance, extracting the `WriterT` monad transformer gives you:

`WriterT (Product Double) []`

Eric Kidd describes this as `PerhapsT` (a `Maybe` with an attached probability) in his excellent blog post (and his 2007 paper).

Straight away, we can optimise this representation by transforming the leaky `WriterT` into a state monad:

```
newtype WeightedT s m a = WeightedT
  { getWeightedT :: s -> m (a, s)
  } deriving Functor

instance Monad m => Applicative (WeightedT s m) where
  pure x = WeightedT $ \s -> pure (x,s)
  WeightedT fs <*> WeightedT xs = WeightedT $ \s -> do
    (f, p) <- fs s
    (x, t) <- xs p
    pure (f x, t)

instance Monad m => Monad (WeightedT s m) where
  WeightedT xs >>= f = WeightedT $ \s -> do
    (x, p) <- xs s
    getWeightedT (f x) p
```

I’m not sure yet, but I think this might have something to do with the isomorphism between `Cont ((->) s)` and `State s` (Kmett 2011a).

You can even make it look like a normal (non-transformer) writer with some pattern synonyms:

```
type Weighted s = WeightedT s Identity

pattern Weighted w <- (runIdentity . flip getWeightedT zero -> w) where
  Weighted (x,w) = WeightedT (\s -> Identity (x, s <.> w))
```

And you can pretend that you’ve just got a normal tuple:

```
half :: a -> Weighted Double a
half x = Weighted (x, 0.5)
runWeighted :: Semiring s => Weighted s a -> (a, s)
runWeighted (Weighted w) = w
evalWeighted :: Semiring s => Weighted s a -> a
evalWeighted (Weighted (x,_)) = x
execWeighted :: Semiring s => Weighted s a -> s
execWeighted (Weighted (_,s)) = s
```

Looking back at Cont, it is reminiscent of a particular encoding of the free monoid from Doel (2015):

```
newtype FreeMonoid a = FreeMonoid
  { runFreeMonoid :: forall m. Monoid m => (a -> m) -> m }
```

So possibly covectors represent the free semiring, in some way.

Another encoding which looks free-ish is one of the efficient implementations of the probability monad from Larsen (2011):

```
data Dist a where
  Certainly :: a -> Dist a -- only possible value
  Choice :: Probability -> Dist a -> Dist a -> Dist a
  Fmap :: (a -> b) -> Dist a -> Dist b
  Join :: Dist (Dist a) -> Dist a
```

This looks an awful lot like a weighted free alternative. Is it a free semiring, then?

Maybe. There’s a parallel between the relationship between monoids and semirings and that between `Applicative`s and `Alternative`s (Rivas, Jaskelioff, and Schrijvers 2015). In a way, where monads are monoids in the category of endofunctors, alternatives are *semirings* in the category of endofunctors.

This parallel probably isn’t what I first thought it was, though. First of all, the above paper uses near-semirings, not semirings: a near-semiring is a semiring where the requirements of left distribution of multiplication over addition and of commutative addition are dropped. Secondly, the class which most mirrors near-semirings is `MonadPlus`, not `Alternative` (`Alternative` doesn’t have annihilation). Thirdly, right distribution of multiplication over addition *isn’t* required by `MonadPlus`: it’s a further law on top of the existing ones. Fourthly, most types in the Haskell ecosystem today which conform to `MonadPlus` *don’t* obey this extra law; in fact, those that do seem to be lists of some kind or another.

A further class is probably needed on top of the two already there, with the extra laws (called `Nondet` in Fischer 2009).

An actual free near-semiring looks like this:

```
data Free f x = Free { unFree :: [FFree f x] }
data FFree f x = Pure x | Con (f (Free f x))
```

Specialised to the `Identity` monad, that becomes:

```
data Forest x = Forest { unForest :: [Tree x] }
data Tree x = Leaf x | Branch (Forest x)
```

De-specialised to the free monad transformer, it becomes:

```
newtype FreeT f m a = FreeT
  { runFreeT :: m (FreeF f a (FreeT f m a)) }

data FreeF f a b
  = Pure a
  | Free (f b)

type FreeNearSemiring f = FreeT f []
```

These definitions all lend themselves to combinatorial search (Spivey 2009; Fischer 2009; Piponi 2009), with one extra operation needed: `wrap`.

Does the odds monad fit into any of this?

While `WriterT (Product Rational) []` is a valid definition of the traditional probability monad, it’s *not* the same as the odds monad. If you take the odds monad and parameterize it over the weight of the tail, you get this:

`data Odds m a = Certain a | Choice (m (a, Odds m a))`

Which looks remarkably like `ListT` done right:

```
newtype ListT m a = ListT { next :: m (Step m a) }
data Step m a = Cons a (ListT m a) | Nil
```

That suggests a relationship between probability and odds:

```
WriterT (Product Rational) [] = Probability
ListT (Weighted Rational) = Odds
```

`ListT` isn’t a perfect match, though: it allows empty lists. To correct this, you could use the cofree comonad:

`data Cofree f a = a :< (f (Cofree f a))`

Subbing in `Maybe` for `f`, you get a non-empty list. A *weighted* `Maybe` is basically `PerhapsT`, as was mentioned earlier.

Types in Haskell also form a semiring (reading each definition below at the type level):

```
(<.>) = (,)
one = ()
(<+>) = Either
zero = Void
```

There’s a subset of semirings called *star semirings*. They have an operation $^*$ such that:

$a^* = 1 + aa^* = 1 + a^*a$

Or, as a class:

```
class Semiring a => StarSemiring a where
  star :: a -> a
  star x = one <+> plus x
  plus :: a -> a
  plus x = x <.> star x
```
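The smallest concrete instance is `Bool`: since `one = True` and `<+>` is disjunction, `star x = one <+> plus x` is constantly `True`. A sketch, with both classes re-declared so it stands alone:

```haskell
class Semiring a where
  zero, one :: a
  (<+>), (<.>) :: a -> a -> a

instance Semiring Bool where
  zero = False; one = True
  (<+>) = (||); (<.>) = (&&)

class Semiring a => StarSemiring a where
  star :: a -> a

-- star x = 1 + x . star x; the leading 1 (True) dominates the disjunction.
instance StarSemiring Bool where
  star _ = True
```

It is easy to check the law `star x == one <+> (x <.> star x)` for both inputs.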

Using this on types, you get:

`star a = Either () (a, star a)`

Which is just a standard list! Some pseudo-Haskell on alternatives will give you:

```
star :: (Alternative f, Monoid a) => f a -> f a
star x = (x <.> star x) <+> pure mempty where
  (<.>) = liftA2 mappend
  (<+>) = (<|>)
```

Also known as `many` (although note that this breaks all the laws).

The $^*$ operation for the rationals is defined as (Droste and Kuich 2009, p. 8):

$a^* = \begin{cases} \frac{1}{1 - a} & \quad \text{if } & 0 \leq a \lt 1, \\ \infty & \quad \text{if } & a \geq 1. \end{cases}$

So, combining the probability semiring with the type-level business, the star of `Writer s a` is:

`Either (1, a) (a, s / (1 - s), star (Writer s a))`

Or, to put it another way: the odds monad!

An endomorphism is a morphism from an object to itself. A less general definition (and the one most often used in Haskell) is a function of type `a -> a`:

`newtype Endo a = Endo { appEndo :: a -> a }`

It forms a monoid under composition:

```
instance Monoid (Endo a) where
  mempty = Endo id
  mappend (Endo f) (Endo g) = Endo (f . g)
```

If the underlying type is itself a commutative monoid, it also forms a near-semiring:

```
instance Monoid a => Semiring (Endo a) where
  Endo f <+> Endo g = Endo (\x -> f x <> g x)
  zero = Endo (const mempty)
  one = Endo id
  Endo f <.> Endo g = Endo (f . g)

instance (Monoid a, Eq a) => StarSemiring (Endo a) where
  star (Endo f) = Endo converge where
    converge x = x <> (if y == mempty then y else converge y) where
      y = f x
```

Here’s something interesting: there’s a similarity here to the semiring of Church numerals. In fact, as far as I can tell, the functions are *exactly* the same when applied to endomorphisms of endomorphisms, to the extent that you could define Church numerals with something as simple as this:

`type ChurchEndoNat = forall a. Endo (Endo a)`

And it works!

```
two, three :: ChurchEndoNat
two = one <+> one
three = one <+> two
unChurch :: Num a => ChurchEndoNat -> a
unChurch f = appEndo (appEndo f (Endo (1+))) 0
```

```
unChurch (two <.> three)
6
```

One of the most important applications (and a source of much of the notation) is regular expressions. In fact, the free star semiring looks like a Haskell datatype for regular expressions:

```
data FreeStar a
  = Gen a
  | Zer
  | One
  | FreeStar a :<+> FreeStar a
  | FreeStar a :<.> FreeStar a
  | Star (FreeStar a)

instance Semiring (FreeStar a) where
  (<+>) = (:<+>)
  (<.>) = (:<.>)
  zero = Zer
  one = One

instance StarSemiring (FreeStar a) where
  star = Star

interpret :: StarSemiring s => (a -> s) -> FreeStar a -> s
interpret f = \case
  Gen x -> f x
  Zer -> zero
  One -> one
  l :<+> r -> interpret f l <+> interpret f r
  l :<.> r -> interpret f l <.> interpret f r
  Star x -> star (interpret f x)
```

Then, interpreting the regex is as simple as writing an interpreter (with some help from `Endo`):

```
asRegex :: Eq a => FreeStar (a -> Bool) -> [a] -> Bool
asRegex fs = any null . appEndo (interpret f fs) . pure where
  f p = Endo . mapMaybe $ \case
    (x:xs) | p x -> Just xs
    _ -> Nothing

char' :: Eq a => a -> FreeStar (a -> Bool)
char' c = Gen (c==)
```

Actually, you don’t need the free version at all!

```
runRegex :: Eq a => Endo [[a]] -> [a] -> Bool
runRegex fs = any null . appEndo fs . pure

char :: Eq a => a -> Endo [[a]]
char c = Endo . mapMaybe $ \case
  (x:xs) | c == x -> Just xs
  _ -> Nothing
```

With some `-XOverloadedStrings` magic, you get a pretty nice interface:

```
instance IsString (Endo [String]) where
  fromString = mul . map char . reverse

(<^>) :: Semiring s => s -> s -> s
(<^>) = flip (<.>)

greet :: Endo [String]
greet = "H" <^> ("a" <+> "e") <^> "llo"
```

```
runRegex greet "Hello"
True
```

```
runRegex greet "Hallo"
True
```

```
runRegex greet "Halo"
False
```

Of course, that’s about as slow as it gets when it comes to regexes. A faster representation is a nondeterministic finite automaton. One such implementation in Haskell is Gabriel Gonzalez’s.

The regex type in that example can immediately be made to conform to `Semiring` and `StarSemiring`. However, it might be more interesting to translate the *implementation* into using semirings. The type of a regex looks like this:

```
type State = Int

data Regex = Regex
  { _startingStates :: Set State
  , _transitionFunction :: Char -> State -> Set State
  , _acceptingStates :: Set State }
```

The set data structure jumps out as an opportunity to sub in arbitrary semirings. Swapping in the `GeneralMap` is reasonably easy:

```
type State = Int

data Regex i s = Regex
  { _numberOfStates :: Int
  , _startingStates :: GeneralMap State s
  , _transitionFunction :: i -> State -> GeneralMap State s
  , _acceptingStates :: GeneralMap State s }

isEnd :: Semiring s => Regex i s -> s
isEnd (Regex _ as _ bs) = add (intersection as bs)

match :: Regex Char (Add Bool) -> String -> Bool
match r = getAdd . isEnd . foldl' run r where
  run (Regex n (GeneralMap as) f bs) i = Regex n as' f bs
    where as' = mconcat [ fmap (v<.>) (f i k) | (k,v) <- Map.assocs as ]

satisfy :: Semiring s => (i -> s) -> Regex i (Add s)
satisfy predicate = Regex 2 as f bs
  where
    as = singleton 0
    bs = singleton 1
    f i 0 = assoc 1 (predicate i) mempty
    f _ _ = mempty

once :: Eq i => i -> Regex i (Add Bool)
once x = satisfy (== x)

shift :: Int -> GeneralMap State s -> GeneralMap State s
shift n = GeneralMap . Map.fromAscList . (map.first) (+ n) . Map.toAscList . getMap

instance (Semiring s, Monoid s) => Semiring (Regex i s) where
  one = Regex 1 (singleton 0) (\_ _ -> mempty) (singleton 0)
  zero = Regex 0 mempty (\_ _ -> mempty) mempty
  Regex nL asL fL bsL <+> Regex nR asR fR bsR = Regex n as f bs
    where
      n = nL + nR
      as = mappend asL (shift nL asR)
      bs = mappend bsL (shift nL bsR)
      f i s | s < nL = fL i s
            | otherwise = shift nL (fR i (s - nL))
  Regex nL asL fL bsL <.> Regex nR asR fR bsR = Regex n as f bs where
    n = nL + nR
    as = let ss = add (intersection asL bsL)
         in mappend asL (fmap (ss<.>) (shift nL asR))
    f i s =
      if s < nL
        then let ss = add (intersection r bsL)
             in mappend r (fmap (ss<.>) (shift nL asR))
        else shift nL (fR i (s - nL))
      where
        r = fL i s
    bs = shift nL bsR

instance (StarSemiring s, Monoid s) => StarSemiring (Regex i s) where
  star (Regex n as f bs) = Regex n as f' as
    where
      f' i s =
        let r = f i s
            ss = add (intersection r bs)
        in mappend r (fmap (ss<.>) as)
  plus (Regex n as f bs) = Regex n as f' bs
    where
      f' i s =
        let r = f i s
            ss = add (intersection r bs)
        in mappend r (fmap (ss<.>) as)

instance IsString (Regex Char (Add Bool)) where
  fromString = mul . map once
```

This begins to show some of the real power of using semirings and covectors. We have a normal regular expression implementation when we use the covector over bools. Use the probability semiring, and you’ve got probabilistic parsing.

Swap in the tropical semiring: a semiring over the reals where addition is the max function, and multiplication is addition of reals. Now you’ve got a depth-first parser.
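Here is a sketch of that tropical instance, with the `Semiring` class re-declared so it stands alone; negative infinity plays `zero` (the identity for `max`, and the annihilator for `<.>`):

```haskell
class Semiring a where
  zero, one :: a
  (<+>), (<.>) :: a -> a -> a

newtype Tropical = Tropical Double deriving (Eq, Show)

instance Semiring Tropical where
  zero = Tropical (-1 / 0)  -- negative infinity: identity for max
  one  = Tropical 0         -- identity for (+)
  Tropical x <+> Tropical y = Tropical (max x y)
  Tropical x <.> Tropical y = Tropical (x + y)
```

Addition picks the better weight, and multiplication accumulates weights along a path: `Tropical 2 <.> Tropical 3` is `Tropical 5`, while `Tropical 2 <+> Tropical 3` is `Tropical 3`.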

That’s how you might swap in different interpretations. How about swapping in different *implementations*? Well, there might be some use to swapping in the CYK algorithm, or the Gauss-Jordan-Floyd-Warshall-McNaughton-Yamada algorithm (O’Connor 2011).

Alternatively, you can swap in the underlying data structure. Instead of a map, if you use an integer (each bit being a value, the keys being the bit position), you have a super-fast implementation (and the final implementation used in the original example). Finally, you could use a different representation of the state transfer function: a matrix.

A square matrix can be understood as a map from pairs of indices to values. This lets us use it to represent the state transfer function. Take, for instance, a regular expression with three possible states. Its state transfer function might look like this:

$transfer = \begin{cases} 1 \mapsto \{ 2, 3 \} \\ 2 \mapsto \{ 1 \} \\ 3 \mapsto \emptyset \end{cases}$

It has the type of:

`State -> Set State`

Where `State` is an integer. You can represent the set as a vector, where each position is a key, and each value is whether or not that key is present:

$transfer = \begin{cases} 1 \mapsto \begin{pmatrix} 0 & 1 & 1 \end{pmatrix} \\ 2 \mapsto \begin{pmatrix} 1 & 0 & 0 \end{pmatrix} \\ 3 \mapsto \begin{pmatrix} 0 & 0 & 0 \end{pmatrix} \end{cases}$

Then, the matrix representation is obvious:

$transfer = \left( \begin{array}{ccc} 0 & 1 & 1 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{array} \right)$

This is the semiring of square matrices. It is, of course, yet *another* covector. The “keys” are the transfers (`1 -> 2`, `2 -> 3`, and so on), represented by the indices of the matrix; the “values” are whether or not that transfer is permitted.

The algorithms for the usual semiring operations on matrices like this are well-known and well-optimized. I haven’t yet benchmarked them in Haskell using the matrix libraries, so I don’t know how they compare to the other approaches. In the meantime, there’s an elegant list-based implementation in Dolan (2013):

```
data Matrix a
  = Scalar a
  | Matrix [[a]]

mjoin :: (Matrix a, Matrix a, Matrix a, Matrix a) -> Matrix a
mjoin (Matrix ws, Matrix xs, Matrix ys, Matrix zs) =
  Matrix (zipWith (++) ws xs ++ zipWith (++) ys zs)

msplit :: Matrix a -> (Matrix a, Matrix a, Matrix a, Matrix a)
msplit (Matrix (row:rows)) =
  ( Matrix [[first]], Matrix [top]
  , Matrix left, Matrix rest )
  where
    (first:top) = row
    (left,rest) = unzip (map (\(x:xs) -> ([x],xs)) rows)

instance Semiring a => Semiring (Matrix a) where
  zero = Scalar zero
  one = Scalar one
  Scalar x <+> Scalar y = Scalar (x <+> y)
  Matrix x <+> Matrix y = Matrix (zipWith (zipWith (<+>)) x y)
  Scalar x <+> m = m <+> Scalar x
  Matrix [[x]] <+> Scalar y = Matrix [[x <+> y]]
  x <+> y = mjoin (first <+> y, top, left, rest <+> y)
    where (first, top, left, rest) = msplit x
  Scalar x <.> Scalar y = Scalar (x <.> y)
  Scalar x <.> Matrix y = Matrix ((map.map) (x<.>) y)
  Matrix x <.> Scalar y = Matrix ((map.map) (<.>y) x)
  Matrix x <.> Matrix y =
    Matrix [ [ foldl1 (<+>) (zipWith (<.>) row col) | col <- cols ]
           | row <- x ] where cols = transpose y

instance StarSemiring a => StarSemiring (Matrix a) where
  star (Matrix [[x]]) = Matrix [[star x]]
  star m = mjoin ( first' <+> top' <.> rest' <.> left'
                 , top' <.> rest', rest' <.> left', rest' )
    where
      (first, top, left, rest) = msplit m
      first' = star first
      top' = first' <.> top
      left' = left <.> first'
      rest' = star (rest <+> left' <.> top)
```

A lot of the use from semirings comes from “attaching” them to other values. Attaching a semiring to effects (in the form of an applicative) can give you *repetition* of those effects. The excellent ReplicateEffects library explores this concept in depth.

It’s based on this type:

```
data Replicate a b
  = Nil
  | Cons (Maybe b) (Replicate a (a -> b))
```

This type can be made to conform to `Semiring` (and `StarSemiring`, etc.) trivially.

In the simplest case, it has the same behaviour as `replicateM`. Even the more complex combinators, like `atLeast`, can be built on `Alternative`:

```
atLeast :: Alternative f => Int -> f a -> f [a]
atLeast m f = go (max 0 m) where
  go 0 = many f
  go n = liftA2 (:) f (go (n-1))

atMost :: Alternative f => Int -> f a -> f [a]
atMost m f = go (max 0 m) where
  go 0 = pure []
  go n = liftA2 (:) f (go (n-1)) <|> pure []
```

There are two main benefits over using the standard alternative implementation. First, you can choose greedy or lazy evaluation of the effects *after* the replication is built.

Secondly, the *order* of the effects doesn’t have to be specified. This allows you to execute permutations of the effects, in a permutation parser, for instance. The permutation is totally decoupled from the declaration of the repetition (it’s in a totally separate library, in fact: PermuteEffects). Its construction is reminiscent of the free alternative.

Having the replicate type conform to `Semiring` is all well and good; what I’m interested in is seeing whether its implementation is another semiring-based object in disguise. I’ll revisit this in a later post.

List comprehension notation is one of my all-time favourite bits of syntactic sugar. It seems almost *too* declarative to have a reasonable implementation strategy. The vast majority of the time, it actually works in a sensible way. There are exceptions, though. Take a reasonable definition of a list of Pythagorean triples:

`[ (x,y,z) | x <- [1..], y <- [1..], z <- [1..], x*x + y*y == z*z ]`

This expression will diverge without yielding a single triple. It will search through every possible value for `z` before incrementing either `x` or `y`. Since there are infinitely many values for `z`, it will never find a triple. In other words, vanilla list comprehensions in Haskell perform depth-first search.

In order to express other kinds of search (either breadth-first or depth-bounded), different monads are needed. These monads are explored in Fischer (2009) and Spivey (2009).

You can actually use the *exact* same notation as above with arbitrary alternative monads using `-XMonadComprehensions` and `-XOverloadedLists`.

```
trips :: ( Alternative m
         , Monad m
         , IsList (m Integer)
         , Enum (Item (m Integer))
         , Num (Item (m Integer)))
      => m (Integer,Integer,Integer)
trips = [ (x,y,z) | x <- [1..], y <- [1..], z <- [1..], x*x + y*y == z*z ]
```

So then, here’s the challenge: swap in different `m`s via a type annotation, and prevent `trips` from diverging before yielding any triples.

As one example, here’s some code adapted from Fischer (2009):

```
instance (Monoid r, Applicative m) => Monoid (ContT r m a) where
  mempty = ContT (const (pure mempty))
  mappend (ContT f) (ContT g) = ContT (\x -> liftA2 mappend (f x) (g x))

newtype List a = List
  { runList :: forall m. Monoid m => Cont m a } deriving Functor

instance Foldable List where foldMap = flip (runCont.runList)

instance Show a => Show (List a) where show = show . foldr (:) []

instance Monoid (List a) where
  mappend (List x) (List y) = List (mappend x y)
  mempty = List mempty

instance Monoid a => Semiring (List a) where
  zero = mempty
  (<+>) = mappend
  (<.>) = liftA2 mappend
  one = pure mempty

bfs :: List a -> [a]
bfs = toList . fold . levels . anyOf

newtype Levels a = Levels { levels :: [List a] } deriving Functor

instance Applicative Levels where
  pure x = Levels [pure x]
  Levels fs <*> Levels xs = Levels [ f <*> x | f <- fs, x <- xs ]

instance Alternative Levels where
  empty = Levels []
  Levels x <|> Levels y = Levels (mempty : merge x y)

instance IsList (List a) where
  type Item (List a) = a
  fromList = anyOf
  toList = foldr (:) []

instance Applicative List where
  pure x = List (pure x)
  (<*>) = ap

instance Alternative List where
  empty = mempty
  (<|>) = mappend

instance Monad List where
  x >>= f = foldMap f x

anyOf :: (Alternative m, Foldable f) => f a -> m a
anyOf = getAlt . foldMap (Alt . pure)

merge :: [List a] -> [List a] -> [List a]
merge [] ys = ys
merge xs [] = xs
merge (x:xs) (y:ys) = mappend x y : merge xs ys
```

```
take 3 (bfs trips)
[(3,4,5),(4,3,5),(6,8,10)]
```

The only relevance to semirings here is the `merge` function. The semiring over lists is the semiring over polynomials:

```
instance Semiring a => Semiring [a] where
  one = [one]
  zero = []
  [] <+> ys = ys
  xs <+> [] = xs
  (x:xs) <+> (y:ys) = (x <+> y) : (xs <+> ys)
  [] <.> _ = []
  _ <.> [] = []
  (x:xs) <.> (y:ys) =
    (x <.> y) : (map (x <.>) ys <+> map (<.> y) xs <+> (zero : (xs <.> ys)))
```
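Read as coefficient lists (constant term first), `<.>` is polynomial multiplication, with the product of the tails shifted up one power of x. A quick self-contained check, with the `Semiring` class re-declared and a `Num`-based instance for `Integer` assumed:

```haskell
class Semiring a where
  zero, one :: a
  (<+>), (<.>) :: a -> a -> a

instance Semiring Integer where
  zero = 0; one = 1
  (<+>) = (+); (<.>) = (*)

instance Semiring a => Semiring [a] where
  one = [one]
  zero = []
  [] <+> ys = ys
  xs <+> [] = xs
  (x:xs) <+> (y:ys) = (x <+> y) : (xs <+> ys)
  [] <.> _ = []
  _ <.> [] = []
  (x:xs) <.> (y:ys) =
    -- the (zero :) prefix places xs <.> ys at the x^2 position
    (x <.> y) : (map (x <.>) ys <+> map (<.> y) xs <+> (zero : (xs <.> ys)))

-- (1 + 2x)(3 + 4x) = 3 + 10x + 8x^2
poly :: [Integer]
poly = [1,2] <.> [3,4]
```

As expected, `poly` is `[3,10,8]`, and `[1,1] <.> [1,1]` gives the binomial coefficients `[1,2,1]`.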

The `<+>` here is the same as the `merge` function. I think the `<.>` might also be a more valid definition of the `<*>` function:

```
instance Applicative Levels where
  pure x = Levels [pure x]
  Levels [] <*> _ = Levels []
  _ <*> Levels [] = Levels []
  Levels (f:fs) <*> Levels (x:xs) = Levels $
    (f <*> x) : levels (Levels (fmap (f <*>) xs)
                    <|> Levels (fmap (<*> x) fs)
                    <|> (Levels fs <*> Levels xs))
```

I’ve only scratched the surface of this abstraction. There are several other interesting semirings: polynomials, logs, Viterbi, Łukasiewicz, languages, multisets, bidirectional parsers, etc. Hopefully I’ll eventually be able to put this stuff into a library or something. In the meantime, I definitely will write some posts on the application to context-free parsing, bidirectional parsing (I just read Breitner (2016)) and search.

Boom, H. J. 1981. “Further thoughts on abstracto.” *Working Paper ELC-9, IFIP WG 2.1*. http://www.kestrel.edu/home/people/meertens/publications/papers/Abstracto_reader.pdf.

Breitner, Joachim. 2016. “Showcasing applicative. Joachim breitner’s blog.” http://www.joachim-breitner.de/blog/710-Showcasing_Applicative.

Doel, Dan. 2015. “Free monoids in haskell. The comonad.Reader.” http://comonad.com/reader/2015/free-monoids-in-haskell/.

Dolan, Stephen. 2013. “Fun with semirings: A functional pearl on the abuse of linear algebra.” In, 48:101. ACM Press. doi:10.1145/2500365.2500613. https://www.cl.cam.ac.uk/~sd601/papers/semirings.pdf.

Droste, Manfred, and Werner Kuich. 2009. “Semirings and formal power series.” In *Handbook of weighted automata*, ed by. Manfred Droste, Werner Kuich, and Heiko Vogler, 1:3–28. Monographs in theoretical computer science. an EATCS series. Berlin, Heidelberg: Springer Berlin Heidelberg. http://staff.mmcs.sfedu.ru/~ulysses/Edu/Marktoberdorf_2009/working_material/Esparsa/Kuich.%20Semirings%20and%20FPS.pdf.

Erwig, Martin, and Steve Kollmansberger. 2006. “Functional pearls: Probabilistic functional programming in haskell.” *Journal of Functional Programming* 16 (1): 21–34. http://web.engr.oregonstate.edu/~erwig/papers/abstracts.html#JFP06a.

Fischer, Sebastian. 2009. “Reinventing haskell backtracking.” In *Informatik 2009, im fokus das leben (ATPS’09)*. GI Edition. http://www-ps.informatik.uni-kiel.de/~sebf/data/pub/atps09.pdf.

Hirschowitz, André, and Marco Maggesi. 2010. “Modules over monads and initial semantics.” *Information and Computation* 208 (5). Special issue: 14th workshop on logic, language, information and computation (WoLLIC 2007) (May 1): 545–564. doi:10.1016/j.ic.2009.07.003. https://pdfs.semanticscholar.org/3e0c/c79e8cda9246cb954da6fd8aaaa394fecdc3.pdf.

Kidd, Eric. 2007. “Build your own probability monads.” http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.129.9502&rep=rep1&type=pdf.

Kmett, Edward. 2011a. “Free monads for less (part 2 of 3): Yoneda. The comonad.Reader.” http://comonad.com/reader/2011/free-monads-for-less-2/.

———. 2011b. “Modules and functional linear functionals. The comonad.Reader.” http://comonad.com/reader/2011/free-modules-and-functional-linear-functionals/.

Larsen, Ken Friis. 2011. “Memory efficient implementation of probability monads.” http://www.diku.dk/~kflarsen/t/ProbMonad-unpublished.pdf.

O’Connor, Russell. 2011. “A very general method of computing shortest paths. Russell o’Connor’s blog.” http://r6.ca/blog/20110808T035622Z.html.

Piponi, Dan. 2009. “A monad for combinatorial search with heuristics. A neighborhood of infinity.” http://blog.sigfpe.com/2009/07/monad-for-combinatorial-search-with.html.

Rivas, Exequiel, Mauro Jaskelioff, and Tom Schrijvers. 2015. “From monoids to near-semirings: The essence of MonadPlus and alternative.” In *Proceedings of the 17th international symposium on principles and practice of declarative programming*, 196–207. ACM. doi:10.1145/2790449.2790514. http://www.fceia.unr.edu.ar/~mauro/pubs/FromMonoidstoNearsemirings.pdf.

Spivey, J. Michael. 2009. “Algebras for combinatorial search.” *Journal of Functional Programming* 19 (3): 469–487. doi:10.1017/S0956796809007321. http://www.journals.cambridge.org/abstract_S0956796809007321.

Part 3 of a 3-part series on probability

Tags: Haskell

Previously, I tried to figure out how to make the probability monad more “listy”. I read a little more about the topic (especially Erwig and Kollmansberger 2006; and Kidd 2007).

I then thought about what a probability monad would look like if it was based on other data structures. I feel like the standard version really wants to be:

```
newtype ProperProb a = ProperProb
  { yes :: Map a (Product Rational) }
```

But of course a monad instance isn’t allowed: `Map` needs an `Ord` constraint on its keys, which the `Monad` methods can’t carry.

Similar to a map, though, is a binary tree:

```
data BinaryTree a = Leaf
                  | Node (BinaryTree a) a (BinaryTree a)
```

And it feels better for probability - *flatter*, somehow. Transmuting it into a probability-thing:

```
data Odds a = Certain a
            | Choice (Odds a) Rational (Odds a)
            deriving (Eq, Functor, Foldable, Show)
```

That looks good to me. A choice between two different branches feels more natural than a choice between a head and a tail.

The fold is similar to before, with an unfold for good measure:

```
foldOdds :: (b -> Rational -> b -> b) -> (a -> b) -> Odds a -> b
foldOdds f b = r where
  r (Certain x) = b x
  r (Choice xs p ys) = f (r xs) p (r ys)

unfoldOdds :: (b -> Either a (b,Rational,b)) -> b -> Odds a
unfoldOdds f = r where
  r b = case f b of
    Left a -> Certain a
    Right (x,p,y) -> Choice (r x) p (r y)
fi :: Bool -> a -> a -> a
fi True t _ = t
fi False _ f = f
```

I changed the pattern synonym a little:

```
unRatio :: Num a => Rational -> (a,a)
unRatio = numerator &&& denominator
      >>> fromInteger *** fromInteger
pattern n :% d <- (unRatio -> (n,d))
```

Then, the `probOf` function:

```
probOf :: Eq a => a -> Odds a -> Rational
probOf e = foldOdds f b where
  b x = fi (e == x) 1 0
  f x (n:%d) y = (x * n + y * d) / (n + d)
```

This version doesn’t have the option for short-circuiting on the first value it finds.
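As a quick sanity check, an even choice between two outcomes should give each a probability of a half. This self-contained sketch inlines `foldOdds` and expands the `:%` view pattern by hand (using `numerator` and `denominator` directly), so it runs without the pattern-synonym machinery:

```haskell
import Data.Ratio (numerator, denominator, (%))

data Odds a = Certain a
            | Choice (Odds a) Rational (Odds a)

foldOdds :: (b -> Rational -> b -> b) -> (a -> b) -> Odds a -> b
foldOdds f b = r where
  r (Certain x)      = b x
  r (Choice xs p ys) = f (r xs) p (r ys)

-- probOf, with the (n :% d) view pattern expanded by hand
probOf :: Eq a => a -> Odds a -> Rational
probOf e = foldOdds f b where
  b x = if e == x then 1 else 0
  f x p y = (x * n + y * d) / (n + d) where
    n = fromInteger (numerator p)
    d = fromInteger (denominator p)

main :: IO ()
main = print (probOf 'a' (Choice (Certain 'a') (1 % 1) (Certain 'b')))
-- prints 1 % 2
```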

For generating from lists, you can try to divide the list evenly between the two branches.

```
fromListOdds :: (([b], Int) -> Integer) -> (b -> a) -> [b] -> Maybe (Odds a)
fromListOdds fr e = r where
  r [] = Nothing
  r xs = Just (unfoldOdds f (xs, length xs))
  f ([x],_) = Left (e x)
  f (xs ,n) = Right ((ys,l), fr (ys,l) % fr (zs,r), (zs,r)) where
    l = n `div` 2
    r = n - l
    (ys,zs) = splitAt l xs

equalOdds :: [a] -> Maybe (Odds a)
equalOdds = fromListOdds (fromIntegral . snd) id
fromDistrib :: [(a,Integer)] -> Maybe (Odds a)
fromDistrib = fromListOdds (sum . map snd . fst) fst
```
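To see the shape `equalOdds` builds, here’s a self-contained sketch. A derived `Eq`/`Show` is added, and the outer `r` is renamed `go` so it doesn’t shadow the `r` in the inner `where`; otherwise the definitions are as above:

```haskell
import Data.Ratio ((%))

data Odds a = Certain a
            | Choice (Odds a) Rational (Odds a)
            deriving (Eq, Show)

unfoldOdds :: (b -> Either a (b,Rational,b)) -> b -> Odds a
unfoldOdds f = r where
  r b = case f b of
    Left a        -> Certain a
    Right (x,p,y) -> Choice (r x) p (r y)

fromListOdds :: (([b], Int) -> Integer) -> (b -> a) -> [b] -> Maybe (Odds a)
fromListOdds fr e = go where
  go [] = Nothing
  go xs = Just (unfoldOdds f (xs, length xs))
  f ([x],_) = Left (e x)
  f (xs ,n) = Right ((ys,l), fr (ys,l) % fr (zs,r), (zs,r)) where
    l = n `div` 2
    r = n - l
    (ys,zs) = splitAt l xs

equalOdds :: [a] -> Maybe (Odds a)
equalOdds = fromListOdds (fromIntegral . snd) id

main :: IO ()
main = print (equalOdds "abc")
-- the left branch gets one element, the right branch two, at odds 1:2
```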

What’s really nice about this version is the fact that the old `append` is just the `Choice` constructor, leaving the instances to be really nice:

```
flatten :: Odds (Odds a) -> Odds a
flatten = foldOdds Choice id
instance Applicative Odds where
  pure = Certain
  fs <*> xs = flatten (fmap (<$> xs) fs)

instance Monad Odds where
  x >>= f = flatten (f <$> x)
```

Finally, as a bonus, to remove duplicates:

```
lcd :: Foldable f => f Rational -> Integer
lcd = foldl' (\a e -> lcm a (denominator e)) 1
toDistrib :: Odds a -> [(a,Integer)]
toDistrib = factorOut . foldOdds f b where
  b x = [(x,1)]
  f l p r = (map.fmap) (n%t*) l ++ (map.fmap) (d%t*) r where
    n = numerator p
    d = denominator p
    t = n + d
  factorOut xs = (map.fmap) (numerator . (lcd'*)) xs where
    lcd' = fromIntegral . lcd . map snd $ xs

counts :: (Ord a, Num n) => [(a,n)] -> [(a,n)]
counts =
  Map.assocs .
  Map.fromListWith (+)

compress :: Ord a => Odds a -> Odds a
compress xs = let Just ys = (fromDistrib . counts . toDistrib) xs in ys
```

After reading yet more on this, I found that the main issue with the monad is its performance. Two articles in particular: Larsen (2011), and Scibior, Ghahramani, and Gordon (2015), refer to a GADT implementation of the monad which maximises laziness.

Erwig, Martin, and Steve Kollmansberger. 2006. “Functional pearls: Probabilistic functional programming in haskell.” *Journal of Functional Programming* 16 (1): 21–34. http://icerote.net/doc/library/programming/fp/Probabilistic%20functional%20programming%20in%20Haskell.pdf.

Scibior, A., Z. Ghahramani, and A. D. Gordon. 2015. “Practical probabilistic programming with monads.” In *2015 ACM SIGPLAN symposium on haskell*, 50:165–176. ACM. http://mlg.eng.cam.ac.uk/pub/pdf/SciGhaGor15.pdf.

Part 2 of a 3-part series on probability

Tags: Haskell

One of the more unusual monads is the “probability monad”:

```
newtype Probability a = Probability
  { runProb :: [(a,Rational)] }
data Coin = Heads | Tails
toss :: Probability Coin
toss = Probability [(Heads, 1 % 2), (Tails, 1 % 2)]
```

Although it’s a little inefficient, it’s an elegant representation. I’ve written about it before here.

It has some notable deficiencies, though. For instance: the user has to constantly check that all the probabilities add up to one. Its list can be empty, which doesn’t make sense. Also, individual outcomes can appear more than once in the same list.

A first go at fixing the problem might look something like this:

```
newtype Distrib a = Distrib
  { runDist :: [(a,Rational)] }
tossProb :: Distrib Coin
tossProb = Distrib [(Heads, 1), (Tails, 1)]
```

The type is the same as before: it’s the semantics which have changed. The second fields of the tuples no longer have to add up to one. The list can still be empty, though, and now finding the probability of, say, the head, looks like this:

```
probHead :: Distrib a -> Rational
probHead (Distrib xs@((_,p):_)) = p / sum [ q | (_,q) <- xs ]
```

Infinite lists aren’t possible, either.

One other way to look at the problem is to mimic the structure of cons-lists. Something like this:

```
data Odds a = Certainly a
            | Odds a Rational (Odds a)
            deriving (Eq, Functor, Foldable, Show)
```

Here, the `Odds` constructor (analogous to `(:)`) contains the betting-style odds of the head element vs. *the rest of the list*. The coin from before is represented by:

```
tossOdds :: Odds Coin
tossOdds = Odds Heads (1 % 1) (Certainly Tails)
```

This representation has tons of nice properties. First, let’s use some pattern-synonym magic for rationals:

```
pattern (:%) :: Integer -> Integer -> Rational
pattern n :% d <- (numerator &&& denominator -> (n,d)) where
  n :% d = n % d
```

Then, finding the probability of the head element is this:

```
probHeadOdds :: Odds a -> Rational
probHeadOdds (Certainly _) = 1
probHeadOdds (Odds _ (n :% d) _) = n :% (n + d)
```
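A quick check of the head-probability calculation, with the `:%` view pattern expanded by hand (via `numerator` and `denominator`) so the snippet stands alone:

```haskell
import Data.Ratio (numerator, denominator, (%))

data Odds a = Certainly a
            | Odds a Rational (Odds a)

probHeadOdds :: Odds a -> Rational
probHeadOdds (Certainly _) = 1
probHeadOdds (Odds _ p _)  = numerator p % (numerator p + denominator p)

-- 1:1 odds of heads against everything else
tossOdds :: Odds Char
tossOdds = Odds 'h' (1 % 1) (Certainly 't')

main :: IO ()
main = print (probHeadOdds tossOdds)  -- prints 1 % 2
```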

The representation can handle infinite lists no problem:

```
probHeadOdds (Odds 'a' (1 :% 1) undefined)
1 % 2
```

Taking the tail preserves semantics, also. To do some more involved manipulation, a fold helper is handy:

```
foldOdds :: (a -> Rational -> b -> b) -> (a -> b) -> Odds a -> b
foldOdds f b = r where
  r (Certainly x) = b x
  r (Odds x p xs) = f x p (r xs)
```

You can use this function to find the probability of a given item:

```
probOfEvent :: Eq a => a -> Odds a -> Rational
probOfEvent e = foldOdds f b where
  b x = if e == x then 1 else 0
  f x n r = (if e == x then n else r) / (n + 1)
```

This assumes that each item only occurs once. A function which combines multiple events might look like this:

```
probOf :: (a -> Bool) -> Odds a -> Rational
probOf p = foldOdds f b where
  b x = if p x then 1 else 0
  f x n r = (if p x then r + n else r) / (n + 1)
```

Some utility functions to create `Odds`:

```
equalOdds :: Foldable f => f a -> Maybe (Odds a)
equalOdds xs = case length xs of
  0 -> Nothing
  n -> Just (foldr f undefined xs (n - 1)) where
    f y a 0 = Certainly y
    f y a n = Odds y (1 % fromIntegral n) (a (n - 1))

fromDistrib :: [(a,Integer)] -> Maybe (Odds a)
fromDistrib [] = Nothing
fromDistrib xs = Just $ f (tot*lst) xs where
  (tot,lst) = foldl' (\(!t,_) e -> (t+e,e)) (0,undefined) (map snd xs)
  f _ [(x,_)] = Certainly x
  f n ((x,p):xs) = Odds x (mp % np) (f np xs) where
    mp = p * lst
    np = n - mp

probOfEach :: Eq a => a -> Odds a -> Rational
probOfEach x xs = probOf (x==) xs

propOf :: Eq a => a -> [a] -> Maybe Rational
propOf _ [] = Nothing
propOf x xs = Just . uncurry (%) $
  foldl' (\(!n,!m) e -> (if x == e then n+1 else n, m+1)) (0,0) xs
```

`propOf x xs == fmap (probOfEach x) (equalOdds xs)`

And finally, the instances:

```
append :: Odds a -> Rational -> Odds a -> Odds a
append = foldOdds f Odds where
  f e r a p ys = Odds e ip (a op ys) where
    ip = p * r / (p + r + 1)
    op = p / (r + 1)

flatten :: Odds (Odds a) -> Odds a
flatten = foldOdds append id

instance Applicative Odds where
  pure = Certainly
  fs <*> xs = flatten (fmap (<$> xs) fs)

instance Monad Odds where
  x >>= f = flatten (f <$> x)
```

Part 2 of a 2-part series on tries

Tags: Haskell

When I ended the last post, I had a nice `Trie` datatype, with plenty of functions, but I couldn’t get it to conform to the standard Haskell classes. The problem was to do with the type variables in the Trie:

```
data OldTrie a = OldTrie
  { otEndHere :: Bool
  , otChildren :: Map a (OldTrie a) }
```

Although the type variable is `a`, the trie really contains *lists* of `a`s. At least, that’s what’s reflected in functions like `insert`, `member`, etc.:

```
member :: (Foldable f, Ord a) => f a -> OldTrie a -> Bool
member = foldr f otEndHere where
  f e a = maybe False a . Map.lookup e . otChildren

otInsert :: (Foldable f, Ord a) => f a -> OldTrie a -> OldTrie a
otInsert = foldr f b where
  b (OldTrie _ c) = OldTrie True c
  f e a (OldTrie n c) = OldTrie n (Map.alter (Just . a . fold) e c)

instance Ord a => Monoid (OldTrie a) where
  mempty = OldTrie False mempty
  OldTrie v c `mappend` OldTrie t d =
    OldTrie (v || t) (Map.unionWith (<>) c d)
```

Realistically, the type which the trie contains is more like:

`Foldable f => Trie (f a)`

That signature strongly hints at GADTs, as was indicated by this stackoverflow answer. The particular GADT which is applicable here is this:

`data TrieSet a where TrieSet :: Bool -> Map a (TrieSet [a]) -> TrieSet [a]`

Why lists and not a general `Foldable`? Well, for the particular use I had in mind (conforming to the `Foldable` typeclass), I need `(:)`.

```
instance Foldable TrieSet where
  foldr f b (TrieSet e c) = if e then f [] r else r where
    r = Map.foldrWithKey (flip . g . (:)) b c
    g k = foldr (f . k)
```

With some more helper functions, the interface becomes pretty nice:

```
instance Show a => Show (TrieSet [a]) where
  showsPrec d t =
    showParen
      (d > 10)
      (showString "fromList " . shows (foldr (:) [] t))

instance Ord a => IsList (TrieSet [a]) where
  type Item (TrieSet [a]) = [a]
  fromList = foldr tsInsert mempty
  toList = foldr (:) []
```

The trie has the side-effect of lexicographically sorting what it’s given:

```
fromList ["ced", "abc", "ced", "cb", "ab"] :: TrieSet String
fromList ["ab","abc","cb","ced"]
```

Most implementations of tries that I’ve seen are map-like data structures, rather than set-like. In other words, instead of holding a `Bool` at the value position, it holds a `Maybe` something.

```
data Trie a b = Trie
  { endHere :: b
  , children :: Map a (Trie a b)
  } deriving (Eq, Ord, Show, Functor, Foldable, Traversable)
```

This is a much more straightforward datatype. `Foldable` can even be automatically derived.

However, I haven’t made the `endHere` field a `Maybe a`. I want to be able to write something like this:

```
type TrieSet [a] = Trie a Bool
type TrieMap a b = Trie a (Maybe b)
```

And have it automatically choose the implementation of the functions I need^{1}.

To do that, though, I’ll need to write the base functions, agnostic of the type of `b`

. I *can* rely on something like `Monoid`

, though:

```
instance (Ord a, Monoid b) => Monoid (Trie a b) where
  mempty = Trie mempty Map.empty
  mappend (Trie v k) (Trie t l) =
    Trie (v <> t) (Map.unionWith (<>) k l)
```

In fact, quite a lot of functions naturally lend themselves to this fold + monoid style:

```
lookup :: (Ord a, Monoid b, Foldable f)
       => f a -> Trie a b -> b
lookup = foldr f endHere where
  f e a = foldMap a . Map.lookup e . children

insert' :: (Foldable f, Ord a, Monoid b)
        => f a -> b -> Trie a b -> Trie a b
insert' xs v = foldr f b xs where
  b (Trie p c) = Trie (v <> p) c
  f e a (Trie n c) =
    Trie n (Map.alter (Just . a . fold) e c)
```

A monoid is needed for the values, though, and neither `Bool` nor `∀ a. Maybe a` conform to `Monoid`. Looking back to the implementation of the trie-set, the `(||)` function has been replaced by `mappend`. There *is* a newtype wrapper in `Data.Monoid` which has exactly this behaviour, though: `Any`.

Using that, the type signatures specialize to:

```
type TrieSet a = Trie a Any

lookup :: (Ord a, Foldable f)
       => f a -> TrieSet a -> Any

insert :: (Ord a, Foldable f)
       => f a -> Any -> TrieSet a -> TrieSet a
```

Similarly, for `Maybe`, there’s both `First` and `Last`. They have the behaviour:

`First (Just x) <> First (Just y) == First (Just x)`

`Last (Just x) <> Last (Just y) == Last (Just y)`

I think it makes more sense for a value inserted into a map to overwrite whatever was there before. Since the newer value is on the left in the `mappend`, `First` makes the most sense.

```
type TrieMap a b = Trie a (First b)

lookup :: (Ord a, Foldable f) => f a -> TrieMap a b -> First b

insert :: (Ord a, Foldable f)
       => f a -> First b -> TrieMap a b -> TrieMap a b
```
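A small sketch of the `First` behaviour. The `lookup` from above is renamed `lookupT` here to avoid the Prelude clash, and a `Semigroup` instance is added alongside `Monoid` since modern GHC requires it:

```haskell
import qualified Data.Map as Map
import Data.Map (Map)
import Data.Monoid (First(..))
import Data.Foldable (fold)

data Trie a b = Trie { endHere :: b, children :: Map a (Trie a b) }

-- modern GHC wants a Semigroup instance alongside Monoid
instance (Ord a, Semigroup b) => Semigroup (Trie a b) where
  Trie v k <> Trie t l = Trie (v <> t) (Map.unionWith (<>) k l)

instance (Ord a, Monoid b) => Monoid (Trie a b) where
  mempty = Trie mempty Map.empty

-- the post's lookup, renamed to avoid clashing with Prelude.lookup
lookupT :: (Ord a, Monoid b, Foldable f) => f a -> Trie a b -> b
lookupT = foldr f endHere where
  f e a = foldMap a . Map.lookup e . children

insert' :: (Foldable f, Ord a, Monoid b) => f a -> b -> Trie a b -> Trie a b
insert' xs v = foldr f b xs where
  b (Trie p c) = Trie (v <> p) c
  f e a (Trie n c) = Trie n (Map.alter (Just . a . fold) e c)

main :: IO ()
main = do
  let t = insert' "ab" (First (Just 2)) (insert' "ab" (First (Just 1)) mempty)
  print (getFirst (lookupT "ab" t))  -- the newer value wins: Just 2
  print (getFirst (lookupT "a" t))   -- nothing stored at the prefix: Nothing
```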

There are some other ways that you can interpret the monoid. For instance, subbing in `Sum Int` gives you a bag-like trie:

```
type TrieBag a = Trie a (Sum Int)

lookup :: (Ord a, Foldable f) => f a -> TrieBag a -> Sum Int

insert :: (Ord a, Foldable f)
       => f a -> Sum Int -> TrieBag a -> TrieBag a
```

This is a set which can store multiple copies of each member. Turned the other way around, a map which stores many values for each key looks like this:

```
type TrieBin a b = Trie a [b]

lookup :: (Ord a, Foldable f) => f a -> TrieBin a b -> [b]

insert :: (Ord a, Foldable f)
       => f a -> [b] -> TrieBin a b -> TrieBin a b
```

This method so far isn’t really satisfying, though. Really, the `insert` signatures should look like this:

```
insert :: (Ord a, Foldable f)
       => f a -> b -> TrieMap a b -> TrieMap a b

insert :: (Ord a, Foldable f)
       => f a -> b -> TrieBin a b -> TrieBin a b
```

Modifying insert slightly, you can get exactly that:

```
insert :: (Foldable f, Ord a, Applicative c, Monoid (c b))
       => f a -> b -> Trie a (c b) -> Trie a (c b)
insert xs v = foldr f b xs where
  b (Trie p c) = Trie (pure v <> p) c
  f e a (Trie n c) = Trie n (Map.alter (Just . a . fold) e c)
```

`pure` from `Applicative` is needed for the “embedding”.

Similarly, the “inserting” for the set-like types isn’t really right. The value argument is out of place. This should be the signature:

```
add :: (Ord a, Foldable f)
    => f a -> TrieSet a -> TrieSet a

add :: (Ord a, Foldable f)
    => f a -> TrieBag a -> TrieBag a
```

In particular, while we have an “empty” thing (0, False) for monoids, we need a “one” thing (1, True) for this function. A semiring^{2} gives this exact method:

```
class Monoid a => Semiring a where
  one :: a
  mul :: a -> a -> a

instance Num a => Semiring (Sum a) where
  one = 1
  mul = (*)

instance Semiring Any where
  one = Any True
  Any x `mul` Any y = Any (x && y)
```

This class is kind of like a combination of the monoid wrappers for `Int` and `Bool`. You could take advantage of that:

```
class (Monoid add, Monoid mult)
    => SemiringIso a add mult | a -> add, a -> mult where
  toAdd :: a -> add
  fromAdd :: add -> a
  toMult :: a -> mult
  fromMult :: mult -> a

(<+>), (<.>) :: SemiringIso a add mult => a -> a -> a
x <+> y = fromAdd (toAdd x <> toAdd y)
x <.> y = fromMult (toMult x <> toMult y)

instance SemiringIso Int (Sum Int) (Product Int) where
  toAdd = Sum
  fromAdd = getSum
  toMult = Product
  fromMult = getProduct

instance SemiringIso Bool Any All where
  toAdd = Any
  fromAdd = getAny
  toMult = All
  fromMult = getAll
```

But it seems like overkill.

Anyway, assuming that we have the functions from `Semiring`, here’s the `add` function:

```
add :: (Foldable f, Ord a, Semiring b)
    => f a -> Trie a b -> Trie a b
add xs = foldr f b xs where
  b (Trie p c) = Trie (one <> p) c
  f e a (Trie n c) =
    Trie n (Map.alter (Just . a . fold) e c)
```

Now, expressions can be built up without specifying the specific monoid implementation, and the whole behaviour can be changed with a type signature:

`ans = lookup "abc" (fromList ["abc", "def", "abc", "ghi"])`

```
ans :: Sum Int
Sum {getSum = 2}
```

```
ans :: Any
Any {getAny = True}
```
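Putting the pieces together, here’s a self-contained sketch of the whole pipeline. The `fromList` helper is my own (just `foldr add mempty`), the `Semiring` class is trimmed to the `one` method (which is all `add` needs), `lookup` is renamed `lookupT` to dodge the Prelude, and a `Semigroup` instance is added for modern GHC:

```haskell
import qualified Data.Map as Map
import Data.Map (Map)
import Data.Monoid (Any(..), Sum(..))
import Data.Foldable (fold)

data Trie a b = Trie { endHere :: b, children :: Map a (Trie a b) }

instance (Ord a, Semigroup b) => Semigroup (Trie a b) where
  Trie v k <> Trie t l = Trie (v <> t) (Map.unionWith (<>) k l)

instance (Ord a, Monoid b) => Monoid (Trie a b) where
  mempty = Trie mempty Map.empty

-- only `one` is needed here; `mul` is omitted from the sketch
class Monoid a => Semiring a where
  one :: a

instance Num a => Semiring (Sum a) where
  one = Sum 1

instance Semiring Any where
  one = Any True

add :: (Foldable f, Ord a, Semiring b) => f a -> Trie a b -> Trie a b
add = foldr f b where
  b (Trie p c) = Trie (one <> p) c
  f e a (Trie n c) = Trie n (Map.alter (Just . a . fold) e c)

lookupT :: (Ord a, Monoid b, Foldable f) => f a -> Trie a b -> b
lookupT = foldr f endHere where
  f e a = foldMap a . Map.lookup e . children

-- hypothetical helper: build a trie by adding every word
fromList :: (Ord a, Semiring b) => [[a]] -> Trie a b
fromList = foldr add mempty

main :: IO ()
main = do
  print (lookupT "abc" (fromList ["abc", "def", "abc", "ghi"]) :: Sum Int)
  print (lookupT "abc" (fromList ["abc", "def", "abc", "ghi"]) :: Any)
```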

Slightly fuller implementations of all of these are available here.

Kind of like program inference in lieu of type inference↩

This isn’t really a very good definition of semiring. While Haskell doesn’t have this class in base, Purescript has it in their prelude.↩

Tags: Swift

So I don’t really know what KVC is, or much about `performSelector` functions. This blogpost, from Brent Simmons, let me know a little bit about why I would want to use them.

It centred around removing code repetition of this type:

```
if localObject.foo != serverObject.foo {
    localObject.foo = serverObject.foo
}
if localObject.bar != serverObject.bar {
    localObject.bar = serverObject.bar // There was an (intentional)
}                                      // bug here in the original post
```

To clean up the code, Brent used selector methods. At first, I was a little uncomfortable with the solution. As far as I could tell, the basis of a lot of this machinery used functions with types like this:

```
func get(fromSelector: String) -> AnyObject?
func set(forSelector: String) -> ()
```

Which *seems* to be extremely dynamic. Stringly-typed and all that. Except that there are two different things going on here. One is the dynamic stuff; the ability to get rid of types when you need to. The other, though, has *nothing* to do with types. The other idea is being able to pass around something which can access the property (or method) of an object. Let’s look at the code that was being repeated:

```
if localObject.foo != serverObject.foo {
    localObject.foo = serverObject.foo
}
if localObject.bar != serverObject.bar {
    localObject.bar = serverObject.bar
}
```

The logical, obvious thing to do here is to try to refactor out the common elements. In fact, the only things that *differ* between the two actions above are the `foo` and `bar`. It would be great to be able to write a function like this:

```
func checkThenUpdate(selector) {
    if localObject.selector != serverObject.selector {
        localObject.selector = serverObject.selector
    }
}
```

And then maybe a single line like this:

`[foo, bar, baz].forEach(checkThenUpdate)`

That’s pretty obviously better. It’s just good programming: when faced with repetition, find the repeated part, and abstract it out. Is it more *dynamic* than the repetition, though? I don’t think so. All you have to figure out is an appropriate type for the selector, and you can keep all of your static checking. To me, it seems a lot like a lens:

```
struct Lens<Whole, Part> {
    let get: Whole -> Part
    let set: (Whole, Part) -> Whole
}
```

(This is a lens similar to the ones used in the data-lens library, in contrast to van Laarhoven lenses, or LensFamilies. LensFamilies are used in the lens package, and they allow you to change the type of the `Part`. They’re also just normal functions, rather than a separate type, so you can manipulate them in a pretty standard way. Swift’s type system isn’t able to model those lenses, though, unfortunately.) It has two things: a getter and a setter. The getter is pretty obvious: it takes the object, and returns the property. The setter is a little more confusing. It’s taking an object, and the new property you want to stick in to the object, and returns the object with that property updated. For instance, if we were to make a `Person`:

```
struct LocalPerson {
    var age: Int
    var name: String
}
```

We could then have a lens for the `name` field like this:

```
let localName: Lens<LocalPerson,String> = Lens(
    get: { p in p.name },
    set: { (oldPerson,newName) in
        var newPerson = oldPerson
        newPerson.name = newName
        return newPerson
    }
)
```

And you’d use it like this:

```
let caoimhe = LocalPerson(age: 46, name: "caoimhe")
localName.get(caoimhe) // "caoimhe"
localName.set(caoimhe, "breifne") // LocalPerson(age: 46, name: "breifne")
```

Straight away, we’re able to do (something) like the `checkThenUpdate` function:

```
func checkThenUpdate
    <A: Equatable>
    (localLens: Lens<LocalPerson,A>, serverLens: Lens<ServerPerson,A>) {
    let serverProp = serverLens.get(serverObject)
    if localLens.get(localObject) != serverProp {
        localObject = localLens.set(localObject,serverProp)
    }
}
```

And it could be called pretty tersely:

`checkThenUpdate(localName, serverLens: serverName)`

The biggest problem with this approach, obviously, is the boilerplate. In Haskell, that’s solved with Template Haskell, so the lens code is generated for you. (I’d love to see something like that in Swift.) There’s a protocol-oriented spin on lenses, also. One of the variants on lenses in Haskell is called “classy lenses”. That’s where, instead of just generating a lens with the same name as the field it looks into, you generate a typeclass (protocol) for anything with that lens. In Swift, it might work something like this:

```
struct Place {
    var name: String
}

// Instead of just having a lens for the name field, have a whole protocol
// for things with a name field:
protocol HasName {
    associatedtype Name
    static var name: Lens<Self,Name> { get }
    var name: Name { get set }
}

// Because the mutable property is included in the protocol, you can rely on
// it in extensions:
extension HasName {
    static var name: Lens<Self,Name> {
        return Lens(
            get: {$0.name},
            set: { (w,p) in
                var n = w
                n.name = p
                return n
            }
        )
    }
    var name: Name {
        get { return Self.name.get(self) }
        set { self = Self.name.set(self,newValue) }
    }
}

// This way, you can provide either the lens or the property, and you get the
// other for free.
extension Place: HasName {}

// Then, you can rely on that protocol, and all of the types:
func checkEqualOnNames
    <A,B where A: HasName, B: HasName, A.Name: Equatable, A.Name == B.Name>
    (x: A, _ y: B) -> Bool {
    return x.name == y.name
}
```

This protocol lets you do a kind of static `respondsToSelector`, with all of the types intact. Other people have spoken about the other things you can do with lenses in Swift (Brandon Williams - Lenses in Swift), like composing them together, chaining operations, etc. (One other thing they can emulate is method cascading.) Unfortunately, in current Swift, the boilerplate makes all of this a little unpleasant. Still, they’re an interesting idea, and they show how a good type system needn’t always get in the way.

Tags: Haskell

There’s a whole family of Haskell brainteasers surrounding one function: `foldr`. The general idea is to convert some function on lists which uses recursion into one that uses `foldr`. `map`, for instance:

```
map :: (a -> b) -> [a] -> [b]
map f = foldr (\e a -> f e : a) []
```

Some can get a little trickier. `dropWhile`, for instance. (See here and here for interesting articles on that one in particular.)

```
dropWhile :: (a -> Bool) -> [a] -> [a]
dropWhile p = fst . foldr f ([],[]) where
  f e ~(xs,ys) = (if p e then xs else zs, zs) where zs = e : ys
```
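As a quick check of the fold above (renamed so it doesn’t clash with the Prelude version):

```haskell
-- dropWhile via foldr, as above, renamed to avoid the Prelude clash
dropWhile' :: (a -> Bool) -> [a] -> [a]
dropWhile' p = fst . foldr f ([],[]) where
  f e ~(xs,ys) = (if p e then xs else zs, zs) where zs = e : ys

main :: IO ()
main = do
  print (dropWhile' (< 3) [1, 2, 3, 1])  -- prints [3,1]
  print (dropWhile' even ([] :: [Int])) -- prints []
```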

One function which was a little harder to convert than it first seemed was `zip`.

Here’s the first (non) solution:

```
zip :: [a] -> [b] -> [(a,b)]
zip = foldr f (const []) where
  f x xs (y:ys) = (x,y) : xs ys
  f _ _ [] = []
```

The problem with the above isn’t that it doesn’t work: it does. The problem is that it’s not *really* using `foldr`. It’s only using it on the first list: there’s still a manual uncons being performed on the second. Ideally, I would want the function to look something like this:

```
zip :: [a] -> [b] -> [(a,b)]
zip xs ys = foldr f (\_ _ -> []) xs (foldr g (const []) ys)
```

The best solution I found online only dealt with `Fold`s, not `Foldable`s. You can read it here.

Reworking the solution online for `Foldable`s, the initial intuition is to have the `foldr` on the `ys` produce a function which takes an element of the `xs`, and returns a function which takes an element of the `xs`, and so on, finally returning the created list. The *problem* with that approach is the types involved:

```
zip :: [a] -> [b] -> [(a,b)]
zip xs = foldr f (const []) xs . foldr g (\_ _ -> []) where
  g e2 r2 e1 r1 = (e1,e2) : (r1 r2)
  f e r x = x e r
```

You get the error:

`Occurs check: cannot construct the infinite type: t0 ~ a -> (t0 -> [(a, b)]) -> [(a, b)]`

Haskell’s typechecker doesn’t allow for infinitely recursive types.

You’ll be familiar with this problem if you’ve ever tried to encode the Y-combinator, or if you’ve fiddled around with the recursion-schemes package. You might also be familiar with the solution: a `newtype`, encapsulating the recursion. In this case, the `newtype` looks very similar to the signature for `foldr`:

```
newtype RecFold a b =
  RecFold { runRecFold :: a -> (RecFold a b -> b) -> b }
```

Now you can insert and remove the `RecFold` wrapper, helping the typechecker to understand the recursive types as it goes:

```
zip :: [a] -> [b] -> [(a,b)]
zip xs =
    foldr f (const []) xs . RecFold . foldr g (\_ _ -> []) where
  g e2 r2 e1 r1 = (e1,e2) : (r1 (RecFold r2))
  f e r x = runRecFold x e r
```
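A self-contained check that the wrapper really does recover `zip` (again renamed to dodge the Prelude):

```haskell
newtype RecFold a b =
  RecFold { runRecFold :: a -> (RecFold a b -> b) -> b }

-- zip via two foldrs, renamed to avoid the Prelude clash
zip' :: [a] -> [b] -> [(a, b)]
zip' xs =
    foldr f (const []) xs . RecFold . foldr g (\_ _ -> []) where
  g e2 r2 e1 r1 = (e1, e2) : r1 (RecFold r2)
  f e r x = runRecFold x e r

main :: IO ()
main = do
  print (zip' [1, 2, 3] "ab")  -- prints [(1,'a'),(2,'b')]
  print (zip' "ab" [1, 2, 3])  -- prints [('a',1),('b',2)]
```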

As an aside, the performance characteristics of the `newtype` wrapper are totally opaque to me. There may be significant improvements by using `coerce` from Data.Coerce, but I haven’t looked into it.

The immediate temptation from the function above is to generalise it. First to `zipWith`, obviously:

```
zipWith :: (a -> b -> c) -> [a] -> [b] -> [c]
zipWith c xs =
    foldr f (const []) xs . RecFold . foldr g (\_ _ -> []) where
  g e2 r2 e1 r1 = c e1 e2 : (r1 (RecFold r2))
  f e r x = runRecFold x e r
```

What’s maybe a little more interesting, though, would be a `foldr` on two lists. Something which folds through both at once, using a supplied combining function:

```
foldr2 :: (Foldable f, Foldable g)
       => (a -> b -> c -> c)
       -> c -> f a -> g b -> c
foldr2 c i xs =
    foldr f (const i) xs . RecFold . foldr g (\_ _ -> i) where
  g e2 r2 e1 r1 = c e1 e2 (r1 (RecFold r2))
  f e r x = runRecFold x e r
```

Of course, once you can do two, you can do three:

```
foldr3 :: (Foldable f, Foldable g, Foldable h)
       => (a -> b -> c -> d -> d)
       -> d -> f a -> g b -> h c -> d
foldr3 c i xs ys =
    foldr f (const i) xs . RecFold . foldr2 g (\_ _ -> i) ys where
  g e2 e3 r2 e1 r1 = c e1 e2 e3 (r1 (RecFold r2))
  f e r x = runRecFold x e r
```

And so on.

There’s the added benefit that the above functions work on much more than just lists.

Getting a little formal about the above functions, a `fold` can be described as a catamorphism. This is a name for a pattern of breaking down some recursive structure. There’s a bunch of them in the recursion-schemes package. The question is, then: can you express the above as a kind of catamorphism? Initially, using the same techniques as before, you can:

```
newtype RecF f a = RecF { unRecF :: Base f (RecF f a -> a) -> a }
zipo :: (Functor.Foldable f, Functor.Foldable g)
     => (Base f (RecF g c) -> Base g (RecF g c -> c) -> c)
     -> f -> g -> c
zipo alg xs ys = cata (flip unRecF) ys (cata (RecF . alg) xs)
```

Then, coming full circle, you get a quite nice encoding of `zip`:

```
zip :: [a] -> [b] -> [(a,b)]
zip = zipo alg where
  alg Nil _ = []
  alg _ Nil = []
  alg (Cons x xs) (Cons y ys) = (x, y) : ys xs
```

However, the `RecF` is a little ugly. In fact, it’s possible to write the above without any recursive types, using the RankNTypes extension. (It’s possible that you could do the same with `foldr2` as well, but I haven’t figured it out yet.)

You can actually use a `newtype` that’s provided by the recursion-schemes library as-is. It’s `Mu`. This is required for an encoding of the Y-combinator. It’s usually presented in this form:

`newtype Mu a = Roll { unroll :: Mu a -> a }`

However, in the recursion-schemes package, its definition looks like this:

`newtype Mu f = Mu (forall a. (f a -> a) -> a)`

No recursion! The `zipo` combinator above can be written using `Mu` like so:

```
zipo :: (Functor.Foldable f, Functor.Foldable g)
     => (Base f (Mu (Base g) -> c) -> Base g (Mu (Base g)) -> c)
     -> f -> g -> c
zipo alg xs = cata (\x -> alg x . project) xs . refix
```

And the new version of `zip` has a slightly more natural order of arguments:

```
zip :: [a] -> [b] -> [(a,b)]
zip = zipo alg where
  alg Nil _ = []
  alg _ Nil = []
  alg (Cons x xs) (Cons y ys) = (x,y) : xs ys
```

There’s one more issue, though, that’s slightly tangential. A lot of the time, the attraction of rewriting functions using folds and catamorphisms is that the function becomes more general: it no longer is restricted to lists. For `zip`, however, there’s still a pesky list left in the signature:

`zip :: (Foldable f, Foldable g) => f a -> g b -> [(a,b)]`

It would be a little nicer to be able to zip through something *preserving* the structure of one of the things being zipped through. For no reason in particular, let’s assume we’ll preserve the structure of the first argument. The function will have to account for the second argument running out before the first, though. A `Maybe` can account for that:

```
zipInto :: (Foldable f, Foldable g)
        => (a -> Maybe b -> c)
        -> f a -> g b -> f c
```

If the second argument runs out, `Nothing` will be passed to the combining function.

It’s clear that this isn’t a *fold* over the first argument, it’s a *traversal*. A first go at the function uses the state monad, but restricts the second argument to a list:

```
zipInto :: Traversable f => (a -> Maybe b -> c) -> f a -> [b] -> f c
zipInto c xs ys = evalState (traverse f xs) ys where
  f x = do
    h <- gets uncons
    case h of
      Just (y,t) -> do
        put t
        pure (c x (Just y))
      Nothing -> pure (c x Nothing)
```

That code can be cleaned up a little:

```
zipInto :: Traversable f => (a -> Maybe b -> c) -> f a -> [b] -> f c
zipInto c = evalState . traverse (state . f . c) where
  f x [] = (x Nothing, [])
  f x (y:ys) = (x (Just y), ys)
```

But really, the uncons needs to go. Another `newtype` wrapper is needed, and here’s the end result:

```
newtype RecAccu a b =
  RecAccu { runRecAccu :: a -> (RecAccu a b, b) }

zipInto :: (Traversable t, Foldable f)
        => (a -> Maybe b -> c) -> t a -> f b -> t c
zipInto f xs =
    snd . flip (mapAccumL runRecAccu) xs . RecAccu . foldr h i where
  i e = (RecAccu i, f e Nothing)
  h e2 a e1 = (RecAccu a, f e1 (Just e2))
```
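And a quick check that the structure of the first argument survives, using `(,)` as the combining function:

```haskell
import Data.Traversable (mapAccumL)

newtype RecAccu a b =
  RecAccu { runRecAccu :: a -> (RecAccu a b, b) }

zipInto :: (Traversable t, Foldable f)
        => (a -> Maybe b -> c) -> t a -> f b -> t c
zipInto f xs =
    snd . flip (mapAccumL runRecAccu) xs . RecAccu . foldr h i where
  i e = (RecAccu i, f e Nothing)
  h e2 a e1 = (RecAccu a, f e1 (Just e2))

main :: IO ()
main = print (zipInto (,) [1, 2, 3] "ab")
-- the shorter second argument is padded out with Nothing:
-- [(1,Just 'a'),(2,Just 'b'),(3,Nothing)]
```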

Part 1 of a 2-part series on tries

Tags: Haskell

A Trie is one of those data structures that I find myself writing very early on in almost every language I try to learn. It’s elegant and interesting, and easy enough to implement.

I usually write a version that is a set-like data structure, rather than a mapping type, for simplicity’s sake. It stores sequences, in a prefix-tree structure. It has a map (dictionary) where the keys are the first element of every sequence it stores, and the values are the Tries which store the rest of the sequence. It also has a boolean tag, representing whether or not the current Trie is a Trie on which a sequence ends. Here’s what the type looks like in Haskell:

```
data Trie a = Trie { endHere :: Bool
                   , getTrie :: Map a (Trie a)
                   } deriving (Eq)
```

Now, inserting into the Trie is easy. You just `uncons` on a list, and insert the head into the map, with the value being the tail inserted into whatever existed at that key before:

```
empty :: Trie a
empty = Trie False Map.empty
insertRec :: Ord a => [a] -> Trie a -> Trie a
insertRec [] (Trie _ m) = Trie True m
insertRec (x:xs) (Trie e m) =
  Trie e (Map.alter (Just . insertRec xs . fromMaybe empty) x m)
```

Searching is simple, also. For the empty list, you just check if the Trie has its `endHere` tag set to `True`; otherwise, you uncons, search the map, and query the Trie with the tail if it was found, or just return `False` if it was not:

```
memberRec :: Ord a => [a] -> Trie a -> Bool
memberRec [] (Trie e _) = e
memberRec (x:xs) (Trie _ m) =
  fromMaybe False (memberRec xs <$> Map.lookup x m)
```
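A quick check of the pair of functions (note that a strict prefix of an inserted sequence is not itself a member):

```haskell
import qualified Data.Map as Map
import Data.Map (Map)
import Data.Maybe (fromMaybe)

data Trie a = Trie { endHere :: Bool
                   , getTrie :: Map a (Trie a)
                   } deriving (Eq)

empty :: Trie a
empty = Trie False Map.empty

insertRec :: Ord a => [a] -> Trie a -> Trie a
insertRec [] (Trie _ m) = Trie True m
insertRec (x:xs) (Trie e m) =
  Trie e (Map.alter (Just . insertRec xs . fromMaybe empty) x m)

memberRec :: Ord a => [a] -> Trie a -> Bool
memberRec [] (Trie e _) = e
memberRec (x:xs) (Trie _ m) =
  fromMaybe False (memberRec xs <$> Map.lookup x m)

main :: IO ()
main = do
  let t = foldr insertRec empty ["abc", "ab"]
  print (memberRec "ab" t)   -- True
  print (memberRec "abc" t)  -- True
  print (memberRec "a" t)    -- False: "a" is a prefix, not a member
```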

Here’s my problem. *Both* of those functions have the same pattern:

```
f [] = ...
f (x:xs) = ...
```

Any good Haskeller should be *begging* for a fold at this stage. But it proved a little trickier than I’d imagined. Take `member`

, for instance. You want to fold over a list, with the base case being the tag on the Trie:

```
member :: Ord a => [a] -> Trie a -> Bool
member = foldr f base where
  base = ???
  f e a = Map.lookup e ???
```

Where do you get the base case from, though? You have to specify it from the beginning, but the variable you’re looking for is nested deeply into the Trie. How can you look into the Trie, without traversing the list, to find the tag, *at the beginning of the function?*

That had been my issue for a while. Every time I came back to writing a Trie, I would see the pattern, try and write `insert`

and `member`

with a fold, and remember again the trouble I had had with it in the past. Recently, though, I saw a different problem, that gave me an idea for a solution.

Rewrite `dropWhile` using `foldr`

It’s a (semi) well-known puzzle, that’s maybe a little more difficult than it seems at first. Here, for instance, was my first attempt at it:

```
dropWhileWrong :: (a -> Bool) -> [a] -> [a]
dropWhileWrong p = foldr f [] where
  f e a | p e       = a
        | otherwise = e : a
```

Yeah. That’s `filter`

, not `dropWhile`

:

```
dropWhileWrong (<5) [1, 3, 6, 3, 1]
[6]
```

Here was my final solution:

```
dropWhileCount :: (a -> Bool) -> [a] -> [a]
dropWhileCount p l = drop (foldr f 0 l) l where
  f e a | p e       = a + 1
        | otherwise = 0
```

After solving the problem, I found this issue of The Monad Reader, which talks about the same puzzle. In my `drop`

version, I had been counting the number of items to drop as I went, adding one for every element that passed the test. The corresponding version in the article had been building up `tail`

functions, using `.`

to add them together:

```
dropWhileTail :: (a -> Bool) -> [a] -> [a]
dropWhileTail p l = (foldr f id l) l where
  f e a | p e       = tail . a
        | otherwise = id
```

A quick visit to pointfree.io can generate some monadic point-free magic:

```
dropWhilePf :: (a -> Bool) -> [a] -> [a]
dropWhilePf p = join (foldr f id) where
  f e a | p e       = tail . a
        | otherwise = id
```

Now, the final version in the article did *not* use this technique, as it was very inefficient. It used some cleverness beyond the scope of this post. The second-from-last version I quite liked, though:

```
dropWhileFp :: (a -> Bool) -> [a] -> [a]
dropWhileFp p l = foldr f l l where
  f e a | p e       = tail a
        | otherwise = l
```
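Before moving on, it’s worth checking that the working versions all agree with the Prelude’s `dropWhile` on the example list from earlier (a standalone sketch repeating the definitions above):

```
import Control.Monad (join)

dropWhileCount, dropWhileTail, dropWhilePf, dropWhileFp
  :: (a -> Bool) -> [a] -> [a]

-- Count how many leading elements to drop, then drop them
dropWhileCount p l = drop (foldr f 0 l) l where
  f e a | p e       = a + 1
        | otherwise = 0

-- Build up a chain of tails, then apply it to the whole list
dropWhileTail p l = foldr f id l l where
  f e a | p e       = tail . a
        | otherwise = id

-- The pointfree.io rendering of the same idea
dropWhilePf p = join (foldr f id) where
  f e a | p e       = tail . a
        | otherwise = id

-- Take tails of the accumulator, resetting to the full list on failure
dropWhileFp p l = foldr f l l where
  f e a | p e       = tail a
        | otherwise = l

main :: IO ()
main = do
  let xs = [1, 3, 6, 3, 1] :: [Int]
      expected = dropWhile (< 5) xs
  print (all (== expected)
             [ d (< 5) xs
             | d <- [dropWhileCount, dropWhileTail, dropWhilePf, dropWhileFp] ])
  -- True
```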

However, the idea of building up a function in a fold gave me an idea for adapting it to some of the Trie functions.

Let’s start with `member`

. It needs to fold over a list, and generate a function which acts on a Trie:

```
member :: Ord a => [a] -> Trie a -> Bool
member = foldr f base
```

The `base`

is the function being built up: the final part of the function chain. Each part of the function is generated based on each element of the list, and then chained with the base using `.`

:

```
member = foldr f base where
  f e a = ??? . a
```

The base here is what’s called when the list is empty. Here’s what it looked like in the explicit recursion version:

`member [] (Trie e _) = e`

We could simplify this by using record syntax, and `endHere`:

`member [] t = endHere t`

And this has an obvious point-free version:

`member [] = endHere`

That fits for the base case. It’s just a function:

```
member = foldr f endHere where
  f e a = ??? . a
```

Then, how to combine it. That’s easy enough, actually. It accesses the map, searches it for the key, and calls the accumulating function on it. If it’s not found in the map, just return `False`

. Here it is:

```
member :: Ord a => [a] -> Trie a -> Bool
member = foldr f endHere where
  f e a = fromMaybe False . fmap a . Map.lookup e . getTrie
```

One of the other standard functions for a Trie is returning the “completions” for a given sequence. It’s a very similar function to `member`

, actually: instead of calling `endHere`

on the final Trie found, though, just return the Trie itself. And the thing to return if any given element of the sequence isn’t found is just an empty Trie:

```
complete :: Ord a => [a] -> Trie a -> Trie a
complete = foldr f id where
  f e a = fromMaybe empty . fmap a . Map.lookup e . getTrie
```

In fact, you could abstract out the commonality here:

```
follow :: Ord a => c -> (Trie a -> c) -> [a] -> Trie a -> c
follow ifMiss onEnd = foldr f onEnd where
  f e a = fromMaybe ifMiss . fmap a . Map.lookup e . getTrie

memberAbs :: Ord a => [a] -> Trie a -> Bool
memberAbs = follow False endHere

completeAbs :: Ord a => [a] -> Trie a -> Trie a
completeAbs = follow empty id
```
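Here’s a standalone check of that abstraction, with the Trie built using the recursive `insertRec` from earlier (the definitions are repeated so the sketch compiles on its own; the example words are mine):

```
import qualified Data.Map as Map
import Data.Map (Map)
import Data.Maybe (fromMaybe)

data Trie a = Trie { endHere :: Bool
                   , getTrie  :: Map a (Trie a)
                   } deriving (Eq)

empty :: Trie a
empty = Trie False Map.empty

insertRec :: Ord a => [a] -> Trie a -> Trie a
insertRec [] (Trie _ m) = Trie True m
insertRec (x:xs) (Trie e m) =
  Trie e (Map.alter (Just . insertRec xs . fromMaybe empty) x m)

-- The shared "follow a path into the Trie" skeleton
follow :: Ord a => c -> (Trie a -> c) -> [a] -> Trie a -> c
follow ifMiss onEnd = foldr f onEnd where
  f e a = fromMaybe ifMiss . fmap a . Map.lookup e . getTrie

memberAbs :: Ord a => [a] -> Trie a -> Bool
memberAbs = follow False endHere

completeAbs :: Ord a => [a] -> Trie a -> Trie a
completeAbs = follow empty id

main :: IO ()
main = do
  let t = foldr insertRec empty ["tea", "ten", "to"]
  print (memberAbs "ten" t)                        -- True
  print (memberAbs "te" t)                         -- False: only a prefix
  print (Map.keys (getTrie (completeAbs "te" t)))  -- "an": 'a' and 'n' follow "te"
```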

`insert`

is another deal entirely. In `member`

, the fold was tunneling into a Trie, applying the accumulator function to successively deeper Tries, and returning a result based on the final Trie. `insert`

needs to do the same tunneling - but the Trie returned needs to be the *outer* Trie.

It turns out it’s not that difficult. Instead of “building up a function” that is then applied to a Trie, here a function is “sent” into the inner Tries. The cool thing here is that the function being sent hasn’t been generated yet.

Here’s some more illustration of what I mean. Start off with the normal `foldr`

:

```
insert :: Ord a => [a] -> Trie a -> Trie a
insert = foldr f (\(Trie _ m) -> Trie True m)
```

With the final function to be applied being one that just flips the `endHere`

tag to `True`

. Then `f`

: this is going to act *over* the map of the Trie that it’s called on. It’s useful to define a function just for that:

```
overMap :: Ord b
        => (Map.Map a (Trie a)
            -> Map.Map b (Trie b))
        -> Trie a
        -> Trie b
overMap f (Trie e m) = Trie e (f m)
```

Then, it will look up the next element of the sequence in the Trie, and apply the accumulating function to it (if it’s not found, it will provide an empty Trie instead). Simple!

```
insert :: Ord a => [a] -> Trie a -> Trie a
insert = foldr f (\(Trie _ m) -> Trie True m) where
  f e a = overMap (Map.alter (Just . a . fromMaybe empty) e)
```
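As a standalone check (with `fromMaybe empty` supplying the missing child, just as `insertRec` does), the fold-based `insert` builds exactly the same Trie as the recursive version. The `build` helper and the example words are mine:

```
import qualified Data.Map as Map
import Data.Map (Map)
import Data.Maybe (fromMaybe)

data Trie a = Trie { endHere :: Bool
                   , getTrie  :: Map a (Trie a)
                   } deriving (Eq)

empty :: Trie a
empty = Trie False Map.empty

overMap :: Ord b => (Map a (Trie a) -> Map b (Trie b)) -> Trie a -> Trie b
overMap f (Trie e m) = Trie e (f m)

-- Fold-based insert: sends a function down into the inner Tries
insert :: Ord a => [a] -> Trie a -> Trie a
insert = foldr f (\(Trie _ m) -> Trie True m) where
  f e a = overMap (Map.alter (Just . a . fromMaybe empty) e)

-- Recursive insert, for comparison
insertRec :: Ord a => [a] -> Trie a -> Trie a
insertRec [] (Trie _ m) = Trie True m
insertRec (x:xs) (Trie e m) =
  Trie e (Map.alter (Just . insertRec xs . fromMaybe empty) x m)

main :: IO ()
main = do
  let build ins = foldr ins empty ["tea", "ten", "to", ""]
  print (build insert == build insertRec)  -- True
```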

I think this is really cool: with just a `foldr`

, you’re burrowing into a Trie, changing it, and burrowing back out again.

Deletion is always the tricky one with a Trie. You *can* just follow a given sequence down to its tag, and flip it from on to off. But that doesn’t remove the sequence itself from the Trie. So maybe you just delete the sequence - but that doesn’t work either. How do you know that there are no other sequences stored below the one you were examining?

What you need to do is to send a function into the Trie, and have it report back as to whether or not it stores other sequences below it. So this version of `foldr`

is going to burrow into the Trie, like `member`

; maintain the outer Trie, like `insert`

; but *also* send messages back up to the outer functions. Cool!

The way to do the “message sending” is with `Maybe`

. If the function you send into the Trie to delete the end of the sequence returns `Nothing`

, then it signifies that you can delete that member. Luckily, the `alter`

function on `Data.Map`

works well with this:

```
alter :: Ord k
      => (Maybe a -> Maybe a)
      -> k
      -> Map k a
      -> Map k a
```

Its first argument is a function which is given the result of looking up its *second* argument. If the function returns `Nothing`

, that key-value pair in the map is deleted (if it was there). If it returns `Just`

something, though, that key-value pair is added. In the delete function, we can chain the accumulating function with `=<<`

. This will skip the rest of the accumulation if any part of the sequence isn’t found. The actual function we’re chaining on is `nilIfEmpty`

, which checks if a given Trie is empty, and returns `Just`

the Trie if it’s not, or `Nothing`

otherwise.
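A tiny standalone example of `alter`’s three behaviours (the sample map is mine):

```
import qualified Data.Map as Map

main :: IO ()
main = do
  let m = Map.fromList [('a', 1), ('b', 2)] :: Map.Map Char Int
  print (Map.alter (const Nothing) 'a' m)   -- deletes 'a': fromList [('b',2)]
  print (Map.alter (fmap (+ 10)) 'b' m)     -- updates 'b': fromList [('a',1),('b',12)]
  print (Map.alter (const (Just 3)) 'c' m)  -- inserts 'c': fromList [('a',1),('b',2),('c',3)]
```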

Here’s the finished version:

```
delete :: Ord a => [a] -> Trie a -> Trie a
delete = (fromMaybe empty .) . foldr f i where
  i (Trie _ m) | Map.null m = Nothing
               | otherwise  = Just (Trie False m)
  f e a = nilIfEmpty . overMap (Map.alter (a =<<) e)

null :: Trie a -> Bool
null (Trie e m) = not e && Map.null m

nilIfEmpty :: Trie a -> Maybe (Trie a)
nilIfEmpty t | null t    = Nothing
             | otherwise = Just t
```
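A standalone check that `delete` handles the awkward case: removing a sequence that has other sequences stored below it (the definitions are repeated; the example words are mine):

```
import qualified Data.Map as Map
import Data.Map (Map)
import Data.Maybe (fromMaybe)
import Prelude hiding (null)

data Trie a = Trie { endHere :: Bool
                   , getTrie  :: Map a (Trie a)
                   } deriving (Eq)

empty :: Trie a
empty = Trie False Map.empty

insertRec :: Ord a => [a] -> Trie a -> Trie a
insertRec [] (Trie _ m) = Trie True m
insertRec (x:xs) (Trie e m) =
  Trie e (Map.alter (Just . insertRec xs . fromMaybe empty) x m)

member :: Ord a => [a] -> Trie a -> Bool
member = foldr f endHere where
  f e a = fromMaybe False . fmap a . Map.lookup e . getTrie

overMap :: Ord b => (Map a (Trie a) -> Map b (Trie b)) -> Trie a -> Trie b
overMap f (Trie e m) = Trie e (f m)

null :: Trie a -> Bool
null (Trie e m) = not e && Map.null m

nilIfEmpty :: Trie a -> Maybe (Trie a)
nilIfEmpty t | null t    = Nothing
             | otherwise = Just t

delete :: Ord a => [a] -> Trie a -> Trie a
delete = (fromMaybe empty .) . foldr f i where
  i (Trie _ m) | Map.null m = Nothing
               | otherwise  = Just (Trie False m)
  f e a = nilIfEmpty . overMap (Map.alter (a =<<) e)

main :: IO ()
main = do
  let t = foldr insertRec empty ["te", "tea"]
      t' = delete "te" t
  print (member "te" t')   -- False: "te" is gone
  print (member "tea" t')  -- True: "tea", stored below it, survives
```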

So how about folding the Trie itself? Same trick: build up a function with a fold. This time, a fold over the map, not a list. And the function being built up is a cons operation. When you hit a `True`

tag, fire off an empty list to the built-up function, allowing it to evaluate:

```
foldrTrie :: ([a] -> b -> b) -> b -> Trie a -> b
foldrTrie f i (Trie a m) = Map.foldrWithKey ff s m where
  s = if a then f [] i else i
  ff k = flip (foldrTrie $ f . (k :))
```
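With that fold, listing every sequence stored in the Trie takes one line (a standalone sketch; the `toList` name and example words are mine):

```
import qualified Data.Map as Map
import Data.Map (Map)
import Data.Maybe (fromMaybe)

data Trie a = Trie { endHere :: Bool
                   , getTrie  :: Map a (Trie a)
                   }

empty :: Trie a
empty = Trie False Map.empty

insertRec :: Ord a => [a] -> Trie a -> Trie a
insertRec [] (Trie _ m) = Trie True m
insertRec (x:xs) (Trie e m) =
  Trie e (Map.alter (Just . insertRec xs . fromMaybe empty) x m)

-- Fold over every stored sequence, building up a cons per map key
foldrTrie :: ([a] -> b -> b) -> b -> Trie a -> b
foldrTrie f i (Trie a m) = Map.foldrWithKey ff s m where
  s = if a then f [] i else i
  ff k = flip (foldrTrie $ f . (k :))

toList :: Trie a -> [[a]]
toList = foldrTrie (:) []

main :: IO ()
main = print (toList (foldr insertRec empty ["tea", "ten", "to"]))
-- ["tea","ten","to"]
```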

Unfortunately, it’s not easy to make the Trie an instance of `Foldable`

. It is possible, and it’s what I’m currently trying to figure out, but it’s non-trivial.