## Revisiting a Trie in Haskell

# Conforming to Foldable

When I ended the last post, I had a nice `Trie`

datatype, with plenty of functions, but I couldn’t get it to conform to the standard Haskell classes. The problem was to do with the type variables in the Trie:

Although the type variable is `a`

, the trie really contains *lists* of `a`

s. At least, that’s what’s reflected in functions like `insert`

, `member`

, etc.:

```
member :: (Foldable f, Ord a) => f a -> OldTrie a -> Bool
member = foldr f otEndHere where
f e a = maybe False a . Map.lookup e . otChildren
otInsert :: (Foldable f, Ord a) => f a -> OldTrie a -> OldTrie a
otInsert = foldr f b where
b (OldTrie _ c) = OldTrie True c
f e a (OldTrie n c) = OldTrie n (Map.alter (Just . a . fold) e c)
instance Ord a => Monoid (OldTrie a) where
mempty = OldTrie False mempty
OldTrie v c `mappend` OldTrie t d =
OldTrie (v || t) (Map.unionWith (<>) c d)
```

Realistically, the type which the trie contains is more like:

That signature strongly hints at GADTs, as was indicated by this stackoverflow answer. The particular GADT which is applicable here is this:

Why lists and not a general `Foldable`

? Well, for the particular use I had in mind (conforming to the `Foldable`

typeclass), I need `(:)`

.

```
instance Foldable TrieSet where
foldr f b (TrieSet e c) = if e then f [] r else r where
r = Map.foldrWithKey (flip . g . (:)) b c
g k = foldr (f . k)
```

With some more helper functions, the interface becomes pretty nice:

```
instance Show a => Show (TrieSet [a]) where
showsPrec d t =
showParen
(d > 10)
(showString "fromList " . shows (foldr (:) [] t))
instance Ord a => IsList (TrieSet [a]) where
type Item (TrieSet [a]) = [a]
fromList = foldr tsInsert mempty
toList = foldr (:) []
```

The trie has the side-effect of lexicographically sorting what it’s given:

# Further Generalizing

Most implementations of tries that I’ve seen are map-like data structures, rather than set-like. In other words, instead of holding a `Bool`

at the value position, it holds a `Maybe`

something.

```
data Trie a b = Trie
{ endHere :: b
, children :: Map a (Trie a b)
} deriving (Eq, Ord, Show, Functor, Foldable, Traversable)
```

This is a much more straightforward datatype. `Foldable`

can even be automatically derived.

However, I haven’t made the `endHere`

field a `Maybe a`

. I want to be able to write something like this:

And have it automatically choose the implementation of the functions I need^{1}.

To do that, though, I’ll need to write the base functions, agnostic of the type of `b`

. I *can* rely on something like `Monoid`

, though:

```
instance (Ord a, Monoid b) => Monoid (Trie a b) where
mempty = Trie mempty Map.empty
mappend (Trie v k) (Trie t l) =
Trie (v <> t) (Map.unionWith (<>) k l)
```

In fact, quite a lot of functions naturally lend themselves to this fold + monoid style:

```
lookup :: (Ord a, Monoid b, Foldable f)
=> f a -> Trie a b -> b
lookup = foldr f endHere where
f e a = foldMap a . Map.lookup e . children
insert' :: (Foldable f, Ord a, Monoid b)
=> f a -> b -> Trie a b -> Trie a b
insert' xs v = foldr f b xs where
b (Trie p c) = Trie (v <> p) c
f e a (Trie n c) =
Trie n (Map.alter (Just . a . fold) e c)
```

A monoid is needed for the values, though, and neither `Bool`

nor `∀ a. Maybe a`

conform to `Monoid`

. Looking back to the implementation of the trie-set, the `(||)`

function has been replaced by `mappend`

. There *is* a newtype wrapper in `Data.Monoid`

which has exactly this behaviour, though: `Any`

.

Using that, the type signatures specialize to:

```
type TrieSet a = Trie a Any
lookup :: (Ord a, Foldable f)
=> f a -> TrieSet a -> Any
insert :: (Ord a, Foldable f)
=> f a -> Any -> TrieSet a -> TrieSet a
```

Similarly, for `Maybe`

, there’s both `First`

and `Last`

. They have the behaviour:

I think it makes more sense for a value inserted into a map to overwrite whatever was there before. Since the newer value is on the left in the `mappend`

, then, `First`

makes most sense.

```
type TrieMap a b = Trie a (First b)
lookup :: (Ord a, Foldable f) => f a -> TrieMap a b -> First b
insert :: (Ord a, Foldable f)
=> f a -> First b -> TrieMap a b -> TrieMap a b
```

There are some other ways that you can interpret the monoid. For instance, subbing in `Sum Int`

gives you a bag-like trie:

```
type TrieBag a = Trie a (Sum Int)
lookup :: (Ord a, Foldable f) => f a -> TrieBag a -> Sum Int
insert :: (Ord a, Foldable f)
=> f a -> Sum Int -> TrieBag a -> TrieBag a
```

This is a set which can store multiple copies of each member. Turned the other way around, a map which stores many values for each key looks like this:

```
type TrieBin a b = Trie a [b]
lookup :: (Ord a, Foldable f) => f a -> TrieBin a b -> [b]
insert :: (Ord a, Foldable f)
=> f a -> [b] -> TrieBin a b -> TrieBin a b
```

This method so far isn’t really satisfying, though. Really, the `insert`

signatures should look like this:

```
insert :: (Ord a, Foldable f)
=> f a -> b -> TrieMap a b -> TrieMap a b
insert :: (Ord a, Foldable f)
=> f a -> b -> TrieBin a b -> TrieBin a b
```

Modifying insert slightly, you can get exactly that:

```
insert :: (Foldable f, Ord a, Applicative c, Monoid (c b))
=> f a -> b -> Trie a (c b) -> Trie a (c b)
insert xs v = foldr f b xs where
b (Trie p c) = Trie (pure v <> p) c
f e a (Trie n c) = Trie n (Map.alter (Just . a . fold) e c)
```

`pure`

from `Applicative`

is needed for the “embedding”.

Similarly, the “inserting” for the set-like types isn’t really right. The value argument is out of place. This should be the signature:

```
add :: (Ord a, Foldable f)
=> f a -> TrieSet a -> TrieSet a
add :: (Ord a, Foldable f)
=> f a -> TrieBin a -> TrieBin a
```

In particular, while we have an “empty” thing (0, False) for monoids, we need a “one” thing (1, True) for this function. A semiring^{2} gives this exact method:

```
class Monoid a => Semiring a where
one :: a
mul :: a -> a -> a
instance Num a => Semiring (Sum a) where
one = 1
mul = (*)
instance Semiring Any where
one = Any True
Any x `mul` Any y = Any (x && y)
```

This class is kind of like a combination of both monoid wrappers for both `Int`

and `Bool`

. You could take advantage of that:

```
class (Monoid add, Monoid mult)
=> SemiringIso a add mult | a -> add, a -> mult where
toAdd :: a -> add
fromAdd :: add -> a
toMult :: a -> mult
fromMult :: mult -> a
(<+>), (<.>) :: SemiringIso a add mult => a -> a -> a
x <+> y = fromAdd (toAdd x <> toAdd y)
x <.> y = fromMult (toMult x <> toMult y)
instance SemiringIso Int (Sum Int) (Product Int) where
toAdd = Sum
fromAdd = getSum
toMult = Product
fromMult = getProduct
instance SemiringIso Bool Any All where
toAdd = Any
fromAdd = getAny
toMult = All
fromMult = getAll
```

But it seems like overkill.

Anyway, assuming that we have the functions from `Semiring`

, here’s the `add`

function:

```
add :: (Foldable f, Ord a, Semiring b)
=> f a -> Trie a b -> Trie a b
add xs = foldr f b xs where
b (Trie p c) = Trie (one <> p) c
f e a (Trie n c) =
Trie n (Map.alter (Just . a . fold) e c)
```

Now, expressions can be built up without specifying the specific monoid implementation, and the whole behaviour can be changed with a type signature:

Slightly fuller implementations of all of these are available here.

Kind of like program inference in lieu of type inference↩

This isn’t really a very good definition of semiring. While Haskell doesn’t have this class in base, Purescript has it in their prelude.↩