## A Trie in Haskell

## Basic Ops

A Trie is one of those data structures that I find myself writing very early on in almost every language I try to learn. It’s elegant and interesting, and easy enough to implement.

I usually write a version that is a set-like data structure, rather than a mapping type, for simplicity’s sake. It stores sequences, in a prefix-tree structure. It has a map (dictionary) where the keys are the first element of every sequence it stores, and the values are the Tries which store the rest of the sequence. It also has a boolean tag, representing whether or not the current Trie is a Trie on which a sequence ends. Here’s what the type looks like in Haskell:

```
data Trie a = Trie { endHere :: Bool
getTrie :: Map a (Trie a)
,deriving (Eq) }
```

Now, inserting into the Trie is easy. You just `uncons`

on a list, and insert the head
into the map, with the value being the tail inserted into whatever
existed at that key before:

```
empty :: Trie a
= Trie False Map.empty
empty
insertRec :: Ord a => [a] -> Trie a -> Trie a
Trie _ m) = Trie True m
insertRec [] (:xs) (Trie e m) =
insertRec (xTrie e (Map.alter (Just . insertRec xs . fromMaybe empty) x m)
```

Searching is simple, also. For the empty list, you just check if the
Trie has its `endHere`

tag set to
`True`

,
otherwise, you uncons, search the map, and query the Trie with the tail
if it eas found, or just return `False`

if it
was not:

```
memberRec :: Ord a => [a] -> Trie a -> Bool
Trie e _) = e
memberRec [] (:xs) (Trie _ m) =
memberRec (xFalse (memberRec xs <$> Map.lookup x m) fromMaybe
```

Here’s my problem. *Both* of those functions have the same
pattern:

```
= ...
f [] :xs) = ... f (x
```

Any good Haskeller should be *begging* for a fold at this
stage. But it proved a little trickier than I’d imagined. Take `member`

, for instance. You want to fold
over a list, with the base case being the tag on the Trie:

```
member :: Ord a => [a] -> Trie a -> Bool
= foldr f base where
member = ???
base = Map.lookup e ??? f e a
```

Where do you get the base case from, though? You have to specify it
from the beginning, but the variable you’re looking for is nested deeply
into the Trie. How can you look into the Trie, without traversing the
list, to find the tag, *at the beginning of the function?*

That had been my issue for a while. Every time I cam back to writing
a Trie, I would see the pattern, try and write `insert`

and `member`

with a fold, and remember again
the trouble I had had with it in the past. Recently, though, I saw a
different problem, that gave me an idea for a solution.

## The Highest Order

Rewrite

`dropWhile`

using`foldr`

It’s a (semi) well-known puzzle, that’s maybe a little more difficult than it seems at first. Here, for instance, was my first attempt at it:

```
dropWhileWrong :: (a -> Bool) -> [a] -> [a]
= foldr f [] where
dropWhileWrong p | p e = a
f e a | otherwise = e:a
```

Yeah. That’s `filter`

, not
`dropWhile`

:

```
<5) [1, 3, 6, 3, 1]
dropWhileWrong (6] [
```

Here was my final solution:

```
dropWhileCount :: (a -> Bool) -> [a] -> [a]
= drop (foldr f 0 l) l where
dropWhileCount p l | p e = a + 1
f e a | otherwise = 0
```

After the problem I found this
issue of The Monad Reader, which talks about the same problem. In my
`drop`

version, I had been counting the number of items to drop as I went,
adding one for every element that passed the test. The corresponding
version in the article had been building up `tail`

functions, using `.`

to add them
together:

```
dropWhileTail :: (a -> Bool) -> [a] -> [a]
= (foldr f id l) l where
dropWhileTail p l | p e = tail . a
f e a | otherwise = id
```

A quick visit to pointfree.io can generate some monadic pointsfree magic:

```
dropWhilePf :: (a -> Bool) -> [a] -> [a]
= join (foldr f id) where
dropWhilePf p | p e = tail . a
f e a | otherwise = id
```

Now, the final version in the article did *not* use this
technique, as it was very inefficient. It used some cleverness beyond
the scope of this post. The second-from-last version I quite liked,
though:

```
dropWhileFp :: (a -> Bool) -> [a] -> [a]
= foldr f l l where
dropWhileFp p l | p e = tail a
f e a | otherwise = l
```

However, the idea of building up a function in a fold gave me an idea for adapting it to some of the Trie functions.

## Folding Inwards

Let’s start with `member`

. It
needs to fold over a list, and generate a function which acts on a
Trie:

```
member :: Ord a => [a] -> Trie a -> Bool
= foldr f base member
```

The `base`

is the function
being built up: the final part of the function chain. Each part of the
function is generated based on each element of the list, and then
chained with the base using `.`

:

```
= foldr f base where
member = ??? . a f e a
```

The base here is what’s called when the list is empty. Here’s what it looked like in the explicit recursion version:

`Trie e _) = e member [] (`

We could simplify this by using record syntax, and `getTrie`

:

`= getTrie t member [] t `

And this has an obvious pointsfree version:

`= getTrie member [] `

That fits for the base case. It’s just a function:

```
= foldr f endHere where
member = ??? . a f e a
```

Then, how to combine it. That’s easy enough, actually. It accesses
the map, searches it for the key, and calls the accumulating function on
it. If it’s not found in the map, just return `False`

. Here it
is:

```
member :: Ord a => [a] -> Trie a -> Bool
= foldr f endHere where
member = fromMaybe False . fmap a . Map.lookup e . getTrie f e a
```

One of the other standard functions for a Trie is returning the
“completions” for a given sequence. It’s a very similar function to
`member`

, actually: instead of
calling `endHere`

on the final
Trie found, though, just return the Trie itself. And the thing to return
if any given element of the sequence isn’t found is just an empty
Trie:

```
complete :: Ord a => [a] -> Trie a -> Trie a
= foldr f id where
complete = fromMaybe empty . fmap a . Map.lookup e . getTrie f e a
```

In fact, you could abstract out the commonality here:

```
follow :: Ord a => c -> (Trie a -> c) -> [a] -> Trie a -> c
= foldr f onEnd where
follow ifMiss onEnd = fromMaybe ifMiss . fmap a . Map.lookup e . getTrie
f e a
memberAbs :: Ord a => [a] -> Trie a -> Bool
= follow False endHere
memberAbs
completeAbs :: Ord a => [a] -> Trie a -> Trie a
= follow empty id completeAbs
```

## Folding in and out

`insert`

is another deal
entirely. In `member`

, the fold
was tunneling into a Trie, applying the accumulator function to
successively deeper Tries, and returning a result based on the final
Trie. `insert`

needs to do the
same tunneling - but the Trie returned needs to be the *outer*
Trie.

It turns out it’s not that difficult. Instead of “building up a function” that is then applied to a Trie, here a function is “sent” into the inner Tries. The cool thing here is that the function being sent hasn’t been generated yet.

Here’s some more illustration of what I mean. Start off with the
normal `foldr`

:

```
insert :: Ord a => [a] -> Trie a -> Trie a
= foldr f (\(Trie _ m) -> Trie True m) insert
```

With the final function to be applied being one that just flips the
`endHere`

tag to `True`

. Then
`f`

: this is going to act
*over* the map of the Trie that it’s called on. It’s useful to
define a function just for that:

```
overMap :: Ord b
=> (Map.Map a (Trie a)
-> Map.Map b (Trie b))
-> Trie a
-> Trie b
Trie e m) = Trie e (f m) overMap f (
```

Then, it will look up the next element of the sequence in the Trie, and apply the accumulating function to it. (if it’s not found it will provide an empty Trie instead) Simple!

```
insert :: Ord a => [a] -> Trie a -> Trie a
= foldr f (\(Trie _ m) -> Trie True m) where
insert =
f e a Just . a . fold) e) overMap (Map.alter (
```

I think this is really cool: with just a `foldr`

, you’re
burrowing into a Trie, changing it, and burrowing back out again.

## Removal

This is always the tricky one with a Trie. You *can* just
follow a given sequence down to its tag, and flip it from on to off. But
that doesn’t remove the sequence itself from the Trie. So maybe you just
delete the sequence - but that doesn’t work either. How do you know that
there are no other sequences stored below the one you were
examining?

What you need to do is to send a function into the Trie, and have it
report back as to whether or not it stores other sequences below it. So
this version of `foldr`

is going
to burrow into the Trie, like `member`

; maintain the outer Trie, like
`insert`

; but *also* send
messages back up to the outer functions. Cool!

The way to do the “message sending” is with `Maybe`

. If the
function you send into the Trie to delete the end of the sequence
returns `Nothing`

, then
it signifies that you can delete that member. Luckily, the `alter`

function on `Data.Map`

works
well with this:

```
alter :: Ord k
=> (Maybe a -> Maybe a)
-> k
-> Map k a
-> Map k a
```

Its first argument is a function which is given the result of looking
up its *second* argument. If the function returns `Nothing`

, that
key-value pair in the map is deleted (if it was there). If it returns
`Just`

something, though, that key-value pair is added. In the delete function,
we can chain the accumulating function with `=<<`

.
This will skip the rest of the accumulation if any part of the sequence
isn’t found. The actual function we’re chaining on is `nilIfEmpty`

, which checks if a given
Trie is empty, and returns `Just`

the Trie
if it’s not, or `Nothing`

otherwise.

Here’s the finished version:

```
delete :: Ord a => [a] -> Trie a -> Trie a
= (fromMaybe empty .) . foldr f i where
delete Trie _ m) | Map.null m = Nothing
i (| otherwise = Just (Trie False m)
= nilIfEmpty . overMap (Map.alter (a =<<) e)
f e a
null :: Trie a -> Bool
null (Trie e m) = (not e) && (Map.null m)
nilIfEmpty :: Trie a -> Maybe (Trie a)
| null t = Nothing
nilIfEmpty t | otherwise = Just t
```

## Folding the Foldable

So how about folding the Trie itself? Same trick: build up a function
with a fold. This time, a fold over the map, not a list. And the
function being built up is a cons operation. When you hit a `True`

tag, fire
off an empty list to the built-up function, allowing it to evaluate:

```
foldrTrie :: ([a] -> b -> b) -> b -> Trie a -> b
Trie a m) = Map.foldrWithKey ff s m where
foldrTrie f i (= if a then f [] i else i
s = flip (foldrTrie $ f . (k :)) ff k
```

Unfortunately, it’s
not easy to make the Trie *conform* to `Foldable`

. It
is possible, and it’s what I’m currently trying to figure out, but it’s
non-trivial.