## Single-Pass Huffman Coding

Posted on February 17, 2018

While working on something else, I figured out a nice Haskell implementation of Huffman coding, and I thought I’d share it here. I’ll go through a few techniques for transforming a multi-pass algorithm into a single-pass one first, and then I’ll show how to use them for Huffman. If you just want to skip to the code, it’s provided at the end.1

The algorithm isn’t single-pass in the sense of Adaptive Huffman Coding: it still uses the normal Huffman algorithm, but the input is transformed in the same traversal that builds the tree to transform it.

## Circular Programming

There are several techniques for turning multi-pass algorithms into single-pass ones in functional languages. Perhaps the most famous is circular programming: using laziness to eliminate a pass. Bird (1984) used this to great effect in solving the repmin problem:

Given a tree of integers, replace every integer with the minimum integer in the tree, in one pass.

For an imperative programmer, the problem is relatively easy: first, write the code to find the minimum value in the tree in the standard way, using a loop and a “smallest so far” accumulator. Then, inside the loop, after updating the accumulator, set the value of the leaf to be a reference to the accumulator.

At first, that solution may seem necessarily impure: we’re using global, mutable state to update many things at once. However, as the paper shows, we can claw back purity using laziness:

data Tree a = Leaf a | Tree a :*: Tree a

repMin :: Tree Integer -> Tree Integer
repMin xs = ys where
(m, ys) = go xs
go (Leaf x) = (x, Leaf m)
go (xs :*: ys) = (min x y, xs' :*: ys')
where
(x,xs') = go xs
(y,ys') = go ys

## There and Back Again

Let’s say we don’t have laziness at our disposal: are we hosed? No!2 Danvy and Goldberg (2005) explore this very issue, by posing the question:

Given two lists, xs and ys, can you zip xs with the reverse of ys in one pass?

The technique used to solve the problem is named “There and Back Again”; it should be clear why from one of the solutions:

convolve xs ys = walk xs const where
walk [] k = k [] ys
walk (x:xs) k = walk xs (\r (y:ys) -> k ((x,y) : r) ys)

The traversal of one list builds up the function to consume the other. We could write repmin in the same way:

repMin = uncurry (\$) . go where
go (Leaf x) = (Leaf, x)
go (xs :*: ys) = (\m -> xs' m :*: ys' m, min xm ym) where
(xs',xm) = go xs
(ys',ym) = go ys

## Cayley Representations

If you’re doing a lot of appending to some list-like structure, you probably don’t want to use actual lists: you’ll end up traversing the left-hand-side of the append many more times than necessary. A type you can drop in to use instead is difference lists (Hughes 1986):

type DList a = [a] -> [a]

rep :: [a] -> DList a
rep = (++)

abs :: DList a -> [a]
abs xs = xs []

append :: DList a -> DList a -> DList a
append = (.)

append is $\mathcal{O}(1)$ in this representation. In fact, for any monoid with a slow mappend, you can use the same trick: it’s called the Cayley representation, and available as Endo in Data.Monoid.

rep :: Monoid a => a -> Endo a
rep x = Endo (mappend x)

abs :: Monoid a => Endo a -> a
abs (Endo f) = f mempty

instance Monoid (Endo a) where
mempty = Endo id
mappend (Endo f) (Endo g) = Enfo (f . g)

You can actually do the same transformation for “monoids” in the categorical sense: applying it to monads, for instance, will give you codensity (Rivas and Jaskelioff 2014).

## Traversable

Looking back—just for a second—to the repmin example, we should be able to spot a pattern we can generalize. There’s really nothing tree-specific about it, so why can’t we apply it to lists? Or other structures, for that matter? It turns out we can: the mapAccumL function is tailor-made to this need:

repMin :: Traversable t => t Integer -> t Integer
repMin xs = ys where
(~(Just m), ys) = mapAccumL f Nothing xs
f Nothing x = (Just x, m)
f (Just y) x = (Just (min x y), m)

The tilde before the Just ensures this won’t fail on empty input.

# Huffman Coding

Finally, it’s time for the main event. Huffman coding is a very multi-pass algorithm, usually. The steps look like this:

1. Build a frequency table for each character in the input.
2. Build a priority queue from that frequency table.
3. Iteratively pop elements and combine them (into Huffman trees) from the queue until there’s only one left.
4. That Huffman tree can be used to construct the mapping from items back to their Huffman codes.
5. Traverse the input again, using the constructed mapping to replace elements with their codes.

We can’t skip any of these steps: we can try perform them all at once, though.

Let’s write the multi-pass version first. We’ll need the frequency table:

frequencies :: Ord a => [a] -> Map a Int
frequencies = Map.fromListWith (+) . map (flip (,) 1)

And a heap, ordered on the frequencies of its elements (I’m using a skew heap here):

data Heap a
= Nil
| Node {-# UNPACK #-} !Int a (Heap a) (Heap a)

instance Monoid (Heap a) where
mappend Nil ys = ys
mappend xs Nil = xs
mappend h1@(Node i x lx rx) h2@(Node j y ly ry)
| i <= j    = Node i x (mappend h2 rx) lx
| otherwise = Node j y (mappend h1 ry) ly
mempty = Nil

Next, we need to build the tree3. We can use the tree type from above.

buildTree :: Map a Int -> Maybe (Tree a)
buildTree = prune . toHeap where
toHeap = Map.foldMapWithKey (\k v -> Node v (Leaf k) Nil Nil)
prune Nil = Nothing
prune (Node i x l r) = case mappend l r of
Nil -> Just x
Node j y l' r' ->
prune (mappend (Node (i+j) (x :*: y) Nil Nil) (mappend l' r'))

Then, a way to convert between the tree and a map:

toMapping :: Ord a => Tree a -> Map a [Bool]
toMapping (Leaf x) = Map.singleton x []
toMapping (xs :*: ys) =
Map.union (fmap (True:) (toMapping xs)) (fmap (False:) (toMapping ys))

And finally, putting the whole thing together:

huffman :: Ord a => [a] -> (Maybe (Tree a), [[Bool]])
huffman xs = (tree, map (mapb Map.!) xs) where
freq = frequencies xs
tree = buildTree freq
mapb = maybe Map.empty toMapping tree

## Removing the passes

The first thing to fix is the toMapping function: at every level, it calls union, a complex and expensive operation. However, union and empty form a monoid, so we can use the Cayley representation to reduce the calls to a minimum. Next, we want to get rid of the fmaps: we can do that by assembling a function to perform the fmap as we go, as in convolve4.

toMapping :: Ord a => Tree a -> Map a [Bool]
toMapping tree = go tree id Map.empty where
go (Leaf x) k = Map.insert x (k [])
go (xs :*: ys) k =
go xs (k . (:) True) . go ys (k . (:) False)

Secondly, we can integrate the toMapping function with the buildTree function, removing another pass:

buildTree :: Ord a => Map a Int -> Maybe (Tree a, Map a [Bool])
buildTree = prune . toHeap where
toHeap = Map.foldMapWithKey (\k v -> Node v (Leaf k, leaf k) Nil Nil)
prune Nil = Nothing
prune (Node i x l r) = case mappend l r of
Nil -> Just (fmap (\k -> k id Map.empty) x)
Node j y l' r' ->
prune (mappend (Node (i+j) (cmb x y) Nil Nil) (mappend l' r'))
leaf x k = Map.insert x (k [])
node xs ys k = xs (k . (:) True) . ys (k . (:) False)
cmb (xt,xm) (yt,ym) = (xt :*: yt, node xm ym)

Finally, to remove the second pass over the list, we can copy repmin, using mapAccumL to both construct the mapping and apply it to the structure in one go.

huffman :: (Ord a, Traversable t) => t a -> (Maybe (Tree a), t [Bool])
huffman xs = (fmap fst tree, ys) where
(freq,ys) = mapAccumL f Map.empty xs
f fm x = (Map.insertWith (+) x 1 fm, mapb Map.! x)
tree = buildTree freq
mapb = maybe Map.empty snd tree

And that’s it!

# Generalization

The similarity between the repmin function and the solution above is suggestive: is there a way to encode this idea of making a multi-pass algorithm single-pass? Of course! We can use an applicative:

data Circular a b c =
Circular !a
(b -> c)

instance Functor (Circular a b) where
fmap f (Circular tally run) = Circular tally (f . run)

instance Monoid a =>
Applicative (Circular a b) where
pure x = Circular mempty (const x)
Circular fl fr <*> Circular xl xr =
Circular
(mappend fl xl)
(\r -> fr r (xr r))

liftHuffman
:: Ord a
=> a -> Circular (Map a Int) (Map a [Bool]) [Bool]
liftHuffman x = Circular (Map.singleton x 1) (Map.! x)

runHuffman
:: Ord a
=> Circular (Map a Int) (Map a [Bool]) r -> (Maybe (Tree a), r)
runHuffman (Circular smry run) =
maybe (Nothing, run Map.empty) (Just *** run) (buildTree smry)

huffman
:: (Ord a, Traversable t)
=> t a -> (Maybe (Tree a), t [Bool])
huffman = runHuffman . traverse liftHuffman

Thanks to it being an applicative, you can do all the fun lensy things with it:

showBin :: [Bool] -> String
showBin = map (bool '0' '1')

>>> let liftBin = fmap showBin . liftHuffman
>>> (snd . runHuffman . (each.traverse) liftBin) ("abb", "cad", "c")
(["01","11","11"],["00","01","10"],["00"])

Bringing us back to the start, it can also let us solve repmin!

liftRepMin :: a -> Circular (Option (Min a)) a a
liftRepMin x = Circular (pure (pure x)) id

runRepMin :: Circular (Option (Min a)) a b -> b
runRepMin (Circular m r) = r (case m of
Option (Just (Min x)) -> x)

repMin :: (Ord a, Traversable t) => t a -> t a
repMin = runRepMin . traverse liftRepMin

# Related

So the Circular type is actually just the product of reader and writer, and is closely related to the sort type.

It’s also related to the Prescient type, which I noticed after I’d written the above.

# References

Bird, Richard, Geraint Jones, and Oege De Moor. 1997. “More haste‚ less speed: Lazy versus eager evaluation.” Journal of Functional Programming 7 (5) (September): 541–547. doi:10.1017/S0956796897002827. https://ora.ox.ac.uk/objects/uuid:761a4646-60a2-4622-a1e0-ddea11507d57/datastreams/ATTACHMENT01.

Bird, R. S. 1984. “Using Circular Programs to Eliminate Multiple Traversals of Data.” Acta Inf. 21 (3) (October): 239–250. doi:10.1007/BF00264249. http://dx.doi.org/10.1007/BF00264249.

Danvy, Olivier, and Mayer Goldberg. 2005. “There and Back Again.” http://brics.dk/RS/05/3/BRICS-RS-05-3.pdf.

Hughes, R. John Muir. 1986. “A Novel Representation of Lists and Its Application to the Function "Reverse".” Information Processing Letters 22 (3) (March): 141–144. doi:10.1016/0020-0190(86)90059-1. http://www.sciencedirect.com/science/article/pii/0020019086900591.

Pippenger, Nicholas. 1997. “Pure Versus Impure Lisp.” ACM Trans. Program. Lang. Syst. 19 (2) (March): 223–238. doi:10.1145/244795.244798. http://doi.acm.org/10.1145/244795.244798.

Rivas, Exequiel, and Mauro Jaskelioff. 2014. “Notions of Computation as Monoids.” arXiv:1406.4823 [cs, math] (May). http://arxiv.org/abs/1406.4823.

1. Huffman coding single-pass implementation:

import           Data.Map.Strict  (Map)
import qualified Data.Map.Strict  as Map
import           Data.Traversable (mapAccumL)

data Heap a
= Nil
| Node {-# UNPACK #-} !Int a (Heap a) (Heap a)

instance Monoid (Heap a) where
mappend Nil ys = ys
mappend xs Nil = xs
mappend h1@(Node i x lx rx) h2@(Node j y ly ry)
| i <= j    = Node i x (mappend h2 rx) lx
| otherwise = Node j y (mappend h1 ry) ly
mempty = Nil

data Tree a = Leaf a | Tree a :*: Tree a

buildTree :: Ord a => Map a Int -> Maybe (Tree a, Map a [Bool])
buildTree = prune . toHeap where
toHeap = Map.foldMapWithKey (\k v -> Node v (Leaf k, leaf k) Nil Nil)
prune Nil = Nothing
prune (Node i x l r) = case mappend l r of
Nil -> Just (fmap (\k -> k id Map.empty) x)
Node j y l' r' ->
prune (mappend (Node (i+j) (cmb x y) Nil Nil) (mappend l' r'))
leaf x k = Map.insert x (k [])
node xs ys k = xs (k . (:) True) . ys (k . (:) False)
cmb (xt,xm) (yt,ym) = (xt :*: yt, node xm ym)

huffman :: (Ord a, Traversable t) => t a -> (Maybe (Tree a), t [Bool])
huffman xs = (fmap fst tree, ys) where
(freq,ys) = mapAccumL f Map.empty xs
f fm x = (Map.insertWith (+) x 1 fm, mapb Map.! x)
tree = buildTree freq
mapb = maybe Map.empty snd tree
2. Well, that’s a little bit of a lie. In terms of asympostics, Pippenger (1997) stated a problem that could be solved in linear time in impure Lisp, but $\Omega(n \log n)$ in pure Lisp. Bird, Jones, and Moor (1997) then produced an algorithm that could solve the problem in linear time, by using laziness. So, in some cases, laziness will give you asymptotics you can’t get without it (if you want to stay pure).

3. There’s actually a nicer version of the buildTree function which uses StateT (Heap a) Maybe, but it’s equivalent to this one under the hood, and I though might be a little distracting.

4. Something to notice about this function is that it’s going top-down and bottom-up at the same time. Combining the maps (with (.)) is done bottom-up, but building the codes is top-down. This means the codes are built in reverse order! That’s why the accumulating parameter (k) is a difference list, rather than a normal list. As it happens, if normal lists were used, the function would be slightly more efficient through sharing, but the codes would all be reversed.