## Folding Two Things at Once

Posted on April 17, 2016
Tags: ,

There’s a whole family of Haskell brainteasers surrounding one function: `foldr`. The general idea is to convert some function on lists which uses recursion into one that uses `foldr`. `map`, for instance:

``````map :: (a -> b) -> [a] -> [b]
map f = foldr (\e a -> f e : a) []``````

Some can get a little trickier. `dropWhile`, for instance. (See here and here for interesting articles on that one in particular.)

``````dropWhile :: (a -> Bool) -> [a] -> [a]
dropWhile p = fst . foldr f ([],[]) where
f e ~(xs,ys) = (if p e then xs else zs, zs) where zs = e : ys``````

## Zip

One function which was a little harder to convert than it first seemed was `zip`.

Here’s the first (non) solution:

``````zip :: [a] -> [b] -> [(a,b)]
zip = foldr f (const []) where
f x xs (y:ys) = (x,y) : xs ys
f _ _  [] = []``````

The problem with the above isn’t that it doesn’t work: it does. The problem is that it’s not really using `foldr`. It’s only using it on the first list: there’s still a manual uncons being performed on the second. Ideally, I would want the function to look something like this:

``````zip :: [a] -> [b] -> [(a,b)]
zip xs ys = foldr f (\_ _ -> []) xs (foldr g (const []) ys)``````

The best solution I found online only dealt with `Fold`s, not `Foldable`s. You can read it here.

## Recursive Types

Reworking the solution online for `Foldable`s, the initial intuition is to have the `foldr` on the `ys` produce a function which takes an element of the `xs`, and returns a function which takes an element of the `xs`, and so on, finally returning the created list. The problem with that approach is the types involved:

``````zip :: [a] -> [b] -> [(a,b)]
zip xs = foldr f (const []) xs . foldr g (\_ _ -> []) where
g e2 r2 e1 r1 = (e1,e2) : (r1 r2)
f e r x = x e r``````

You get the error:

`Occurs check: cannot construct the infinite type: t0 ~ a -> (t0 -> [(a, b)]) -> [(a, b)]`.

Haskell’s typechecker doesn’t allow for infinitely recursive types.

You’ll be familiar with this problem if you’ve ever tried to encode the Y-combinator, or if you’ve fiddled around with the recursion-schemes package. You might also be familiar with the solution: a `newtype`, encapsulating the recursion. In this case, the `newtype` looks very similar to the signature for `foldr`:

``````newtype RecFold a b =
RecFold { runRecFold :: a -> (RecFold a b -> b) -> b }``````

Now you can insert and remove the `RecFold` wrapper, helping the typechecker to understand the recursive types as it goes:

``````zip :: [a] -> [b] -> [(a,b)]
zip xs =
foldr f (const []) xs . RecFold . foldr g (\_ _ -> []) where
g e2 r2 e1 r1 = (e1,e2) : (r1 (RecFold r2))
f e r x = runRecFold x e r``````

As an aside, the performance characteristics of the `newtype` wrapper are totally opaque to me. There may be significant improvements by using `coerce` from Data.Coerce, but I haven’t looked into it.

## Generalised Zips

The immediate temptation from the function above is to generalise it. First to `zipWith`, obviously:

``````zipWith :: (a -> b -> c) -> [a] -> [b] -> [c]
zipWith c xs =
foldr f (const []) xs . RecFold . foldr g (\_ _ -> []) where
g e2 r2 e1 r1 = c e1 e2 : (r1 (RecFold r2))
f e r x = runRecFold x e r``````

What’s maybe a little more interesting, though, would be a `foldr` on two lists. Something which folds through both at once, using a supplied combining function:

``````foldr2 :: (Foldable f, Foldable g)
=> (a -> b -> c -> c)
-> c -> f a -> g b -> c
foldr2 c i xs =
foldr f (const i) xs . RecFold . foldr g (\_ _ -> i) where
g e2 r2 e1 r1 = c e1 e2 (r1 (RecFold r2))
f e r x = runRecFold x e r``````

Of course, once you can do two, you can do three:

``````foldr3 :: (Foldable f, Foldable g, Foldable h)
=> (a -> b -> c -> d -> d)
-> d -> f a -> g b -> h c -> d
foldr3 c i xs ys =
foldr f (const i) xs . RecFold . foldr2 g (\_ _ -> i) ys where
g e2 e3 r2 e1 r1 = c e1 e2 e3 (r1 (RecFold r2))
f e r x = runRecFold x e r``````

And so on.

There’s the added benefit that the above functions work on much more than just lists.

## Catamorphisms

Getting a little formal about the above functions, a `fold` can be described as a catamorphism. This is a name for a pattern of breaking down some recursive structure. There’s a bunch of them in the recursion-schemes package. The question is, then: can you express the above as a kind of catamorphism? Initially, using the same techniques as before, you can:

``````newtype RecF f a = RecF { unRecF :: Base f (RecF f a -> a) -> a }

zipo :: (Functor.Foldable f, Functor.Foldable g)
=> (Base f (RecF g c) -> Base g (RecF g c -> c) -> c)
-> f -> g -> c
zipo alg xs ys = cata (flip unRecF) ys (cata (RecF . alg) xs)``````

Then, coming full circle, you get a quite nice encoding of `zip`:

``````zip :: [a] -> [b] -> [(a,b)]
zip = zipo alg where
alg Nil _ = []
alg _ Nil = []
alg (Cons x xs) (Cons y ys) = (x, y) : ys xs``````

However, the `RecF` is a little ugly. In fact, it’s possible to write the above without any recursive types. (It’s possible that you could do the same with `foldr2` as well, but I haven’t figured it out yet)

``````zipo :: (Functor.Foldable f, Functor.Foldable g)
=> (Base f (g -> c) -> Base g g -> c)
-> f -> g -> c
zipo alg = cata (\x -> alg x . project)``````

And the new version of `zip` has a slightly more natural order of arguments:

``````zip :: [a] -> [b] -> [(a,b)]
zip = zipo alg where
alg Nil _ = []
alg _ Nil = []
alg (Cons x xs) (Cons y ys) = (x,y) : xs ys``````

## Zipping Into

There’s one more issue, though, that’s slightly tangential. A lot of the time, the attraction of rewriting functions using folds and catamorphisms is that the function becomes more general: it no longer is restricted to lists. For `zip`, however, there’s still a pesky list left in the signature:

``zip :: (Foldable f, Foldable g) => f a -> g b -> [(a,b)]``

It would be a little nicer to be able to zip through something preserving the structure of one of the things being zipped through. For no reason in particular, let’s assume we’ll preserve the structure of the first argument. The function will have to account for the second argument running out before the first, though. A `Maybe` can account for that:

``````zipInto :: (Foldable f, Foldable g)
=> (a -> Maybe b -> c)
-> f a -> g b -> f c``````

If the second argument runs out, `Nothing` will be passed to the combining function.

It’s clear that this isn’t a fold over the first argument, it’s a traversal. A first go at the function uses the state monad, but restricts the second argument to a list:

``````zipInto :: Traversable f => (a -> Maybe b -> c) -> f a -> [b] -> f c
zipInto c xs ys = evalState (traverse f xs) ys where
f x = do
h <- gets uncons
case h of
Just (y,t) -> do
put t
pure (c x (Just y))
Nothing -> pure (c x Nothing)``````

That code can be cleaned up a little:

``````zipInto :: Traversable f => (a -> Maybe b -> c) -> f a -> [b] -> f c
zipInto c = evalState . traverse (state . f . c) where
f x [] = (x Nothing, [])
f x (y:ys) = (x (Just y), ys)``````

But really, the uncons needs to go. Another `newtype` wrapper is needed, and here’s the end result:

``````newtype RecAccu a b =
RecAccu { runRecAccu :: a -> (RecAccu a b, b) }

zipInto :: (Traversable t, Foldable f)
=> (a -> Maybe b -> c) -> t a -> f b -> t c
zipInto f xs =
snd . flip (mapAccumL runRecAccu) xs . RecAccu . foldr h i where
i e = (RecAccu i, f e Nothing)
h e2 a e1 = (RecAccu a, f e1 (Just e2))``````