## Some Tricks for List Manipulation

This post is a collection of some of the tricks I’ve learned for manipulating lists in Haskell. Each one starts with a puzzle: you should try the puzzle yourself before seeing the solution!

# The Tortoise and the Hare

How can you split a list in half, in one pass, without taking its length?

This first one is a relatively well-known trick, but it occasionally comes in handy, so I thought I’d mention it. The naive way is as follows:

But it’s unsatisfying: we have to traverse the list twice, and we’re taking its length (which is almost always a bad idea). Instead, we use the following function:

```
splitHalf :: [a] -> ([a],[a])
splitHalf xs = go xs xs
where
go (y:ys) (_:_:zs) = first (y:) (go ys zs)
go ys _ = ([],ys)
```

The “tortoise and the hare” is the two arguments to `go`

: it traverses the second one twice as fast, so when it hits the end, we know that the first list must be halfway done.

# There and Back Again

Given two lists,

`xs`

and`ys`

, write a function which zips`xs`

with thereverseof`ys`

(in one pass).

There’s a lovely paper (Danvy and Goldberg 2005) which goes though a number of tricks for how to do certain list manipulations “in reverse”. Their technique is known as “there and back again”. However, I’d like to describe a different way to get to the same technique, using folds.

Whenever I need to do some list manipulation in reverse (i.e., I need the input list to be reversed), I first see if I can rewrite the function as a fold, and then just switch out `foldr`

for `foldl`

.

For our puzzle here, we need to first write `zip`

as a fold:

```
zip :: [a] -> [b] -> [(a,b)]
zip = foldr f b
where
f x k (y:ys) = (x,y) : k ys
f x k [] = []
b _ = []
```

If that looks complex, or difficult to write, don’t worry! There’s a systematic way to get to the above definition from the normal version of `zip`

. First, let’s start with a normal `zip`

:

Then, we need to turn it into a case-tree, where the first branch is on the list we want to fold over. In other words, we want the function to look like this:

To figure out the cases, we factor out the cases in the original function. Since the second clause (`zip xs [] = []`

) is only reachable when `xs /= []`

, it’s effectively a case for the `x:xs`

branch.

```
zip :: [a] -> [b] -> [(a,b)]
zip xs = case xs of
[] -> \_ -> []
x:xs -> \case
[] -> []
y:ys -> (x,y) : zip xs ys
```

Now, we rewrite the different cases to be auxiliary functions:

```
zip :: [a] -> [b] -> [(a,b)]
zip xs = case xs of
[] -> b
x:xs -> f x xs
where
b = \_ -> []
f = \x xs -> \case
[] -> []
y:ys -> (x,y) : zip xs ys
```

And finally, we *refactor* the recursive call to the first case expression.

```
zip :: [a] -> [b] -> [(a,b)]
zip xs = case xs of
[] -> b
x:xs -> f x (zip xs)
where
b = \_ -> []
f = \x xs -> \case
[] -> []
y:ys -> (x,y) : xs ys
```

Then those two auxiliary functions are what you pass to `foldr`

!

So, to reverse it, we simply take wherever we wrote `foldr f b`

, and replace it with `foldl (flip f) b`

:

```
zipRev :: [a] -> [b] -> [(a,b)]
zipRev = foldl (flip f) b
where
f x k (y:ys) = (x,y) : k ys
f x k [] = []
b _ = []
```

Of course, we’re reversing the wrong list here. Fixing that is simple:

```
zipRev :: [a] -> [b] -> [(a,b)]
zipRev = flip (foldl (flip f) b)
where
f y k (x:xs) = (x,y) : k xs
f y k [] = []
b _ = []
```

# Maintaining Laziness

Rewrite the above function without using continuations.

`zipRev`

, as written above, actually uses *continuation-passing style*. In most languages (including standard ML, which was the one used in Danvy and Goldberg (2005)), this is pretty much equivalent to a direct-style implementation (modulo some performance weirdness). In a lazy language like Haskell, though, continuation-passing style often makes things unnecessarily strict.

Consider the church-encoded pairs:

```
newtype Pair a b
= Pair
{ runPair :: forall c. (a -> b -> c) -> c
}
firstC :: (a -> a') -> Pair a b -> Pair a' b
firstC f p = Pair (\k -> runPair p (k . f))
firstD :: (a -> a') -> (a, b) -> (a', b)
firstD f ~(x,y) = (f x, y)
fstD :: (a, b) -> a
fstD ~(x,y) = x
fstC :: Pair a b -> a
fstC p = runPair p const
>>> fstC (firstC (const ()) undefined)
undefined
>>> fstD (firstD (const ()) undefined)
()
```

So it’s sometimes worth trying to avoid continuations if there is a fast direct-style solution. (alternatively, continuations can give you extra strictness when you *do* want it)

First, I’m going to write a different version of `zipRev`

, which folds on the first list, not the second.

Then, we inline the definition of `foldl`

:

Then, as a hint, we tuple up the two accumulating parameters:

What we can see here is that we have two continuations stacked on top of each other. When this happens, they can often “cancel out”, like so:

And we have our direct-style implementation!

Note 14/05/2019: the “cancel-out” explanation there is a little handwavy, as I’m sure you’ll notice. However, there are a number of excellent explanations on this stackoverflow thread which explain it much better than I ever could. Thanks to Anders Kaseorg, Will Ness, user11228628, and to Joseph Sible (2019) for asking the question.

# Manual Fusion

Detect that a list is a palindrome, in one pass.

We now know a good way to split a list in two, and a good way to zip a list with its reverse. We can *combine* the two to get a program that checks if a list is a palindrome. Here’s a first attempt:

But this is doing *three* passes!

To get around it, we can manually do some fusion. Fusion is a technique where we can spot scenarios like the following:

And translate them into a version without a list:

The trick is making sure that the consumer is written as a fold, and then we just put its `f`

and `b`

in place of the `:`

and `[]`

in the producer.

So, when we inline the definition of `splitHalf`

into `zipRev`

, we get the following:

```
zipRevHalf :: [a] -> [(a,a)]
zipRevHalf xs = snd (go xs xs)
where
go (y:ys) (_:_:zs) = f y (go ys zs)
go (_:ys) [_] = (ys,[])
go ys [] = (ys,[])
f x (y:ys,r) = (ys,(x,y):r)
isPal xs = all (uncurry (==)) (zipRevHalf xs)
```

(adding a special case for odd-length lists)

Finally, the `all (uncurry (==))`

is implemented as a fold also. So we can fuse it with the rest of the definitions:

```
isPal :: Eq a => [a] -> Bool
isPal xs = snd (go xs xs)
where
go (y:ys) (_:_:zs) = f y (go ys zs)
go (_:ys) [_] = (ys,True)
go ys [] = (ys,True)
f x (y:ys,r) = (ys,(x == y) && r)
```

You may have spotted the writer monad over `All`

there. Indeed, we can rewrite it to use the monadic bind:

```
isPal :: Eq a => [a] -> Bool
isPal xs = getAll (fst (go xs xs)) where
go (y:ys) (_:_:zs) = f y =<< go ys zs
go (_:ys) [_] = pure ys
go ys [] = pure ys
f y (z:zs) = (All (y == z), zs)
```

# Eliminating Multiple Passes with Laziness

Construct a Braun tree from a list in linear time.

This is also a very well-known trick (Bird 1984), but today I’m going to use it to write a function for constructing Braun trees.

A Braun tree is a peculiar structure. It’s a binary tree, where adjacent branches can differ in size by only 1. When used as an array, it has $\mathcal{O}(\log n)$ lookup times. It’s enumerated like so:

```
┌─7
┌3┤
│ └11
┌1┤
│ │ ┌─9
│ └5┤
│ └13
0┤
│ ┌─8
│ ┌4┤
│ │ └12
└2┤
│ ┌10
└6┤
└14
```

The objective is to construct a tree from a list in linear time, in the order defined above. Okasaki (1997) observed that, from the list:

Each level in the tree is constructed from chucks of powers of two. In other words:

From this, we can write the following function:

```
rows k [] = []
rows k xs = (k , take k xs) : rows (2*k) (drop k xs)
build (k,xs) ts = zipWith3 Node xs ts1 ts2
where
(ts1,ts2) = splitAt k (ts ++ repeat Leaf)
fromList = head . foldr build [Leaf] . rows 1
```

The first place we’ll look to eliminate a pass is the `build`

function. It combines two rows by splitting the second in half, and zipping it with the first.

We don’t need to store the length of the first list, though, as we are only using it to split the second, and we can do *that* at the same time as the zipping.

```
zipUntil :: (a -> b -> c) -> [a] -> [b] -> ([c],[b])
zipUntil _ [] ys = ([],ys)
zipUntil f (x:xs) (y:ys) = first (f x y:) (zipUntil f xs ys)
>>> zipUntil (,) [1,2] "abc"
([(1,'a'),(2,'b')],"c")
```

Using this function in `build`

looks like the following:

That top-level `zipWith`

is *also* unnecessary, though. If we make the program circular, we can produce `ts2`

as we consume it, making the whole thing single-pass.

```
build xs ts = ys
where
(ys,ts2) = zip3Node xs (ts ++ repeat Leaf) ts2
zip3Node (x:xs) (y:ys) ~(z:zs) = first (Node x y z:) (zip3Node xs ys zs)
zip3Node [] ys _ = ([], ys)
```

That `zip3Node`

is a good candidate for rewriting as a fold, also, making the whole thing look like this:

```
rows k [] = []
rows k xs = take k xs : rows (2*k) (drop k xs)
build xs ts = ys
where
(ys,zs) = foldr f b xs ts zs
f x xs (y:ys) ~(z:zs) = first (Node x y z:) (xs ys zs)
b ys _ = ([],ys)
fromList = head . foldr build (repeat Leaf) . rows 1
```

To fuse all of those definitions, we first will need to rewrite `rows`

as a fold:

```
rows xs = uncurry (:) (foldr f b xs 1 2)
where
b _ _ = ([],[])
f x k 0 j = ([], uncurry (:) (f x k j (j*2)))
f x k i j = first (x:) (k (i-1) j)
```

Once we have everything as a fold, the rest of the transformation is pretty mechanical. At the end of it all, we get the following linear-time function for constructing a Braun tree from a list:

```
fromList :: [a] -> Tree a
fromList xs = head (l (foldr f b xs 1 2))
where
b _ _ ys zs = (repeat Leaf, (repeat Leaf, ys))
l k = let (xs, ys) = uncurry k ys in xs
f x k 0 j ys zs = ([], (l (f x k j (j*2)), ys))
f x k i j ~(y:ys) ~(z:zs) = first (Node x y z:) (k (i-1) j ys zs)
```

# References

Bird, R. S. 1984. “Using Circular Programs to Eliminate Multiple Traversals of Data.” *Acta Inf.* 21 (3) (October): 239–250. doi:10.1007/BF00264249. http://dx.doi.org/10.1007/BF00264249.

Danvy, Olivier, and Mayer Goldberg. 2005. “There and Back Again.” *BRICS Report Series* 12 (3). doi:10.7146/brics.v12i3.21869. https://tidsskrift.dk/brics/article/view/21869.

Okasaki, Chris. 1997. “Three Algorithms on Braun Trees.” *Journal of Functional Programming* 7 (6) (November): 661–666. doi:10.1017/S0956796897002876. https://www.eecs.northwestern.edu/~robby/courses/395-495-2013-fall/three-algorithms-on-braun-trees.pdf.

Sible, Joseph. 2019. “How can two continuations cancel each other out?” *Stack Overflow*. https://stackoverflow.com/questions/56122022/how-can-two-continuations-cancel-each-other-out.