Some More List Algorithms
It’s been a while since I last wrote a post (I’ve been busy with my Master’s thesis, which is nearly done), so I thought I would quickly throw out some fun snippets of Haskell I had reason to write over the past couple of weeks.
Zipping With Folds
For some reason, until recently I had been under the impression that
it was impossible to fuse zips efficiently. In other words, I thought
that zip
was like tail
, in that if it was
implemented using only foldr
it would result in an
asymptotic slowdown (tail
is normally
,
implemented as a fold it’s
).
Well, it seems like this is not the case. The old zip-folding code I had looks to me now to be the correct complexity: it’s related to How To Zip Folds, by Oleg Kiselyov (although I’m using a different version of the function which can be found on the mailing list). The relevant code is as follows:
newtype Zip a b =
Zip { runZip :: a -> (Zip a b -> b) -> b }
zip :: [a] -> [b] -> [(a,b)]
zip xs ys = foldr xf xb xs (Zip (foldr yf yb ys))
where
= runZip yk x xk
xf x xk yk = []
xb _
= (x,y) : xk (Zip yk)
yf y yk x xk = [] yb _ _
There are apparently reasons
for why the Prelude’s zip
isn’t allowed to fuse both of its
arguments: I don’t fully understand them, however. (in particular the
linked page says that the fused zip would have different strictness
behaviour, but the version I have above seems to function properly).
This version of zip leads to some more fun solutions to folding puzzles, like this one:
Write a function that is equivalent to:
= reverse (zip (reverse xs) (reverse ys)) zipFromEnd xs ys
Without creating any intermediate lists.
The desired function is interesting in that, instead of lining up lists according to their first elements, it aligns them according to the ends.
>>> zipFromEnd [1,2,3] "abc"
1,'a'),(2,'b'),(3,'c')]
[(
>>> zipFromEnd [1,2,3] "abcd"
1,'b'),(2,'c'),(3,'d')]
[(
>>> zipFromEnd [1,2,3,4] "abc"
2,'a'),(3,'b'),(4,'c')] [(
The solution here is just to use foldl
, and we get the
following:
zipFromEnd :: [a] -> [b] -> [(a,b)]
= foldl xf xb xs (Zip (foldl yf yb ys)) []
zipFromEnd xs ys where
= runZip yk x xk
xf xk x yk = zs
xb _ zs
= xk (Zip yk) ((x,y) : zs)
yf yk y x xk zs = zs yb _ _ zs
Another function which is a little interesting is the “zip longest” function:
zipLongest :: (a -> a -> a) -> [a] -> [a] -> [a]
= foldr xf xb xs (Zip (foldr yf yb ys))
zipLongest c xs ys where
= runZip yk (Just x) xk
xf x xk yk = runZip zs Nothing xb
xb zs
Nothing xk = y : xk (Zip yk)
yf y yk Just x) xk = c x y : xk (Zip yk)
yf y yk (
Nothing _ = []
yb Just x) zs = x : zs (Zip yb) yb (
Finally, all of these functions rely on the Zip
type,
which is not strictly positive. This means that we can’t use it
in Agda, and it’s tricky to reason about: I wonder what it is about
functions for deforestation that tends to lead to non-strictly-positive
datatypes.
Lexicographic Permutations
The next puzzle I was interested in was finding the next lexicographic permutation of some string. In other words, given some string , you need to find another string that is a permutation of such that , and that there is no string that is a permutation of and . The Wikipedia article on the topic is excellent (and clear), but again the algorithm is described in extremely imperative terms:
- Find the largest index k such that a[k] < a[k + 1]. If no such index exists, the permutation is the last permutation.
- Find the largest index l greater than k such that a[k] < a[l].
- Swap the value of a[k] with that of a[l].
- Reverse the sequence from a[k + 1] up to and including the final element a[n].
The challenge here is to write this algorithm without doing any indexing: indexing is expensive on Haskell lists, and regardless it is cleaner to express it without.
I managed to work out the following:
nextLexPerm :: Ord a => [a] -> Maybe [a]
= Nothing
nextLexPerm [] :xs) = go1 x xs
nextLexPerm (xwhere
= Nothing
go1 _ [] :xs) = maybe (go2 i j [] xs) (Just . (i:)) (go1 j xs)
go1 i (j
go2 i j xs ys| j <= i = Nothing
| otherwise = Just (fromMaybe (j : foldl (flip (:)) (i:xs) ys) (go3 i (j:xs) ys))
= Nothing
go3 _ _ [] :ys) = go2 i j xs ys go3 i xs (j
Circular Sorting
This comes from the Rosetta Code problem Circle Sort. This is a strange little sorting algorithm, where basically you compare elements on opposite sides of an array, swapping them as needed. The example given is the following:
6 7 8 9 2 5 3 4 1
First we compare (and swap) 6
and 1
, and
then 7
and 4
, and so on, until we reach the
middle. At this point we split the array in two and perform the
procedure on each half. After doing this once it is not the case that
the array is definitely sorted: you may have to repeat the procedure
several (but finitely many) times, until no swaps are performed.
I have absolutely no idea what the practical application for such an odd algorithm would be, but it seemed like an interesting challenge to try implement it in a functional style (i.e. without indices or mutation).
The first thing we have to do is fold the list in half, so we pair up the right items. We’ve actually seen an algorithm to do this before: it’s often called the “tortoise and the hare”, and our previous use was to check if a list was a palindrome. Here’s how we implement it:
halve :: [a] -> [(a,a)]
= snd (go xs xs)
halve xs where
:ys) (_:_:zs) = f y (go ys zs)
go (y:ys) [_] = (ys,[])
go (_= (ys,[])
go ys []
:ys,zs) = (ys, (x,y) : zs)
f x (y
>>> halve [6,7,8,9,2,5,3,4,1]
6,1),(7,4),(8,3),(9,5)] [(
Notice that the 2
in the very middle of the list is
missing from the output: I’ll describe how to handle that element later
on. In the above piece of code, that 2
actually gets bound
to the underscore (in (_:ys)
) in the second clause of
go
.
Next we need to do the actual swapping: this is actually pretty straightforward, if we think of the algorithm functionally, rather than imperatively. Instead of swapping things in place, we are building up both halves of the new list, so the “swap” operation should simply decide which list each item goes into.
halve :: Ord a => [a] -> ([a],[a])
= tl (go xs xs)
halve xs where
= (lte,gt)
tl (_,lte,gt)
:ys) (_:_:zs) = swap y (go ys zs)
go (y:ys) [_] = (ys,[],[])
go (_= (ys,[],[])
go ys []
:ys,lte,gt)
swap x (y| x <= y = (ys, x : lte, y : gt)
| otherwise = (ys, y : lte, x : gt)
At this point we can also see what to do with the middle item: we’ll put it in the higher or lower list, depending on a comparison with the element it’s next to.
halve :: Ord a => [a] -> ([a],[a])
= tl (go xs xs)
halve xs where
= (lte,gt)
tl (_,lte,gt)
:ys) (_:_:zs) = swap y (go ys zs)
go (y= (ys,[],[])
go ys [] :ys) [_] = (ys,[y | e],[y | not e])
go (ywhere e = y <= head ys
:ys,lte,gt)
swap x (y| x <= y = (ys, x : lte, y : gt)
| otherwise = (ys, y : lte, x : gt)
Next, we can use this as a helper function in the overall recursive function.
circleSort :: Ord a => [a] -> [a]
= []
circleSort [] = [x]
circleSort [x] =
circleSort xs let (lte,gt) = halve xs
in circleSort lte ++ circleSort (reverse gt)
This function isn’t correct (yet). As we mentioned already, we need to run the circle sort procedure multiple times until no swaps occur. We can add in the tracking of swaps like so:
circleSort :: Ord a => [a] -> [a]
= if swapped then circleSort ks else ks
circleSort xs where
= go xs
(swapped,ks)
= (False, [])
go [] = (False, [x])
go [x] =
go xs let (s,_,lte,gt) = halve xs xs
= go lte
(sl,lte') = go (reverse gt)
(sg,gt' ) in (s || sl || sg, lte' ++ gt')
:ys) (_:_:zs) = swap y (halve ys zs)
halve (y= (False,ys,[],[])
halve ys [] :ys) [_] = (False,ys,[y | e],[y | not e])
halve (ywhere e = y <= head ys
:ys,lte,gt)
swap x (s,y| x <= y = (s ,ys, x : lte, y : gt)
| otherwise = (True,ys, y : lte, x : gt)
So at this point we actually have a working implementation of the
function, which avoids indices as intended. It has some problems still,
though. First, we call ++
, when we could be using
difference lists. Here’s the solution to that:
circleSort :: Ord a => [a] -> [a]
= if swapped then circleSort ks else ks
circleSort xs where
= go xs []
(swapped,ks)
= (False, zs)
go [] zs = (False, x:zs)
go [x] zs =
go xs zs let (s,_,lte,gt) = halve xs xs
= go lte gt'
(sl,lte') = go (reverse gt) zs
(sg,gt' ) in (s || sl || sg, lte')
:ys) (_:_:zs) = swap y (halve ys zs)
halve (y= (False,ys,[],[])
halve ys [] :ys) [_] = (False,ys,[y | e],[y | not e])
halve (ywhere e = y <= head ys
:ys,lte,gt)
swap x (s,y| x <= y = (s ,ys, x : lte, y : gt)
| otherwise = (True,ys, y : lte, x : gt)
Next we can actually rewrite the go
function to allow
for a certain amount of tail recursion (kind of):
circleSort :: Ord a => [a] -> [a]
= if swapped then circleSort ks else ks
circleSort xs where
= go xs (False,[])
(swapped,ks)
= (s,ks)
go [] (s,ks) = (s,x:ks)
go [x] (s,ks) =
go xs (s,ks) let (s',_,ls,rs) = halve s xs xs
in go ls (go (reverse rs) (s',ks))
:ys) (_:_:zs) = swap y (halve s ys zs)
halve s (y= (s,ys,[],[])
halve s ys [] :ys) [_] = (s,ys,[y | e],[y | not e])
halve s (ywhere e = y <= head ys
:ys,ls,rs)
swap x (s,y| x <= y = ( s,ys,x:ls,y:rs)
| otherwise = (True,ys,y:ls,x:rs)
Next, we call reverse
: but we can avoid the reverse by
passing a parameter which tells us which direction we’re walking down
the list. Since the swapping logic is symmetric, we’re able to just
invert some of the functions. It is a little tricky, though:
circleSort :: Ord a => [a] -> [a]
= if swapped then circleSort ks else ks
circleSort xs where
= go False xs (False,[])
(swapped,ks)
= (s,ks)
go d [] (s,ks) = (s,x:ks)
go d [x] (s,ks) =
go d xs (s,ks) let (s',_,ls,rs) = halve d s xs xs
in go False ls (go True rs (s',ks))
:ys) (_:_:zs) = swap d y (halve d s ys zs)
halve d s (y= (s,ys,[],[])
halve d s ys [] :ys) [_] = (s,ys,[y | e],[y | not e])
halve d s (ywhere e = y <= head ys
:ys,ls,rs)
swap d x (s,y| bool (<=) (<) d x y = ( d || s,ys,x:ls,y:rs)
| otherwise = (not d || s,ys,y:ls,x:rs)
So there it is! The one-pass, purely function implementation of circle sort. Very possibly the most useless piece of code I’ve ever written.