Some More List Algorithms
It’s been a while since I last wrote a post (I’ve been busy with my Master’s thesis, which is nearly done), so I thought I would quickly throw out some fun snippets of Haskell I had reason to write over the past couple of weeks.
Zipping With Folds
For some reason, until recently I had been under the impression that it was impossible to fuse zips efficiently. In other words, I thought that zip
was like tail
, in that if it was implemented using only foldr
it would result in an asymptotic slowdown (tail
is normally , implemented as a fold it’s ).
Well, it seems like this is not the case. The old zip-folding code I had looks to me now to be the correct complexity: it’s related to How To Zip Folds, by Oleg Kiselyov (although I’m using a different version of the function which can be found on the mailing list). The relevant code is as follows:
newtype Zip a b =
Zip { runZip :: a -> (Zip a b -> b) -> b }
zip :: [a] -> [b] -> [(a,b)]
zip xs ys = foldr xf xb xs (Zip (foldr yf yb ys))
where
= runZip yk x xk
xf x xk yk = []
xb _
= (x,y) : xk (Zip yk)
yf y yk x xk = [] yb _ _
There are apparently reasons for why the Prelude’s zip
isn’t allowed to fuse both of its arguments: I don’t fully understand them, however. (in particular the linked page says that the fused zip would have different strictness behaviour, but the version I have above seems to function properly).
This version of zip leads to some more fun solutions to folding puzzles, like this one:
Write a function that is equivalent to:
= reverse (zip (reverse xs) (reverse ys)) zipFromEnd xs ys
Without creating any intermediate lists.
The desired function is interesting in that, instead of lining up lists according to their first elements, it aligns them according to the ends.
>>> zipFromEnd [1,2,3] "abc"
1,'a'),(2,'b'),(3,'c')]
[(
>>> zipFromEnd [1,2,3] "abcd"
1,'b'),(2,'c'),(3,'d')]
[(
>>> zipFromEnd [1,2,3,4] "abc"
2,'a'),(3,'b'),(4,'c')] [(
The solution here is just to use foldl
, and we get the following:
zipFromEnd :: [a] -> [b] -> [(a,b)]
= foldl xf xb xs (Zip (foldl yf yb ys)) []
zipFromEnd xs ys where
= runZip yk x xk
xf xk x yk = zs
xb _ zs
= xk (Zip yk) ((x,y) : zs)
yf yk y x xk zs = zs yb _ _ zs
Another function which is a little interesting is the “zip longest” function:
zipLongest :: (a -> a -> a) -> [a] -> [a] -> [a]
= foldr xf xb xs (Zip (foldr yf yb ys))
zipLongest c xs ys where
= runZip yk (Just x) xk
xf x xk yk = runZip zs Nothing xb
xb zs
Nothing xk = y : xk (Zip yk)
yf y yk Just x) xk = c x y : xk (Zip yk)
yf y yk (
Nothing _ = []
yb Just x) zs = x : zs (Zip yb) yb (
Finally, all of these functions rely on the Zip
type, which is not strictly positive. This means that we can’t use it in Agda, and it’s tricky to reason about: I wonder what it is about functions for deforestation that tends to lead to non-strictly-positive datatypes.
Lexicographic Permutations
The next puzzle I was interested in was finding the next lexicographic permutation of some string. In other words, given some string , you need to find another string that is a permutation of such that , and that there is no string that is a permutation of and . The Wikipedia article on the topic is excellent (and clear), but again the algorithm is described in extremely imperative terms:
- Find the largest index k such that a[k] < a[k + 1]. If no such index exists, the permutation is the last permutation.
- Find the largest index l greater than k such that a[k] < a[l].
- Swap the value of a[k] with that of a[l].
- Reverse the sequence from a[k + 1] up to and including the final element a[n].
The challenge here is to write this algorithm without doing any indexing: indexing is expensive on Haskell lists, and regardless it is cleaner to express it without.
I managed to work out the following:
nextLexPerm :: Ord a => [a] -> Maybe [a]
= Nothing
nextLexPerm [] :xs) = go1 x xs
nextLexPerm (xwhere
= Nothing
go1 _ [] :xs) = maybe (go2 i j [] xs) (Just . (i:)) (go1 j xs)
go1 i (j
go2 i j xs ys| j <= i = Nothing
| otherwise = Just (fromMaybe (j : foldl (flip (:)) (i:xs) ys) (go3 i (j:xs) ys))
= Nothing
go3 _ _ [] :ys) = go2 i j xs ys go3 i xs (j
Circular Sorting
This comes from the Rosetta Code problem Circle Sort. This is a strange little sorting algorithm, where basically you compare elements on opposite sides of an array, swapping them as needed. The example given is the following:
6 7 8 9 2 5 3 4 1
First we compare (and swap) 6
and 1
, and then 7
and 4
, and so on, until we reach the middle. At this point we split the array in two and perform the procedure on each half. After doing this once it is not the case that the array is definitely sorted: you may have to repeat the procedure several (but finitely many) times, until no swaps are performed.
I have absolutely no idea what the practical application for such an odd algorithm would be, but it seemed like an interesting challenge to try implement it in a functional style (i.e. without indices or mutation).
The first thing we have to do is fold the list in half, so we pair up the right items. We’ve actually seen an algorithm to do this before: it’s often called the “tortoise and the hare”, and our previous use was to check if a list was a palindrome. Here’s how we implement it:
halve :: [a] -> [(a,a)]
= snd (go xs xs)
halve xs where
:ys) (_:_:zs) = f y (go ys zs)
go (y:ys) [_] = (ys,[])
go (_= (ys,[])
go ys []
:ys,zs) = (ys, (x,y) : zs)
f x (y
>>> halve [6,7,8,9,2,5,3,4,1]
6,1),(7,4),(8,3),(9,5)] [(
Notice that the 2
in the very middle of the list is missing from the output: I’ll describe how to handle that element later on. In the above piece of code, that 2
actually gets bound to the underscore (in (_:ys)
) in the second clause of go
.
Next we need to do the actual swapping: this is actually pretty straightforward, if we think of the algorithm functionally, rather than imperatively. Instead of swapping things in place, we are building up both halves of the new list, so the “swap” operation should simply decide which list each item goes into.
halve :: Ord a => [a] -> ([a],[a])
= tl (go xs xs)
halve xs where
= (lte,gt)
tl (_,lte,gt)
:ys) (_:_:zs) = swap y (go ys zs)
go (y:ys) [_] = (ys,[],[])
go (_= (ys,[],[])
go ys []
:ys,lte,gt)
swap x (y| x <= y = (ys, x : lte, y : gt)
| otherwise = (ys, y : lte, x : gt)
At this point we can also see what to do with the middle item: we’ll put it in the higher or lower list, depending on a comparison with the element it’s next to.
halve :: Ord a => [a] -> ([a],[a])
= tl (go xs xs)
halve xs where
= (lte,gt)
tl (_,lte,gt)
:ys) (_:_:zs) = swap y (go ys zs)
go (y= (ys,[],[])
go ys [] :ys) [_] = (ys,[y | e],[y | not e])
go (ywhere e = y <= head ys
:ys,lte,gt)
swap x (y| x <= y = (ys, x : lte, y : gt)
| otherwise = (ys, y : lte, x : gt)
Next, we can use this as a helper function in the overall recursive function.
circleSort :: Ord a => [a] -> [a]
= []
circleSort [] = [x]
circleSort [x] =
circleSort xs let (lte,gt) = halve xs
in circleSort lte ++ circleSort (reverse gt)
This function isn’t correct (yet). As we mentioned already, we need to run the circle sort procedure multiple times until no swaps occur. We can add in the tracking of swaps like so:
circleSort :: Ord a => [a] -> [a]
= if swapped then circleSort ks else ks
circleSort xs where
= go xs
(swapped,ks)
= (False, [])
go [] = (False, [x])
go [x] =
go xs let (s,_,lte,gt) = halve xs xs
= go lte
(sl,lte') = go (reverse gt)
(sg,gt' ) in (s || sl || sg, lte' ++ gt')
:ys) (_:_:zs) = swap y (halve ys zs)
halve (y= (False,ys,[],[])
halve ys [] :ys) [_] = (False,ys,[y | e],[y | not e])
halve (ywhere e = y <= head ys
:ys,lte,gt)
swap x (s,y| x <= y = (s ,ys, x : lte, y : gt)
| otherwise = (True,ys, y : lte, x : gt)
So at this point we actually have a working implementation of the function, which avoids indices as intended. It has some problems still, though. First, we call ++
, when we could be using difference lists. Here’s the solution to that:
circleSort :: Ord a => [a] -> [a]
= if swapped then circleSort ks else ks
circleSort xs where
= go xs []
(swapped,ks)
= (False, zs)
go [] zs = (False, x:zs)
go [x] zs =
go xs zs let (s,_,lte,gt) = halve xs xs
= go lte gt'
(sl,lte') = go (reverse gt) zs
(sg,gt' ) in (s || sl || sg, lte')
:ys) (_:_:zs) = swap y (halve ys zs)
halve (y= (False,ys,[],[])
halve ys [] :ys) [_] = (False,ys,[y | e],[y | not e])
halve (ywhere e = y <= head ys
:ys,lte,gt)
swap x (s,y| x <= y = (s ,ys, x : lte, y : gt)
| otherwise = (True,ys, y : lte, x : gt)
Next we can actually rewrite the go
function to allow for a certain amount of tail recursion (kind of):
circleSort :: Ord a => [a] -> [a]
= if swapped then circleSort ks else ks
circleSort xs where
= go xs (False,[])
(swapped,ks)
= (s,ks)
go [] (s,ks) = (s,x:ks)
go [x] (s,ks) =
go xs (s,ks) let (s',_,ls,rs) = halve s xs xs
in go ls (go (reverse rs) (s',ks))
:ys) (_:_:zs) = swap y (halve s ys zs)
halve s (y= (s,ys,[],[])
halve s ys [] :ys) [_] = (s,ys,[y | e],[y | not e])
halve s (ywhere e = y <= head ys
:ys,ls,rs)
swap x (s,y| x <= y = ( s,ys,x:ls,y:rs)
| otherwise = (True,ys,y:ls,x:rs)
Next, we call reverse
: but we can avoid the reverse by passing a parameter which tells us which direction we’re walking down the list. Since the swapping logic is symmetric, we’re able to just invert some of the functions. It is a little tricky, though:
circleSort :: Ord a => [a] -> [a]
= if swapped then circleSort ks else ks
circleSort xs where
= go False xs (False,[])
(swapped,ks)
= (s,ks)
go d [] (s,ks) = (s,x:ks)
go d [x] (s,ks) =
go d xs (s,ks) let (s',_,ls,rs) = halve d s xs xs
in go False ls (go True rs (s',ks))
:ys) (_:_:zs) = swap d y (halve d s ys zs)
halve d s (y= (s,ys,[],[])
halve d s ys [] :ys) [_] = (s,ys,[y | e],[y | not e])
halve d s (ywhere e = y <= head ys
:ys,ls,rs)
swap d x (s,y| bool (<=) (<) d x y = ( d || s,ys,x:ls,y:rs)
| otherwise = (not d || s,ys,y:ls,x:rs)
So there it is! The one-pass, purely function implementation of circle sort. Very possibly the most useless piece of code I’ve ever written.