Pure and Lazy Breadth-First Traversals of Graphs in Haskell
Today, I’m going to look at extending the previous breadth-first traversal algorithms to arbitrary graphs (rather than just trees). Graphs with cycles are notoriously cumbersome in functional languages, so this actually proves to be a little trickier than I thought it would be. First, a quick recap.
3 Ways to Breadth-First Search
So far, we have three major ways to traverse a tree in breadth-first order. The first is the simplest, and the fastest:
bfe :: Tree a -> [a]
= f r b []
bfe r where
Node x xs) fw bw = x : fw (xs : bw)
f (
= []
b [] = foldl (foldr f) b qs [] b qs
Given a tree like the following:
┌4
┌2┤
│ │ ┌8
│ └5┤
│ └9
1┤
│ ┌10
│ ┌6┘
└3┤
└7
We get:
>>> bfe tree
1,2,3,4,5,6,7,8,9,10] [
It also demonstrates a theme that will run through this post: lists
are the only visible data structure (other than the tree, of
course). However, we are carefully batching the operations on those
lists (the foldl
is effectively a reverse) so that they
have the same complexity as if we had used a queue. In actual fact, when
lists are used this way, they are queues: “corecursive” ones
(Allison 2006; Smith
2009).
The next two functions perform a breadth-first traversal “level-wise”: instead of just returning all the nodes of the tree, we get them delimited by how far they are from the root.
lwe :: Tree a -> [[a]]
= f b r [] []
lwe r where
Node x xs) ls qs = k (x : ls) (xs : qs)
f k (
= []
b _ [] = k : foldl (foldl f) b qs [] []
b k qs
>>> lwe tree
1],[2,3],[4,5,6,7],[8,9,10]] [[
The above function is very clearly related to the bfe
function: we just add another queue (representing the current level),
and work from there.
The third of these functions also does level-wise enumeration, but in a direct style (without continuations).
lwe :: Tree a -> [[a]]
= f r []
lwe r where
Node x xs) (q:qs) = (x:q) : foldr f qs xs
f (Node x xs) [] = [x] : foldr f [] xs f (
There are more techniques out there than just these three (including the one in Data.Tree), but these are my favorite, and they’re what I’ll be looking at today.
Graphs and Purity
Functional programming in general excels at working with trees and similar data structures. Graphs, though, are trickier. There’s been a lot of recent work in improving the situation (Mokhov 2017), but I’m going to keep it simple today: a graph is just a function.
type Graph a = a -> [a]
So the tree from above could be represented as:
1 = [2,3]
graph 2 = [4,5]
graph 3 = [6,7]
graph 5 = [8,9]
graph 6 = [10]
graph = [] graph _
As it happens, all of the algorithms that follow will work on graphs represented as rose trees (or represented any way, really).
So let’s fire up our first traversal!
bfs :: Graph a -> Graph a
= f r b []
bfs g r where
= x : fw (g x : bw)
f x fw bw
= []
b [] = foldl (foldr f) b qs []
b qs
>>> bfs graph 1
1,2,3,4,5,6,7,8,9,10] [
Unfortunately, this won’t handle cycles properly:
1 = [2,3]
graph 2 = [4,5,1]
graph 3 = [6,7]
graph 5 = [8,9]
graph 6 = [10]
graph = []
graph _
>>> bfs graph 1
1,2,3,4,5,1,6,7,8,9,2,3,10,4,5,1,6,7,8,9,2,3,10,4,5,1,6,7,8,9,2,3,10,4,5... [
We need a way to mark off what we’ve already seen. The following isn’t good enough, also:
>>> nub (bfs graph 1)
1,2,3,4,5,6,7,8,9,10... [
It will hang without finishing the list. The solution is to mark off nodes as we find them, with some set structure:
bfs :: Ord a => Graph a -> Graph a
= f ts b [] Set.empty
bfs g ts where
f x fw bw s| Set.member x s = fw bw s
| otherwise = x : fw (g x : bw) (Set.insert x s)
= []
b [] _ = foldl (foldr f) b qs [] s
b qs s
>>> bfs graph 1
1,2,3,4,5,6,7,8,9,10] [
The levelwise algorithm is similar:
lws :: Ord a => Graph a -> a -> [[a]]
= f b r [] [] Set.empty
lws g r where
f k x ls qs s| Set.member x s = k ls qs s
| otherwise = k (x : ls) (g x : qs) (Set.insert x s)
= []
b _ [] _ = k : foldl (foldl f) b qs [] [] s b k qs s
Tying the Knot
The other levelwise algorithm doesn’t translate across so easily. To see why, let’s look at the version without cycle detection:
lws :: Graph a -> a -> [[a]]
= f r []
lws g r where
:qs) = (x:q) : foldr f qs (g x)
f x (q= [x] : foldr f [] (g x) f x []
The recursive call is being made depth-first, not breadth-first. The result, of course, is breadth-first, but that’s only because the recursive call zips as it goes.
Just looking at the fourth line for now:
:qs) = (x:q) : foldr f qs (g x) f x (q
We want whatever process built up that q
to be denied
access to x
. The following doesn’t work:
:qs) = (x:filter (x/=) q) : foldr f qs (g x) f x (q
As well as being terribly slow, the later computation can diverge when it finds a cycle, and filtering won’t do anything to help that.
The solution is to “tie the knot”. We basically do two passes over the data: one to build up the “seen so far” list, and then another to do the actual search. The trick is to do both of these passes at once, and feed the result back into the demanding computation.
= takeWhile (not.null) (map fst (fix (f r . push)))
lws g r where
= ([],Set.empty) : [ ([],seen) | (_,seen) <- xs ]
push xs @((l,s):qs)
f x q| Set.member x s = q
| otherwise = (x:l, Set.insert x s) : foldr f qs (g x)
And it works!
I got the idea for this trick from the appendix of Okasaki (2000). There’s something similar in Kiselyov (2002).