Terminating Tricky Traversals
Imports
{-# OPTIONS --cubical --sized-types #-} module Post where open import ../code/terminating-tricky-traversals/Post.Prelude
Just a short one today. I’m going to look at a couple of algorithms for breadth-first traversals with complex termination proofs.
Breadth-First Graph Traversal
In a previous post I talked about breadth-first traversals over graphs, and the difficulties that cycles cause. Graphs are especially tricky to work with in a purely functional language, because so many of the basic algorithms are described in explicitly mututing terms (i.e. “mark off a node as you see it”), with no obvious immutable translation. The following is the last algoirthm I came up with:
bfs :: Ord a => (a -> [a]) -> a -> [[a]]
= takeWhile (not.null) (map fst (fix (f r . push)))
bfs g r where
= ([],Set.empty) : [ ([],seen) | (_,seen) <- xs ]
push xs @((l,s):qs)
f x q| Set.member x s = q
| otherwise = (x:l, Set.insert x s) : foldr f qs (g x)
As difficult as it is to work with graphs in a pure functional
language, it’s even more difficult to work in a total
language, like Agda. Looking at the above function, there are several
bits that we can see right off the bat won’t translate over easily.
Let’s start with fix
.
We shouldn’t expect to be able to write fix
in Agda
as-is. Just look at its Haskell implementation:
fix :: (a -> a) -> a
= f (fix f) fix f
It’s obviously non total!
(this is actually a non-memoizing version of fix
, which
is different from the usual
one)
We can write a function like fix
, though, using
coinduction and sized types.
record Thunk (A : Size → Type a) (i : Size) : Type a where coinductive field force : ∀ {j : Size< i} → A j open Thunk public fix : (A : Size → Type a) → (∀ {i} → Thunk A i → A i) → ∀ {j} → A j fix A f = f λ where .force → fix A f
Coinductive types are the dual to inductive types. Totality-wise, a coinductive type must be “productive”; i.e. a coinductive list can be infinitely long, but it must be provably able to evaluate to a constructor (cons or nil) in finite time.
Sized types also help us out here: they’re quite subtle, and a little finicky to use occasionally, but they are invaluable when it comes to proving termination or productivity of complex (especially higher-order) functions. The canonical example is mapping over the following tree type:
module NonTerminating where data Tree (A : Type a) : Type a where _&_ : A → List (Tree A) → Tree A {-# TERMINATING #-} mapTree : (A → B) → Tree A → Tree B mapTree f (x & xs) = f x & map (mapTree f) xs
The compiler can’t tell that the recursive call in the
mapTree
function will only be called on subnodes of the
argument: it can’t tell that it’s structurally recursive, in other
words. Annoyingly, we can fix the problem by inlining
map
.
mutual mapTree′ : (A → B) → Tree A → Tree B mapTree′ f (x & xs) = f x & mapForest f xs mapForest : (A → B) → List (Tree A) → List (Tree B) mapForest f [] = [] mapForest f (x ∷ xs) = mapTree′ f x ∷ mapForest f xs
The other solution is to give the tree a size parameter. This way, all submodes of a given tree will have smaller sizes, which will give the compiler a finite descending chain condition it can use to prove termination.
data Tree (A : Type a) (i : Size) : Type a where _&_ : A → ∀ {j : Size< i} → List (Tree A j) → Tree A i mapTree : (A → B) → Tree A i → Tree B i mapTree f (x & xs) = f x & map (mapTree f) xs
So how do we use this stuff in our graph traversal? Well first we’ll need a coinductive Stream type:
record Stream (A : Type a) (i : Size) : Type a where coinductive field head : A tail : ∀ {j : Size< i} → Stream A j open Stream public smap : (A → B) → Stream A i → Stream B i smap f xs .head = f (xs .head) smap f xs .tail = smap f (xs .tail)
And then we can use it to write our breadth-first traversal.
bfs : ⦃ _ : IsDiscrete A ⦄ → (A → List A) → A → Stream (List A) i bfs g r = smap fst (fix (Stream _) (f r ∘ push)) where push : Thunk (Stream _) i → Stream _ i push xs .head = ([] , []) push xs .tail = smap (_,_ [] ∘ snd) (xs .force) f : _ → Stream _ i → Stream _ i f x qs with (x ∈? qs .head .snd) .does ... | true = qs ... | false = λ where .head → (x ∷ qs .head .fst , x ∷ qs .head .snd) .tail → foldr f (qs .tail) (g x)
How do we convert this to a list of lists? Well, for this condition we would actually need to prove that there are only finitely many elements in the graph. We could actually use Noetherian finiteness for this: though I have a working implementation, I’m still figuring out how to clean this up, so I will leave it for another post.
Traversing a Braun Tree
A recent paper (Nipkow and Sewell 2020) provided Coq
proofs for some algorithms on Braun trees (Okasaki 1997),
which prompted me to take a look at them again. This time, I came up
with an interesting linear-time toList
function, which
relies on the following peculiar type:
newtype Q2 a
= Q2
unQ2 :: (Q2 a -> Q2 a) -> (Q2 a -> Q2 a) -> a
{ }
Even after coming up with the type myself, I still can’t really make heads nor tails of it. If I squint, it starts to look like some bizarre church-encoded binary number (but I have to really squint). It certainly seems related to corecursive queues (Smith 2009).
Anyway, we can use the type to write the following lovely
toList
function on a Braun tree.
toList :: Tree a -> [a]
= unQ2 (f t b) id id
toList t where
Node x l r) xs = Q2 (\ls rs -> x : unQ2 xs (ls . f l) (rs . f r))
f (Leaf xs = Q2 (\_ _ -> [])
f
= Q2 (\ls rs -> unQ2 (ls (rs b)) id id) b
So can we convert it to Agda?
Not really! As it turns out, this function is even more difficult to
implement than one might expect. We can’t even write the
Q2
type in Agda without getting in trouble.
{-# NO_POSITIVITY_CHECK #-} record Q2 (A : Type a) : Type a where inductive field q2 : (Q2 A → Q2 A) → (Q2 A → Q2 A) → A open Q2
Q2
isn’t strictly positive, unfortunately.
{-# TERMINATING #-} toList : Braun A → List A toList t = f t n .q2 id id where n : Q2 A n .q2 ls rs = ls (rs n) .q2 id id f : Braun A → Q2 (List A) → Q2 (List A) f leaf xs .q2 ls rs = [] f (node x l r) xs .q2 ls rs = x ∷ xs .q2 (ls ∘ f l) (rs ∘ f r)
Apparently this problem of strict positivity for breadth-first traversals has come up before: Berger, Matthes, and Setzer (2019); Hofmann (1993).
Wait—Where did Q2 Come From?
Update 31/01/2020
Daniel Peebles (@copumpkin on twitter) replied to my tweet about this post with the following:
Interesting! Curious how you came up with that weird type at the end. It doesn’t exactly feel like the first thing one might reach for and it would be interesting to see some writing on the thought process that led to it
So that’s what I’m going to add here!
Let’s take the Braun tree of the numbers 1 to 15:
┌8
┌4┤
│ └12
┌2┤
│ │ ┌10
│ └6┤
│ └14
1┤
│ ┌9
│ ┌5┤
│ │ └13
└3┤
│ ┌11
└7┤
└15
Doing a normal breadth-first traversal for the first two levels is fine (1, 2, 3): it starts to fall apart at the third level (4, 6, 5, 7). Here’s the way we should traverse it: “all of the left branches, and then all of the right branches”. So, we will have a queue of trees. We take the root element of each tree in the queue, and emit it, and then we add all of the left children of the trees in the queue to one queue, and then all the right children to another, and then concatenate them into a new queue and we start again. We can stop whenever we hit an empty tree because of the structure of the Braun tree. Here’s an ascii diagram to show what’s going on:
┌8 | ┌8 | ┌8 | 8
┌4┤ | ┌4┤ | 4┤ |
│ └12 | │ └12 | └12 | 9
┌2┤ | 2┤ | |
│ │ ┌10 | │ ┌10 | ┌9 | 10
│ └6┤ | └6┤ | 5┤ |
│ └14 | └14 | └13 | 11
1┤ --> -----> -------->
│ ┌9 | ┌9 | ┌10 | 12
│ ┌5┤ | ┌5┤ | 6┤ |
│ │ └13 | │ └13 | └14 | 13
└3┤ | 3┤ | |
│ ┌11 | │ ┌11 | ┌11 | 14
└7┤ | └7┤ | 7┤ |
└15 | └15 | └15 | 15
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
If we want to do this in Haskell, we have a number of options for how we would represent queues: as ever, though, I much prefer to use vanilla lists and time the reversals so that they stay linear. Here’s what that looks like:
toList :: Tree a -> [a]
= f t b [] []
toList t where
Node x l r) xs ls rs = x : xs (l : ls) (r : rs)
f (Leaf _ _ _ = []
f
= foldr f b (reverse ls ++ reverse rs) [] [] b ls rs
Any place we see a foldr
being run after a reverse or a
concatenation, we know that we can remove a pass (in actual fact rewrite
rules will likely do this automatically for us).
toList :: Tree a -> [a]
= f b t [] []
toList t where
Node x l r) xs ls rs = x : xs (l : ls) (r : rs)
f (Leaf _ _ _ = []
f
= foldl (flip f) (foldl (flip f) b rs) ls [] [] b ls rs
Finally, since we’re building up the lists with :
(in a
linear way, i.e. we will not use the intermediate queues more than
once), and we’re immediately consuming them with a fold, we can deforest
the intermediate list, replacing every :
with
f
(actually, it’s a little more tricky than that, since we
replace the :
with the reversed version of
f
, i.e. the one you would pass to foldr
if you
wanted it to act like foldl
. This trick is explained in
more detail in this
post).
toList :: Tree a -> [a]
= f t b id id
toList t where
Node x l r) xs ls rs = x : xs (ls . f l) (rs . f r)
f (Leaf _ _ _ = []
f
= ls (rs b) id id b ls rs
Once you do that, however, you run into the “cannot construct the infinite type” error. To be precise:
• Occurs check: cannot construct the infinite type: a3 ~ (a3 -> c0) -> (a3 -> c1) -> [a2]
And this gives us the template for our newtype! It requires some trial and error, but you can see where some of the recursive calls are, and what you eventually get is the following:
newtype Q2 a
= Q2
unQ2 :: (Q2 a -> Q2 a) -> (Q2 a -> Q2 a) -> [a]
{ }
(You can remove the list type constructor at the end, I did as I
thought it made it slightly more general). And from there we get back to
the toList
function.