Terminating Tricky Traversals

Posted on January 29, 2020

Part 8 of a 10-part series on Breadth-First Traversals

Tags: Agda, Haskell

Imports

{-# OPTIONS --cubical --sized-types #-}

module Post where

open import ../code/terminating-tricky-traversals/Post.Prelude

Just a short one today. I’m going to look at a couple of algorithms for breadth-first traversals with complex termination proofs.

Breadth-First Graph Traversal

In a previous post I talked about breadth-first traversals over graphs, and the difficulties that cycles cause. Graphs are especially tricky to work with in a purely functional language, because so many of the basic algorithms are described in explicitly mututing terms (i.e. “mark off a node as you see it”), with no obvious immutable translation. The following is the last algoirthm I came up with:

bfs :: Ord a => (a -> [a]) -> a -> [[a]]
bfs g r = takeWhile (not.null) (map fst (fix (f r . push)))
  where
    push xs = ([],Set.empty) : [ ([],seen) | (_,seen) <- xs ]
    f x q@((l,s):qs)
      | Set.member x s = q
      | otherwise = (x:l, Set.insert x s) : foldr f qs (g x)

As difficult as it is to work with graphs in a pure functional language, it’s even more difficult to work in a total language, like Agda. Looking at the above function, there are several bits that we can see right off the bat won’t translate over easily. Let’s start with fix.

We shouldn’t expect to be able to write fix in Agda as-is. Just look at its Haskell implementation:

fix :: (a -> a) -> a
fix f = f (fix f)

It’s obviously non total!

(this is actually a non-memoizing version of fix, which is different from the usual one)

We can write a function like fix, though, using coinduction and sized types.

record Thunk (A : Size → Type a) (i : Size) : Type a where
  coinductive
  field force : ∀ {j : Size< i} → A j
open Thunk public

fix : (A : Size → Type a) → (∀ {i} → Thunk A i → A i) → ∀ {j} → A j
fix A f = f λ where .force → fix A f

Coinductive types are the dual to inductive types. Totality-wise, a coinductive type must be “productive”; i.e. a coinductive list can be infinitely long, but it must be provably able to evaluate to a constructor (cons or nil) in finite time.

Sized types also help us out here: they’re quite subtle, and a little finicky to use occasionally, but they are invaluable when it comes to proving termination or productivity of complex (especially higher-order) functions. The canonical example is mapping over the following tree type:

module NonTerminating where
  data Tree (A : Type a) : Type a where
    _&_ : A → List (Tree A) → Tree A

  {-# TERMINATING #-}
  mapTree : (A → B) → Tree A → Tree B
  mapTree f (x & xs) = f x & map (mapTree f) xs

The compiler can’t tell that the recursive call in the mapTree function will only be called on subnodes of the argument: it can’t tell that it’s structurally recursive, in other words. Annoyingly, we can fix the problem by inlining map.

  mutual
    mapTree′ : (A → B) → Tree A → Tree B
    mapTree′ f (x & xs) = f x & mapForest f xs

    mapForest : (A → B) → List (Tree A) → List (Tree B)
    mapForest f [] = []
    mapForest f (x ∷ xs) = mapTree′ f x ∷ mapForest f xs

The other solution is to give the tree a size parameter. This way, all submodes of a given tree will have smaller sizes, which will give the compiler a finite descending chain condition it can use to prove termination.

data Tree (A : Type a) (i : Size) : Type a where
  _&_ : A → ∀ {j : Size< i} → List (Tree A j) → Tree A i

mapTree : (A → B) → Tree A i → Tree B i
mapTree f (x & xs) = f x & map (mapTree f) xs

So how do we use this stuff in our graph traversal? Well first we’ll need a coinductive Stream type:

record Stream (A : Type a) (i : Size) : Type a where
  coinductive
  field
    head : A
    tail : ∀ {j : Size< i} → Stream A j
open Stream public

smap : (A → B) → Stream A i → Stream B i
smap f xs .head = f (xs .head)
smap f xs .tail = smap f (xs .tail)

And then we can use it to write our breadth-first traversal.

bfs : ⦃ _ : IsDiscrete A ⦄ → (A → List A) → A → Stream (List A) i
bfs g r = smap fst (fix (Stream _) (f r ∘ push))
  where
  push : Thunk (Stream _) i → Stream _ i
  push xs .head = ([] , [])
  push xs .tail = smap (_,_ [] ∘ snd) (xs .force)

  f : _ → Stream _ i → Stream _ i
  f x qs with (x ∈? qs .head .snd) .does
  ... | true = qs
  ... | false = λ where .head → (x ∷ qs .head .fst , x ∷ qs .head .snd)
                        .tail → foldr f (qs .tail) (g x)

How do we convert this to a list of lists? Well, for this condition we would actually need to prove that there are only finitely many elements in the graph. We could actually use Noetherian finiteness for this: though I have a working implementation, I’m still figuring out how to clean this up, so I will leave it for another post.

Traversing a Braun Tree

A recent paper (Nipkow and Sewell 2020) provided Coq proofs for some algorithms on Braun trees (Okasaki 1997), which prompted me to take a look at them again. This time, I came up with an interesting linear-time toList function, which relies on the following peculiar type:

newtype Q2 a
  = Q2
  { unQ2 :: (Q2 a -> Q2 a) -> (Q2 a -> Q2 a) -> a
  }

Even after coming up with the type myself, I still can’t really make heads nor tails of it. If I squint, it starts to look like some bizarre church-encoded binary number (but I have to really squint). It certainly seems related to corecursive queues (Smith 2009).

Anyway, we can use the type to write the following lovely toList function on a Braun tree.

toList :: Tree a -> [a]
toList t = unQ2 (f t b) id id
  where
    f (Node x l r) xs = Q2 (\ls rs -> x : unQ2 xs (ls . f l) (rs . f r))
    f Leaf         xs = Q2 (\_  _  -> [])

    b = Q2 (\ls rs -> unQ2 (ls (rs b)) id id)

So can we convert it to Agda?

Not really! As it turns out, this function is even more difficult to implement than one might expect. We can’t even write the Q2 type in Agda without getting in trouble.

{-# NO_POSITIVITY_CHECK #-}
record Q2 (A : Type a) : Type a where
  inductive
  field
    q2 : (Q2 A → Q2 A) →
         (Q2 A → Q2 A) →
         A
open Q2

Q2 isn’t strictly positive, unfortunately.

{-# TERMINATING #-}
toList : Braun A → List A
toList t = f t n .q2 id id
  where
  n : Q2 A
  n .q2 ls rs = ls (rs n) .q2 id id

  f : Braun A → Q2 (List A) → Q2 (List A)
  f leaf         xs .q2 ls rs = []
  f (node x l r) xs .q2 ls rs = x ∷ xs .q2 (ls ∘ f l) (rs ∘ f r)

Apparently this problem of strict positivity for breadth-first traversals has come up before: Berger, Matthes, and Setzer (2019); Hofmann (1993).

Wait—Where did Q2 Come From?

Update 31/01/2020

Daniel Peebles (@copumpkin on twitter) replied to my tweet about this post with the following:

Interesting! Curious how you came up with that weird type at the end. It doesn’t exactly feel like the first thing one might reach for and it would be interesting to see some writing on the thought process that led to it

Dan P (@copumpkin), Jan 30, 2020.

So that’s what I’m going to add here!

Let’s take the Braun tree of the numbers 1 to 15:

     ┌8
   ┌4┤
   │ └12
 ┌2┤
 │ │ ┌10
 │ └6┤
 │   └14
1┤
 │   ┌9
 │ ┌5┤
 │ │ └13
 └3┤
   │ ┌11
   └7┤
     └15

Doing a normal breadth-first traversal for the first two levels is fine (1, 2, 3): it starts to fall apart at the third level (4, 6, 5, 7). Here’s the way we should traverse it: “all of the left branches, and then all of the right branches”. So, we will have a queue of trees. We take the root element of each tree in the queue, and emit it, and then we add all of the left children of the trees in the queue to one queue, and then all the right children to another, and then concatenate them into a new queue and we start again. We can stop whenever we hit an empty tree because of the structure of the Braun tree. Here’s an ascii diagram to show what’s going on:

     ┌8   |     ┌8    |     ┌8     |       8
   ┌4┤    |   ┌4┤     |    4┤      |
   │ └12  |   │ └12   |     └12    |       9
 ┌2┤      |  2┤       |            |
 │ │ ┌10  |   │ ┌10   |     ┌9     |       10
 │ └6┤    |   └6┤     |    5┤      |
 │   └14  |     └14   |     └13    |       11
1┤       -->        ----->      -------->
 │   ┌9   |     ┌9    |     ┌10    |       12
 │ ┌5┤    |   ┌5┤     |    6┤      |
 │ │ └13  |   │ └13   |     └14    |       13
 └3┤      |  3┤       |            |
   │ ┌11  |   │ ┌11   |     ┌11    |       14
   └7┤    |   └7┤     |    7┤      |
     └15  |     └15   |     └15    |       15

         1,         2, 3,       4, 5, 6, 7,   8, 9, 10, 11, 12, 13, 14, 15

If we want to do this in Haskell, we have a number of options for how we would represent queues: as ever, though, I much prefer to use vanilla lists and time the reversals so that they stay linear. Here’s what that looks like:

toList :: Tree a -> [a]
toList t = f t b [] []
  where
    f (Node x l r) xs ls rs = x : xs (l : ls) (r : rs)
    f Leaf         _ _  _  = []

    b ls rs = foldr f b (reverse ls ++ reverse rs) [] []

Any place we see a foldr being run after a reverse or a concatenation, we know that we can remove a pass (in actual fact rewrite rules will likely do this automatically for us).

toList :: Tree a -> [a]
toList t = f b t [] []
  where
    f (Node x l r) xs ls rs = x : xs (l : ls) (r : rs)
    f Leaf         _  _  _  = []

    b ls rs = foldl (flip f) (foldl (flip f) b rs) ls [] []

Finally, since we’re building up the lists with : (in a linear way, i.e. we will not use the intermediate queues more than once), and we’re immediately consuming them with a fold, we can deforest the intermediate list, replacing every : with f (actually, it’s a little more tricky than that, since we replace the : with the reversed version of f, i.e. the one you would pass to foldr if you wanted it to act like foldl. This trick is explained in more detail in this post).

toList :: Tree a -> [a]
toList t = f t b id id
  where
    f (Node x l r) xs ls rs = x : xs (ls . f l) (rs . f r)
    f Leaf         _ _ _ = []

    b ls rs = ls (rs b) id id

Once you do that, however, you run into the “cannot construct the infinite type” error. To be precise:

• Occurs check: cannot construct the infinite type:
    a3 ~ (a3 -> c0) -> (a3 -> c1) -> [a2]

And this gives us the template for our newtype! It requires some trial and error, but you can see where some of the recursive calls are, and what you eventually get is the following:

newtype Q2 a
  = Q2
  { unQ2 :: (Q2 a -> Q2 a) -> (Q2 a -> Q2 a) -> [a]
  }

(You can remove the list type constructor at the end, I did as I thought it made it slightly more general). And from there we get back to the toList function.

References

Berger, Ulrich, Ralph Matthes, and Anton Setzer. 2019. “Martin Hofmann’s Case for Non-Strictly Positive Data Types.” In 24th international conference on types for proofs and programs (TYPES 2018), ed by. Peter Dybjer, José Espírito Santo, and Luís Pinto, 130:22. Leibniz international proceedings in informatics (LIPIcs). Dagstuhl, Germany: Schloss DagstuhlLeibniz-Zentrum fuer Informatik. doi:10.4230/LIPIcs.TYPES.2018.1. http://drops.dagstuhl.de/opus/volltexte/2019/11405.

Hofmann, Martin. 1993. “Non Strictly Positive Datatypes in System F.” https://www.seas.upenn.edu/~sweirich/types/archive/1993/msg00027.html.

Nipkow, Tobias, and Thomas Sewell. 2020. “Proof pearl: Braun trees.” In Certified programs and proofs, CPP 2020, ed by. J. Blanchette and C. Hritcu, –. ACM. http://www21.in.tum.de/~nipkow/pubs/cpp20.html.

Okasaki, Chris. 1997. “Three Algorithms on Braun Trees.” Journal of Functional Programming 7 (6) (November): 661–666. doi:10.1017/S0956796897002876. https://www.eecs.northwestern.edu/~robby/courses/395-495-2013-fall/three-algorithms-on-braun-trees.pdf.

Smith, Leon P. 2009. “Lloyd Allison’s Corecursive Queues: Why Continuations Matter.” The Monad.Reader 14 (14) (July): 28. https://meldingmonads.files.wordpress.com/2009/06/corecqueues.pdf.