Balancing Scans
Previously I tried to figure out a way to fold lists in a more balanced way. Usually, when folding lists, you’ve got two choices for your folds, both of which are extremely unbalanced in one direction or another. Jon Fairbairn wrote a more balanced version, which looked something like this:
treeFold :: (a -> a -> a) -> a -> [a] -> a
treeFold f = go
  where
    go x []    = x
    go a (b:l) = go (f a b) (pairMap l)
    pairMap (x:y:rest) = f x y : pairMap rest
    pairMap xs         = xs
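To make the balance visible, here's a small illustration of my own (not from the original post): folding with a parenthesising, non-associative operator shows how the elements get grouped.

-- My own illustration: a non-associative "combining" function exposes the
-- shape of the fold.
groups :: String
groups = treeFold (\l r -> "(" ++ l ++ r ++ ")") "0" (map show [1..6])
-- groups == "(((01)(23))((45)6))"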
Magical Speedups
The fold above is kind of magical: for a huge class of algorithms, it kind of “automatically” improves some factor of theirs from $\mathcal{O}(n)$ to $\mathcal{O}(\log n)$.
For instance: to sum a list of floats, foldl' (+) 0 will have an error growth of $\mathcal{O}(n)$; treeFold (+) 0, though, has an error growth of $\mathcal{O}(\log n)$.
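As a rough sketch of what that means in practice (my own illustration, not a benchmark from the post), compare the two sums on a list where every addend is tiny next to the running total:

import Data.List (foldl')

-- With foldl' every addition is (large running total + tiny addend), so a
-- little precision is lost at each of the n steps; with treeFold the two
-- operands at every node are of comparable size, so error only accumulates
-- once per level of the (logarithmically deep) tree.
xs :: [Float]
xs = replicate 1000000 0.1

leftSum, treeSum :: Float
leftSum = foldl' (+) 0 xs
treeSum = treeFold (+) 0 xs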
Similarly, using the following function to merge two sorted lists:
merge :: Ord a => [a] -> [a] -> [a]
merge [] ys = ys
merge (x:xs) ys = go x xs ys
  where
    go x xs [] = x : xs
    go x xs (y:ys)
      | x <= y    = x : go y ys xs
      | otherwise = y : go x xs ys
We get either insertion sort ($\mathcal{O}(n^2)$) or merge sort ($\mathcal{O}(n \log n)$) just depending on which fold you use.
foldr    merge [] . map pure -- n^2
treeFold merge [] . map pure -- n log(n)
I’ll give some more examples later, but effectively it gives us a better “divide” step in many divide and conquer algorithms.
Termination
As it was such a useful fold, and so integral to many tricky algorithms, I really wanted to have it available in Agda. Unfortunately, though, the functions (as defined above) aren't structurally terminating, and there doesn't seem to be an obvious way to make them so. I tried to make well-founded recursion work, but the proofs were ugly and slow.
However, we can use some structures from a previous post: the nested binary sequence, for instance. It has some extra nice properties: instead of nesting the types, we can just apply the combining function.
mutual
  data Tree {a} (A : Set a) : Set a where
    2^_×_+_ : ℕ → A → Node A → Tree A

  data Node {a} (A : Set a) : Set a where
    ⟨⟩  : Node A
    ⟨_⟩ : Tree A → Node A
module TreeFold {a} {A : Set a} (_*_ : A → A → A) where
  infixr 5 _⊛_ 2^_×_⊛_

  2^_×_⊛_ : ℕ → A → Tree A → Tree A
  2^ n × x ⊛ 2^ suc m × y + ys     = 2^ n × x + ⟨ 2^ m × y + ys ⟩
  2^ n × x ⊛ 2^ zero × y + ⟨⟩      = 2^ suc n × (x * y) + ⟨⟩
  2^ n × x ⊛ 2^ zero × y + ⟨ ys ⟩  = 2^ suc n × (x * y) ⊛ ys

  _⊛_ : A → Tree A → Tree A
  _⊛_ = 2^ 0 ×_⊛_

  ⟦_⟧↓ : Tree A → A
  ⟦ 2^ _ × x + ⟨⟩     ⟧↓ = x
  ⟦ 2^ _ × x + ⟨ xs ⟩ ⟧↓ = x * ⟦ xs ⟧↓

  ⟦_⟧↑ : A → Tree A
  ⟦ x ⟧↑ = 2^ 0 × x + ⟨⟩

  ⦅_,_⦆ : A → List A → A
  ⦅ x , xs ⦆ = ⟦ foldr _⊛_ ⟦ x ⟧↑ xs ⟧↓
Alternatively, we can get cons with the skew array:
infixr 5 _⊛_
_⊛_ : A → Tree A → Tree A
x ⊛ 2^ n × y + ⟨⟩                      = 2^ 0 × x + ⟨ 2^ n × y + ⟨⟩ ⟩
x ⊛ 2^ n × y₁ + ⟨ 2^ 0 × y₂ + ys ⟩     = 2^ suc n × (x * (y₁ * y₂)) + ys
x ⊛ 2^ n × y₁ + ⟨ 2^ suc m × y₂ + ys ⟩ = 2^ 0 × x + ⟨ 2^ n × y₁ + ⟨ 2^ m × y₂ + ys ⟩ ⟩
Using this, a proper and efficient merge sort is very straightforward:
data Total {a r} {A : Set a} (_≤_ : A → A → Set r) (x y : A) : Set (a ⊔ r) where
  x≤y : ⦃ _ : x ≤ y ⦄ → Total _≤_ x y
  y≤x : ⦃ _ : y ≤ x ⦄ → Total _≤_ x y
module Sorting {a r}
{A : Set a}
{_≤_ : A → A → Set r}
(_≤?_ : ∀ x y → Total _≤_ x y) where
data [∙] : Set a where
  ⊥   : [∙]
  [_] : A → [∙]
data _≥_ (x : A) : [∙] → Set (a ⊔ r) where
instance ⌈_⌉ : ∀ {y} → y ≤ x → x ≥ [ y ]
instance ⌊⊥⌋ : x ≥ ⊥
infixr 5 _∷_
data Ordered (b : [∙]) : Set (a ⊔ r) where
  []  : Ordered b
  _∷_ : ∀ x → ⦃ x≥b : x ≥ b ⦄ → (xs : Ordered [ x ]) → Ordered b
_∪_ : ∀ {b} → Ordered b → Ordered b → Ordered b
[] ∪ ys = ys
(x ∷ xs) ∪ ys = ⟅ x ∹ xs ∪ ys ⟆
  where
  ⟅_∹_∪_⟆ : ∀ {b} → ∀ x ⦃ _ : x ≥ b ⦄ → Ordered [ x ] → Ordered b → Ordered b
  ⟅_∪_∹_⟆ : ∀ {b} → Ordered b → ∀ y ⦃ _ : y ≥ b ⦄ → Ordered [ y ] → Ordered b
  merge : ∀ {b} x y ⦃ _ : x ≥ b ⦄ ⦃ _ : y ≥ b ⦄
        → Total _≤_ x y
        → Ordered [ x ]
        → Ordered [ y ]
        → Ordered b

  ⟅ x ∹ xs ∪ [] ⟆     = x ∷ xs
  ⟅ x ∹ xs ∪ y ∷ ys ⟆ = merge x y (x ≤? y) xs ys

  ⟅ [] ∪ y ∹ ys ⟆     = y ∷ ys
  ⟅ x ∷ xs ∪ y ∹ ys ⟆ = merge x y (x ≤? y) xs ys

  merge x y x≤y xs ys = x ∷ ⟅ xs ∪ y ∹ ys ⟆
  merge x y y≤x xs ys = y ∷ ⟅ x ∹ xs ∪ ys ⟆
open TreeFold

sort : List A → Ordered ⊥
sort = ⦅ _∪_ , [] ⦆ ∘ map (_∷ [])
Validity
It would be nice if we could verify these optimized versions of folds. Luckily, by writing them using foldr, we've stumbled into well-trodden ground: the foldr fusion law. It states that if you have some transformation $h$, and two binary operators $\oplus$ and $\otimes$, then:

$$\left(\forall x\, y.\ h\ (x \oplus y) = x \otimes h\ y\right) \implies h \circ \mathit{foldr}\ (\oplus)\ e = \mathit{foldr}\ (\otimes)\ (h\ e)$$

This fits right in with the function we used above. $h$ is ⟦_⟧↓, $\oplus$ is _⊛_, and $\otimes$ is whatever combining function was passed in. Let's prove the foldr fusion law, then, before we go any further.
module Proofs
  {a r}
  {A : Set a}
  {R : Rel A r}
  where

  infix 4 _≈_
  _≈_ = R

  open import Algebra.FunctionProperties _≈_

  foldr-universal : Transitive _≈_
                  → ∀ {b} {B : Set b} (h : List B → A) f e
                  → ∀[ f ⊢ Congruent₁ ]
                  → (h [] ≈ e)
                  → (∀ x xs → h (x ∷ xs) ≈ f x (h xs))
                  → ∀ xs → h xs ≈ foldr f e xs
  foldr-universal _○_ h f e f⟨_⟩ ⇒[] ⇒_∷_ [] = ⇒[]
  foldr-universal _○_ h f e f⟨_⟩ ⇒[] ⇒_∷_ (x ∷ xs) =
    (⇒ x ∷ xs) ○ f⟨ foldr-universal _○_ h f e f⟨_⟩ ⇒[] ⇒_∷_ xs ⟩

  foldr-fusion : Transitive _≈_
               → Reflexive _≈_
               → ∀ {b c} {B : Set b} {C : Set c} (f : C → A) {_⊕_ : B → C → C} {_⊗_ : B → A → A} e
               → ∀[ _⊗_ ⊢ Congruent₁ ]
               → (∀ x y → f (x ⊕ y) ≈ x ⊗ f y)
               → ∀ xs → f (foldr _⊕_ e xs) ≈ foldr _⊗_ (f e) xs
  foldr-fusion _○_ ∎ h {f} {g} e g⟨_⟩ fuse =
    foldr-universal _○_ (h ∘ foldr f e) g (h e) g⟨_⟩ ∎ (λ x xs → fuse x (foldr f e xs))
We’re not using the proofs in Agda’s standard library because these are tied to propositional equality. In other words, instead of using an abstract binary relation, they prove things over actual equality. That’s all well and good, but as you can see above, we don’t need propositional equality: we don’t even need the relation to be an equivalence, we just need transitivity and reflexivity.
After that, we can state precisely what correspondence the tree fold has, and under what conditions it does the same things as a fold:
module _ {_*_ : A → A → A} where
  open TreeFold _*_

  treeFoldHom : Transitive _≈_
              → Reflexive _≈_
              → Associative _*_
              → RightCongruent _*_
              → ∀ x xs
              → ⦅ x , xs ⦆ ≈ foldr _*_ x xs
  treeFoldHom _○_ ∎ assoc *⟨_⟩ b = foldr-fusion _○_ ∎ ⟦_⟧↓ ⟦ b ⟧↑ *⟨_⟩ (⊛-hom zero)
    where
    ⊛-hom : ∀ n x xs → ⟦ 2^ n × x ⊛ xs ⟧↓ ≈ x * ⟦ xs ⟧↓
    ⊛-hom n x (2^ suc m × y + ⟨⟩    ) = ∎
    ⊛-hom n x (2^ suc m × y + ⟨ ys ⟩) = ∎
    ⊛-hom n x (2^ zero × y + ⟨⟩     ) = ∎
    ⊛-hom n x (2^ zero × y + ⟨ ys ⟩ ) = ⊛-hom (suc n) (x * y) ys ○ assoc x y ⟦ ys ⟧↓
“Implicit” Data Structures
Consider the following implementation of the tree above in Haskell:
type Tree a = [(Int,a)]

cons :: (a -> a -> a) -> a -> Tree a -> Tree a
cons (*) = cons' 0
  where
    cons' n x []         = [(n,x)]
    cons' n x ((0,y):ys) = cons' (n+1) (x * y) ys
    cons' n x ((m,y):ys) = (n,x) : (m-1,y) : ys
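As a quick sanity check of my own (not from the post): using (+) as the combining function makes every stored value the size of its tree, and the gaps spell out the binary representation of the length.

-- 13 elements give trees of sizes 1, 4 and 8 (13 = 0b1101), with the same
-- gap structure the post shows below.
sizes :: [(Int, Int)]
sizes = foldr (cons (+)) [] (replicate 13 (1 :: Int))
-- sizes == [(0,1),(1,4),(0,8)]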
The cons
function “increments” that list as if it were
the bits of a binary number. Now, consider using the merge
function from above, in a pattern like this:
f = foldr (cons merge . pure) []
What does f
build? A list of lists, right?
Kind of. That's what's built in terms of what's observable, but what's actually stored in memory is a bunch of thunks. The shape of those thunks is what I'm interested in. We can try and see what they look like by using a data structure that doesn't force on merge:
data Tree a = Leaf a | Tree a :*: Tree a
f = foldr (cons (:*:) . Leaf) []
Using a handy tree-drawing function, we can see what
f [1..13]
looks like:
[(0,*),(1,*),(0,*)]
└1  │ ┌2   │  ┌6
    │┌┤    │ ┌┤
    ││└3   │ │└7
    └┤     │┌┤
     │┌4   │││┌8
     └┤    ││└┤
      └5   ││ └9
           └┤
            │ ┌10
            │┌┤
            ││└11
            └┤
             │┌12
             └┤
              └13
It’s a binomial heap! It’s a list of trees, each one containing $2^n$ elements. But they’re not in heap order, you say? Well, as a matter of fact, they are. It just hasn’t been evaluated yet. Once we force, say, the first element, the rest will shuffle themselves into a tree of thunks.
This illustrates a pretty interesting similarity between binomial heaps and merge sort. Performance-wise, though, there’s another interesting property: the thunks stay thunked. In other words, if we do a merge sort via:
sort = foldr (merge . snd) [] . foldr (cons merge . pure) []
We could instead freeze the fold, and look at it at every point:
sortPrefixes = map (foldr (merge . snd) []) . scanl (flip (cons merge . pure)) []

>>> sortPrefixes [1,4,2,3,5]
[[],[1],[1,4],[1,2,4],[1,2,3,4],[1,2,3,4,5]]
And sortPrefixes is only $\mathcal{O}(n^2)$ (rather than the $\mathcal{O}(n^2 \log n)$ you'd get from sorting every prefix from scratch).
I confess I don’t know of a use for sorted prefixes, but it should
illustrate the general idea: we get a pretty decent batching of
operations, with the ability to freeze at any point in time. The other
nice property (which I mentioned in the last post) is that any of the
tree folds are extremely parallel.
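For what it's worth, here's a minimal sketch of how that parallelism could be exposed; this is my own code using the parallel package (compile with -threaded), not something from the post, and parTreeFold is a made-up name:

import Control.Parallel (par)

-- Spark each pairwise combination, so every level of the tree can be
-- evaluated in parallel; the fold itself is unchanged otherwise.
parTreeFold :: (a -> a -> a) -> a -> [a] -> a
parTreeFold f = go
  where
    go x []    = x
    go a (b:l) = go (f a b) (pairMap l)
    pairMap (x:y:rest) = let z = f x y in z `par` (z : pairMap rest)
    pairMap xs         = xs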
Random Shuffles
There’s a great article on shuffling in Haskell which provides an implementation of a perfect random shuffle. Unfortunately, the Fisher-Yates shuffle isn’t applicable in a pure functional setting, so you have to be a little cleverer.
The first implementation most people jump to (certainly the one I thought of) is to assign everything in the sequence a random number, and then sort according to that number. Perhaps surprisingly, this isn’t perfectly random! It’s a little weird, but the example in the article explains it well: basically, for $n$ elements, your random numbers will have $n^n$ possible values, but the output of the sort will have $n!$ possible values. Since they don’t divide into each other evenly, you’re going to have some extra weight on some permutations, and less on others.
Instead, we can generate a random factoradic number. A factoradic number is one where the $n$th digit is in base $n$. Because of this, a factoradic number with $n$ digits has $n!$ possible values: exactly what we want.
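To make that concrete, here's a small helper of my own (toFactoradic is a made-up name, not something from the article) that reads off the factoradic digits of a number:

-- The n factoradic digits of k (for 0 <= k < n!), least significant first:
-- the i-th digit is in base i, so the first digit is always 0.
toFactoradic :: Int -> Int -> [Int]
toFactoradic n k = 0 : zipWith mod (scanl div k [2 .. n]) [2 .. n]
-- toFactoradic 4 5 == [0,1,2,0]   -- 5 = 0*0! + 1*1! + 2*2! + 0*3!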
In the article, the digits of the number are used to pop values from a binary tree. Because the last digit will have $n$ possible values, and the second last $n-1$, and so on, you can keep popping without hitting an empty tree.
This has the correct time complexity ($\mathcal{O}(n \log n)$), but there's a lot of overhead: building the tree, then indexing into it, then rebuilding after each pop, and so on.
We’d like to just sort the list, according to the indices.
The problem is that the indices are relative: if you want to
cons
something onto the list, you have to increment the
rest of the indices, as they’ve all shifted right by one.
What we’ll do instead is use the indices as gaps. Our merge function looks like the following:
merge [] ys = ys
merge xs [] = xs
merge ((x,i):xs) ((y,j):ys)
  | i <= j    = (x,i) : merge xs ((y,j-i):ys)
  | otherwise = (y,j) : merge ((x,i-j-1):xs) ys
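To get a feel for how the gaps drive the merge, here's a tiny worked example of my own:

-- "a" starts with a gap of 1; placing "b" (gap 0, plus the element itself)
-- uses that up, so "a" comes out immediately after "b".
-- >>> merge [("a",1)] [("b",0),("c",0)]
-- [("b",0),("a",0),("c",0)]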
With that, and the same cons
as above, we get a very
simple random shuffle algorithm:
shuffle xs = map fst
           . foldr (merge . snd) []
           . foldr f (const []) xs
  where
    f x xs (i:is) = cons merge [(x,i)] (xs is)
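The indices fed in here are the digits of a random factoradic number: the first element can end up in any of $n$ places, the next in any of $n-1$, and so on. A sketch of my own for generating them with System.Random (randomIndices is a made-up name, not from the post):

import System.Random (randomRIO)
import Control.Monad (forM)

-- Draw the k-th digit uniformly from [0 .. k-1], most significant first,
-- so the last element always gets 0.
randomIndices :: Int -> IO [Int]
randomIndices n = forM [n, n-1 .. 1] (\k -> randomRIO (0, k - 1))

-- e.g. shuffle "abcde" <$> randomIndices 5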
The other interesting thing about this algorithm is that it can use Peano numbers without taking too much of a performance hit:
merge : ∀ {a} {A : Set a} → List (A × ℕ) → List (A × ℕ) → List (A × ℕ)
merge xs [] = xs
merge {A = A} xs ((y , j) ∷ ys) = go-r xs y j ys
  where
  go-l : A → ℕ → List (A × ℕ) → List (A × ℕ) → List (A × ℕ)
  go-r : List (A × ℕ) → A → ℕ → List (A × ℕ) → List (A × ℕ)
  go   : ℕ → ℕ → A → ℕ → List (A × ℕ) → A → ℕ → List (A × ℕ) → List (A × ℕ)

  go i       zero    x i′ xs y j′ ys = (y , j′) ∷ go-l x i xs ys
  go zero    (suc j) x i′ xs y j′ ys = (x , i′) ∷ go-r xs y j ys
  go (suc i) (suc j) = go i j

  go-l x i xs []             = (x , i) ∷ xs
  go-l x i xs ((y , j) ∷ ys) = go i j x i xs y j ys

  go-r [] y j ys             = (y , j) ∷ ys
  go-r ((x , i) ∷ xs) y j ys = go i j x i xs y j ys

shuffle : ∀ {a} {A : Set a} → List A → List ℕ → List A
shuffle {a} {A} xs i = map proj₁ (⦅ [] , zip-inds xs i ⦆)
  where
  open TreeFold {a} {List (A × ℕ)} merge

  zip-inds : List A → List ℕ → List (List (A × ℕ))
  zip-inds [] inds = []
  zip-inds (x ∷ xs) [] = ((x , 0) ∷ []) ∷ zip-inds xs []
  zip-inds (x ∷ xs) (i ∷ inds) = ((x , i) ∷ []) ∷ zip-inds xs inds
I don’t know exactly what the complexity of this is, but I think it should be better than the usual approach of popping from a vector.
Future Stuff
This is just a collection of random thoughts for now, but I intend to work on using these folds to see if there are any other algorithms they can be useful for. In particular, I think I can write a version of Data.List.permutations which benefits from sharing. And I’m interested in using the implicit binomial heap for some search problems.