Monuses and Heaps
This post is about a simple algebraic structure that I have found
useful for algorithms that involve searching or sorting based on some
ordered weight. I used it a bit in a pair of papers (2021; 2025), and more recently I used it to
implement a version of the Phases type
(Easterly
2019) that supported arbitrary keys, inspired by some work by
Blöndal (2025a; 2025b)
and Visscher
(2025).
The algebraic structure in question is a monus, which is a kind of monoid that supports a partial subtraction operation (that subtraction operation, denoted by the symbol ∸, is also often called a “monus”). However, before giving the full definition of the structure, let me first try to motivate its use. The context here is heap-based algorithms. For the purposes of this post, a heap is a tree that obeys the “heap property”; i.e. every node in the tree has some “weight” attached to it, and every parent node has a weight less than or equal to the weight of each of its children. So, for a tree like the following:
┌d
┌b┤
a┤ └e
└c

The heap property is satisfied when a ≤ b, a ≤ c, b ≤ d, and b ≤ e.
Usually, we also want our heap structure to have an operation like
popMin :: Heap k v -> Maybe (v, Heap k v)
that returns the least-weight value in the heap paired with the rest of
the heap. If this operation is efficient, we can use the heap to
efficiently implement sorting algorithms, graph search, etc. In fact,
let me give the whole basic interface for a heap here:
popMin :: Heap k v -> Maybe (v, Heap k v)
insert :: k -> v -> Heap k v -> Heap k v
empty :: Heap k v

Using these functions it’s not hard to see how we can implement a sorting algorithm:
sortOn :: (a -> k) -> [a] -> [a]
sortOn k = unfoldr popMin . foldr (\x -> insert (k x) x) empty

The monus becomes relevant when the weight involved is some kind of monoid. This is quite a common situation: if we were using the heap for graph search (least-cost paths or something), we would expect the weight to correspond to path costs, and we would expect that we can add the costs of paths in a kind of monoidal way. Furthermore, we would probably expect the monoidal operations to relate to the order in some coherent way. A monus (Amer 1984) is an ordered monoid where the order itself can be defined in terms of the monoidal operations:

x <= y ⟺ ∃z. x <> z = y
I read this definition as saying “x is less than y iff there is some z that fits between x and y”. In other words, the gap between x and y has to exist, and it is equal to y ∸ x.
Notice that this order definition won’t work for groups like ℤ. For a group, we can always find some z that will fit the existential (specifically, z = x⁻¹ <> y). Monuses, then, tend to be positive monoids: in fact, many monuses are the positive cones of some group (ℕ is the positive cone of ℤ).
We can derive a lot of useful properties from this basic structure. For example, if the order above is total, then we can derive the binary subtraction operator mentioned above:

y ∸ x = z,  if x <= y, where z is the witness with x <> z = y
y ∸ x = ε,  otherwise (where ε is the monoid unit)
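For a concrete instance of this derivation, here is a minimal sketch of the derived operator on the naturals (the name monus is my own choice for this standalone snippet; the text writes it ∸):

```haskell
import Numeric.Natural (Natural)

-- Derived subtraction on the naturals (the positive cone of ℤ): return
-- the gap z with x + z = y when it exists, and the unit 0 otherwise.
monus :: Natural -> Natural -> Natural
monus y x
  | x <= y    = y - x   -- the unique z such that x + z = y
  | otherwise = 0       -- no gap exists: truncate to the monoid unit
```

So monus 5 3 is 2 (the gap between 3 and 5), while monus 3 5 truncates to 0.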
If we require the underlying monoid to be commutative, and we further require the derived order to be total and antisymmetric, we get the particular flavour of monus I worked with in a pair of papers on graph search (2021; 2025). In this post I will actually be working with a weakened form of the algebra that I will define shortly.
Getting back to our heap from above, with this new order defined, we can see that the heap property actually tells us something about the makeup of the weights in the tree. Instead of every child just having a weight equal to some arbitrary quantity, the heap property tells us that each child weight has to be made up of the combination of its parent’s weight and some difference.
┌d ┌b•(d∸b)
┌b┤ ┌a•(b∸a)┤
a┤ └e = a┤ └b•(e∸b)
└c      └a•(c∸a)

This observation gives us an opportunity for a different representation: instead of storing the full weight at each node, we could instead just store the difference.
┌d∸b
┌b∸a┤
a┤ └e∸b
└c∸a

Just in terms of data structure design, I prefer this version: if we wanted to write down a type of heaps using the previous design, we would first define the type of trees, and then separately write a predicate corresponding to the heap property. With this design, it is impossible to write down a tree that doesn’t satisfy the heap property.
More practically, though, using this algebraic structure when working with heaps enables some optimisations that might be difficult to implement otherwise. The strength of this representation is that it allows for efficient relative and global computation: now, if we wanted to add some quantity to every weight in the tree, we can do it just by adding the weight to the root node.
Monuses in Haskell
To see some examples of how to use this pattern, let’s first write a class for Haskell monuses:
class (Semigroup a, Ord a) => Monus a where
(∸) :: a -> a -> a

You’ll notice that we’re requiring Semigroup here, not Monoid. That’s because one of the nice uses of this pattern actually works with a weakening of the usual monus algebra; this weakening only requires Semigroup, and the following two laws:

x <= y ==> x <> (y ∸ x) = y
x <= y ==> z <> x <= z <> y
A straightforward monus instance is the following:
instance (Num a, Ord a) => Monus (Sum a) where
(∸) = (-)

Pairing Heaps in Haskell
Next, let’s look at a simple heap implementation. I will always go for pairing heaps (Fredman et al. 1986) in Haskell; they are extremely simple to implement, and (as long as you don’t have significant persistence requirements) their performance seems to be the best of the available pointer-based heaps (Larkin, Sen, and Tarjan 2013). Here is the type definition:
data Root k v = Root !k v [Root k v]
type Heap k v = Maybe (Root k v)

A Root is a
non-empty pairing heap; the Heap type
represents possibly-empty heaps. The key function to implement is the
merging of two heaps; we can accomplish this as an implementation of the
semigroup <>.
instance Monus k => Semigroup (Root k v) where
Root xk xv xs <> Root yk yv ys
| xk <= yk = Root xk xv (Root (yk ∸ xk) yv ys : xs)
| otherwise = Root yk yv (Root (xk ∸ yk) xv xs : ys)

The only difference between this and a normal pairing heap merge is
the use of ∸ in the key of the
child node (yk ∸ xk and xk ∸ yk). This difference ensures that
each child only holds the difference of the weight between itself and
its parent.
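To see this concretely, here is a standalone sketch of the merge (the Semigroup (<>) written as a plain merge function, with Eq/Show deriving added for checking; neither change is in the original):

```haskell
import Data.Semigroup (Sum (..))

class (Semigroup a, Ord a) => Monus a where
  (∸) :: a -> a -> a

instance (Num a, Ord a) => Monus (Sum a) where
  (∸) = (-)

data Root k v = Root !k v [Root k v] deriving (Eq, Show)

-- The pairing-heap merge from above: the loser becomes a child of the
-- winner, keyed by the *difference* between the two weights.
merge :: Monus k => Root k v -> Root k v -> Root k v
merge (Root xk xv xs) (Root yk yv ys)
  | xk <= yk  = Root xk xv (Root (yk ∸ xk) yv ys : xs)
  | otherwise = Root yk yv (Root (xk ∸ yk) xv xs : ys)
```

Merging singletons keyed 3 and 5 yields a root keyed 3 whose child is keyed 2 (= 5 ∸ 3), not 5.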
It’s worth working out why the weakened monus laws above are all we need in order to maintain the heap property on this structure.
The rest of the methods are implemented the same as their
implementations on a normal pairing heap. First, we have the pairing
merge of a list of heaps, here given as an implementation of the
semigroup method sconcat:
sconcat (x1 :| []) = x1
sconcat (x1 :| [x2]) = x1 <> x2
sconcat (x1 :| x2 : x3 : xs) = (x1 <> x2) <> sconcat (x3 :| xs)
merges :: Monus k => [Root k v] -> Heap k v
merges = fmap sconcat . nonEmpty

The pattern of this two-level merge is what gives the pairing heap its excellent performance.
The Heap type
derives its monoid instance from the monoid instance on Maybe (instance Semigroup a => Monoid (Maybe a)),
so we can implement insert like
so:
insert :: Monus k => k -> v -> Heap k v -> Heap k v
insert k v hp = Just (Root k v []) <> hp

And popMin is also relatively
simple:
delay :: Semigroup k => k -> Heap k v -> Heap k v
delay by = fmap (\(Root k v xs) -> Root (by <> k) v xs)
popMin :: Monus k => Heap k v -> Maybe (v,Heap k v)
popMin = fmap (\(Root k v xs) -> (v, k `delay` merges xs))

Notice that we delay the rest
of the heap, because all of its entries need to be offset by the weight
of the previous root node. Thankfully, because we’re only storing the
differences, we can “modify” every weight by just increasing the weight
of the root, making this an O(1) operation.
Finally, we can implement heap sort like so:
sortOn :: Monus k => (a -> k) -> [a] -> [a]
sortOn k = unfoldr popMin . foldl' (\xs x -> insert (k x) x xs) Nothing

And it does indeed work:
>>> sortOn Sum [3,4,2,5,1]
[1,2,3,4,5]

Here is a trace of the output:
Trace
Input Heap:
list:
[3,4,2,5,1]
[4,2,5,1] 3
[2,5,1] 3─1
[5,1] 2─1─1
[1] ┌3
2┤
└1─1
[] ┌3
1─1┤
└1─1
Output Heap:
list:
[] ┌3
1─1┤
└1─1
[1] ┌3
2┤
└1─1
[1,2] ┌2
3┤
└1
[1,2,3] 4─1
[1,2,3,4] 5
[1,2,3,4,5]
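For reference, the pieces above assemble into a self-contained, runnable sketch (nothing new here beyond gathering the imports):

```haskell
import Data.List (foldl', unfoldr)
import Data.List.NonEmpty (NonEmpty (..), nonEmpty)
import Data.Semigroup (Sum (..))

class (Semigroup a, Ord a) => Monus a where
  (∸) :: a -> a -> a

instance (Num a, Ord a) => Monus (Sum a) where
  (∸) = (-)

data Root k v = Root !k v [Root k v]
type Heap k v = Maybe (Root k v)

instance Monus k => Semigroup (Root k v) where
  Root xk xv xs <> Root yk yv ys
    | xk <= yk  = Root xk xv (Root (yk ∸ xk) yv ys : xs)
    | otherwise = Root yk yv (Root (xk ∸ yk) xv xs : ys)
  sconcat (x1 :| [])           = x1
  sconcat (x1 :| [x2])         = x1 <> x2
  sconcat (x1 :| x2 : x3 : xs) = (x1 <> x2) <> sconcat (x3 :| xs)

merges :: Monus k => [Root k v] -> Heap k v
merges = fmap sconcat . nonEmpty

insert :: Monus k => k -> v -> Heap k v -> Heap k v
insert k v hp = Just (Root k v []) <> hp

delay :: Semigroup k => k -> Heap k v -> Heap k v
delay by = fmap (\(Root k v xs) -> Root (by <> k) v xs)

popMin :: Monus k => Heap k v -> Maybe (v, Heap k v)
popMin = fmap (\(Root k v xs) -> (v, k `delay` merges xs))

sortOn :: Monus k => (a -> k) -> [a] -> [a]
sortOn k = unfoldr popMin . foldl' (\xs x -> insert (k x) x xs) Nothing
```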
While the heap implementation presented here is pretty efficient,
note that we could significantly improve its performance with a few
optimisations: first, we could unpack all of the constructors, using a
custom list definition in Root instead
of Haskell’s built-in lists; second, in foldl' we could avoid the Maybe wrapper
by building a non-empty heap. There are probably more small
optimisations available as well.
Retrieving a Normal Heap
A problem with the definition of sortOn above is that it requires a
Monus
instance on the keys, but it only really needs Ord. It seems
that by switching to the Monus-powered
heap we have lost some generality.
Luckily, there are two monuses we can use to solve this problem:
instance Ord a => Monus (Max a) where
x ∸ y = x
instance Ord a => Monus (Last a) where
x ∸ y = x

The Max semigroup
uses the max operation,
and the Last semigroup
returns its second operand.
instance Ord a => Semigroup (Max a) where
Max x <> Max y = Max (max x y)
instance Semigroup (Last a) where
x <> y = y

While the Monus
instances here might seem degenerate, they do actually satisfy the Monus laws as
given above.
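These instances can also be spot-checked mechanically; here is a small sketch that exhaustively tests the two weakened laws over a finite universe of values (the lawful helper is my own scaffolding, not from the original):

```haskell
import Data.Semigroup (Last (..), Max (..))

class (Semigroup a, Ord a) => Monus a where
  (∸) :: a -> a -> a

instance Ord a => Monus (Max a) where
  x ∸ _ = x

instance Ord a => Monus (Last a) where
  x ∸ _ = x

-- Check both weakened monus laws over a small finite universe:
--   x <= y ==> x <> (y ∸ x) = y       (the gap law)
--   x <= y ==> z <> x <= z <> y       (monotonicity)
lawful :: (Monus a, Eq a) => [a] -> Bool
lawful us =
  and [ x <> (y ∸ x) == y | x <- us, y <- us, x <= y ]
    && and [ (z <> x) <= (z <> y) | x <- us, y <- us, z <- us, x <= y ]
```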
Max and Last Monus laws
Max:
x <= y ==> x <> (y ∸ x) = y
Max x <= Max y ==> Max x <> (Max y ∸ Max x) = Max y
Max x <= Max y ==> Max x <> Max y = Max y
Max x <= Max y ==> Max (max x y) = Max y
x <= y ==> max x y = y
x <= y ==> z <> x <= z <> y
Max x <= Max y ==> Max z <> Max x <= Max z <> Max y
Max x <= Max y ==> Max (max z x) <= Max (max z y)
x <= y ==> max z x <= max z y

Last:
x <= y ==> x <> (y ∸ x) = y
Last x <= Last y ==> Last x <> (Last y ∸ Last x) = Last y
Last x <= Last y ==> Last x <> Last y = Last y
Last x <= Last y ==> Last y = Last y
x <= y ==> z <> x <= z <> y
Last x <= Last y ==> Last z <> Last x <= Last z <> Last y
Last x <= Last y ==> Last x <= Last y

Either Max or Last will
work; semantically, there’s no real difference. Last avoids
some comparisons, so we can use that:
sortOn' :: Ord k => (a -> k) -> [a] -> [a]
sortOn' k = sortOn (Last . k)

Phases as a Pairing Heap
The Phases
applicative (Easterly
2019) is an Applicative
transformer that allows reordering of Applicative
effects in an easy-to-use, high-level way. The interface looks like
this:
phase :: Natural -> f a -> Phases f a
runPhases :: Applicative f => Phases f a -> f a
instance Applicative f => Applicative (Phases f)

And we can use it like this:
phased :: IO String
phased = runPhases $ sequenceA
[ phase 3 $ emit 'a'
, phase 2 $ emit 'b'
, phase 1 $ emit 'c'
, phase 2 $ emit 'd'
, phase 3 $ emit 'e' ]
where emit c = putChar c >> return c
>>> phased
cbdae
"abcde"

The above computation performs the effects in the order
dictated by their phases (this is why the characters are printed out in
the order cbdae), but the pure
value (the returned string) has its order unaffected.
I have written about this type before, and in a handful of
papers (2021; Gibbons et
al. 2022; Gibbons et al. 2023), but more recently
Blöndal (2025a) started looking into trying to
use the Phases pattern
with arbitrary ordered keys (Visscher 2025;
Blöndal
2025b). There are a lot of different directions you can go
from the Phases type;
what interested me most immediately was the idea of implementing the
type efficiently using standard data-structure representations. If our
core goal here is to order some values according to a key, then that is
clearly a problem that a heap should solve: enter the free applicative
pairing heap.
Here is the type’s definition:
data Heap k f a where
Pure :: a -> Heap k f a
Root :: !k -> (x -> y -> a) -> f x -> Heaps k f y -> Heap k f a
data Heaps k f a where
Nil :: Heaps k f ()
App :: !k -> f x -> Heaps k f y -> Heaps k f z -> Heaps k f (x,y,z)

We have had to change a few aspects of the original pairing heap, but
the overall structure remains. The entries in this heap are now
effectful computations: the fs.
The data structure also contains some scaffolding to reconstruct the
pure values “inside” each effect when we actually run the heap.
The root-level structure is the Heap: this can
either be Pure
(corresponding to an empty heap: notice that, though this constructor
has some contents (the a), it is
still regarded as “empty” because it contains no effects (f)); or a Root, which is
a singleton value, paired with the list of sub-heaps represented by the
Heaps
type. We’re using the usual Yoneda-ish trick here to allow the top-level
data type to be parametric and a Functor, by
storing the function x -> y -> a.
The Heaps type
then plays the role of [Root k v] in
the previous pairing heap implementation; here, we have inlined all of
the constructors so that we can get all of the types to line up.
Remember, this is a heap of effects, not of pure values: the
pure values need to be able to be reconstructed to one single top-level
a when we run the heap at the
end.
Merging two heaps happens in the Applicative
instance itself:
instance Functor (Heap k f) where
fmap f (Pure x) = Pure (f x)
fmap f (Root k c x xs) = Root k (\a b -> f (c a b)) x xs
instance Monus k => Applicative (Heap k f) where
pure = Pure
Pure f <*> xs = fmap f xs
xs <*> Pure f = fmap ($ f) xs
Root xk xc xs xss <*> Root yk yc ys yss
| xk <= yk = Root xk (\a (b,c,d) -> xc a d (yc b c)) xs (App (yk ∸ xk) ys yss xss)
| otherwise = Root yk (\a (b,c,d) -> xc b c (yc a d)) ys (App (xk ∸ yk) xs xss yss)To actually run the heap we will use the following two functions:
merges :: (Monus k, Applicative f) => Heaps k f a -> Heap k f a
merges Nil = Pure ()
merges (App k1 e1 t1 Nil) = Root k1 (,,()) e1 t1
merges (App k1 e1 t1 (App k2 e2 t2 xs)) =
(Root k1 (\a b cd es -> (a,b, cd es)) e1 t1 <*> Root k2 (,,) e2 t2) <*> merges xs
runHeap :: (Monus k, Applicative f) => Heap k f a -> f a
runHeap (Pure x) = pure x
runHeap (Root _ c x xs) = liftA2 c x (runHeap (merges xs))

And we can lift a computation into Phases like
so:
phase :: k -> f a -> Heap k f a
phase k xs = Root k const xs Nil

Stabilising Phases
There’s a problem. A heap sort based on a pairing heap isn’t
stable. That means that the order of effects here can vary for
two effects in the same phase. If we look back to the example with the
strings we saw above, that means that outputs like cdbea would be possible (in actual
fact, we don’t get any reordering in this particular example, but that’s
just an accident of the way the applicative operators are associated
under the hood).
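For effects in distinct phases, by contrast, the order is fully determined. The following self-contained sketch assembles this section’s definitions and checks that: the (,) String writer applicative and the phasedDemo name are my own choices (the original uses IO), used here purely to record the order in which effects run.

```haskell
{-# LANGUAGE GADTs #-}

import Control.Applicative (liftA2)
import Data.Semigroup (Sum (..))

class (Semigroup a, Ord a) => Monus a where
  (∸) :: a -> a -> a

instance (Num a, Ord a) => Monus (Sum a) where
  (∸) = (-)

data Heap k f a where
  Pure :: a -> Heap k f a
  Root :: !k -> (x -> y -> a) -> f x -> Heaps k f y -> Heap k f a

data Heaps k f a where
  Nil :: Heaps k f ()
  App :: !k -> f x -> Heaps k f y -> Heaps k f z -> Heaps k f (x, y, z)

instance Functor (Heap k f) where
  fmap f (Pure x)        = Pure (f x)
  fmap f (Root k c x xs) = Root k (\a b -> f (c a b)) x xs

instance Monus k => Applicative (Heap k f) where
  pure = Pure
  Pure f <*> xs = fmap f xs
  xs <*> Pure x = fmap ($ x) xs
  Root xk xc xs xss <*> Root yk yc ys yss
    | xk <= yk  = Root xk (\a (b, c, d) -> xc a d (yc b c)) xs (App (yk ∸ xk) ys yss xss)
    | otherwise = Root yk (\a (b, c, d) -> xc b c (yc a d)) ys (App (xk ∸ yk) xs xss yss)

merges :: (Monus k, Applicative f) => Heaps k f a -> Heap k f a
merges Nil                = Pure ()
merges (App k1 e1 t1 Nil) = Root k1 (\a b -> (a, b, ())) e1 t1
merges (App k1 e1 t1 (App k2 e2 t2 xs)) =
  (Root k1 (\a b cd es -> (a, b, cd es)) e1 t1 <*> Root k2 (\a b c -> (a, b, c)) e2 t2)
    <*> merges xs

runHeap :: (Monus k, Applicative f) => Heap k f a -> f a
runHeap (Pure x)        = pure x
runHeap (Root _ c x xs) = liftA2 c x (runHeap (merges xs))

phase :: k -> f a -> Heap k f a
phase k xs = Root k const xs Nil

-- Record effect order with the (,) String writer applicative: with all
-- phases distinct, effects run in ascending phase order, while the pure
-- result keeps its syntactic order.
phasedDemo :: (String, String)
phasedDemo = runHeap (sequenceA
  [ phase (Sum 3) ("a", 'a')
  , phase (Sum 2) ("b", 'b')
  , phase (Sum 1) ("c", 'c')
  , phase (Sum 5) ("d", 'd')
  , phase (Sum 4) ("e", 'e') ])
```

Here phasedDemo evaluates to ("cbaed", "abcde"): effects in phase order, pure values untouched.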
This is problematic because we would expect effects in the same phase to behave as if they were normal applicative effects, sequenced according to their syntactic order. It also means that the applicative transformer breaks the applicative laws, because effects might be reordered according to the association of the applicative operators, which should lawfully be associative.
To make the sort stable, we could layer the heap effect with some state effect that would tag each effect with its order. However, that would hurt efficiency and composability: it would force us to linearise the whole heap sort procedure, where currently different branches of the tree can compute completely independently of each other. The solution comes in the form of another monus: the key monus.
data Key k = !k :* {-# UNPACK #-} !Int deriving (Eq, Ord)

A Key k is some
ordered key k coupled with an
Int that
represents the offset between the original position and the current
position of the key. In this way, when two keys compare as equal, we can
cascade on to compare their original positions, thereby maintaining
their original order when there is ambiguity caused by a key collision.
However, in contrast to the approach of walking over the data once and
tagging it all with positions, this approach keeps the location
information completely local: we never need to know that some key is in
the ith position in the original sequence, only that it has moved n steps from its original position.
The instances are as follows:
instance Semigroup (Key k) where
(xk :* xi) <> (yk :* yi) = yk :* (xi + yi)
instance Ord k => Monus (Key k) where
(xk :* xi) ∸ (yk :* yi) = xk :* (xi - yi)

This instance is basically a combination of the Last semigroup and the additive group ℤ. We could make a slightly more generalised version of Key that is the combination of any monus and ℤ, but since I’m only going to be using this type for simple sorting-like algorithms I will leave that generalisation for another time.
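The Key instances can be checked against the two weakened monus laws exhaustively over a small universe; in this sketch the keys list and lawsHold helper are my own scaffolding (and I drop the UNPACK pragma for brevity):

```haskell
data Key k = !k :* !Int deriving (Eq, Ord)

class (Semigroup a, Ord a) => Monus a where
  (∸) :: a -> a -> a

-- Key combines like Last on the key part, and like (ℤ, +) on the offset.
instance Semigroup (Key k) where
  (_ :* xi) <> (yk :* yi) = yk :* (xi + yi)

instance Ord k => Monus (Key k) where
  (xk :* xi) ∸ (_ :* yi) = xk :* (xi - yi)

-- A small universe of keys to check the laws against.
keys :: [Key Int]
keys = [k :* i | k <- [0 .. 2], i <- [0 .. 2]]

lawsHold :: Bool
lawsHold =
  and [ x <> (y ∸ x) == y | x <- keys, y <- keys, x <= y ]
    && and [ (z <> x) <= (z <> y) | x <- keys, y <- keys, z <- keys, x <= y ]
```

The derived Ord also exhibits the tie-break: keys with equal k compare by offset, so (5 :* 0) < (5 :* 1).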
The stable heap type is as follows:
data Stable k f a
= Stable { size :: {-# UNPACK #-} !Int
, heap :: !(Heap (Key k) f a) }

We need to track the size of the heap so that we can supply right-hand operands with their offsets. Because we’re storing differences, we can add an offset to every entry in a heap in O(1) time by simply adding to the root:
delayKey :: Int -> Heap (Key k) f a -> Heap (Key k) f a
delayKey _ hp@(Pure _) = hp
delayKey n (Root (k :* m) c x xs) = Root (k :* (n + m)) c x xs

Finally, using this we can implement the Applicative
instance and the rest of the interface:
instance Ord k => Applicative (Stable k f) where
pure = Stable 0 . pure
Stable n xs <*> Stable m ys = Stable (n+m) (xs <*> delayKey n ys)
runStable :: (Applicative f, Ord k) => Stable k f a -> f a
runStable = runHeap . heap
stable :: Ord k => k -> f a -> Stable k f a
stable k fa = Stable 1 (phase (k :* 0) fa)

This is a pure, optimally efficient implementation of Phases ordered by an arbitrary totally ordered key.
Local Computation in a Monadic Heap
In (2021), I developed a monadic heap based on the free monad transformer.
newtype Search k a = Search { runSearch :: [Either a (k, Search k a)] }

This type is equivalent to the free
monad transformer over the list monad and (,) k functor (i.e. the writer
monad).
Search k a ≅ FreeT ((,) k) [] a

In the paper (2021) we extended the type to become a
full monad transformer, replacing lists with ListT. This
let us order the effects according to the weight k; however, for this example we only
need the simplified type, which lets us order the values according to
k.
This Search type
follows the structure of a pairing heap (although not as closely as the
version above). However, this type is interesting because
semantically it needs the weights to be stored as differences,
rather than absolute weights. As a free monad transformer, the Search type
layers effects on top of each other; we can later interpret those layers
by collapsing them together using the monadic join. In the case of Search, those
layers are drawn from the list monad and the (,) k functor (writer monad). That
means that if we have some heap representing the tree from above:
Search [ Right (a, Search [ Right (b, Search [ Right (d, Search [Left x])
, Right (e, Search [Left y])])
, Right (c, Search [Left z])])]

When we collapse this computation down to the leaves, the weights we will get are the following:
[(a <> b <> d, x), (a <> b <> e, y), (a <> c, z)]

So, if we want the weights to line up properly, we need to store the differences.
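This collapse can be sketched directly; the flatten function and the concrete Sum Int weights below are my own illustration (using the tree from earlier with a = 1, b = 3, c = 6, d = 7, e = 8, stored as differences):

```haskell
import Data.Semigroup (Sum (..))

newtype Search k a = Search { runSearch :: [Either a (k, Search k a)] }

-- Collapse the layers, accumulating the weight along each path to a leaf.
flatten :: Semigroup k => k -> Search k a -> [(k, a)]
flatten acc (Search xs) = concatMap step xs
  where
    step (Left x)       = [(acc, x)]
    step (Right (k, s)) = flatten (acc <> k) s

-- Difference weights: a = 1, b ∸ a = 2, c ∸ a = 5, d ∸ b = 4, e ∸ b = 5.
example :: [(Sum Int, Char)]
example = flatten (Sum 0)
  (Search [ Right (Sum 1, Search
      [ Right (Sum 2, Search [ Right (Sum 4, Search [Left 'x'])
                             , Right (Sum 5, Search [Left 'y']) ])
      , Right (Sum 5, Search [Left 'z']) ]) ])
```

Collapsing recovers the absolute weights: example is [(7, 'x'), (8, 'y'), (6, 'z')] (as Sums).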
mergeS :: Monus k => [(k, Search k a)] -> Maybe (k, Search k a)
mergeS [] = Nothing
mergeS (x:xs) = Just (mergeS' x xs)
where
mergeS' x1 [] = x1
mergeS' x1 [x2] = x1 <+> x2
mergeS' x1 (x2:x3:xs) = (x1 <+> x2) <+> mergeS' x3 xs
(xw, Search xs) <+> (yw, Search ys)
| xw <= yw = (xw, Search (Right (yw ∸ xw, Search ys) : xs))
| otherwise = (yw, Search (Right (xw ∸ yw, Search xs) : ys))
popMins :: Monus k => Search k a -> ([a], Maybe (k, Search k a))
popMins = fmap mergeS . partitionEithers . runSearch

Conclusion
The technique of “don’t store the absolute value, store the
difference” seems to be generally quite useful; I think that monuses are
a handy algebra to keep in mind whenever that technique looks like it
might be needed. The Key monus
above is closely related to the factorial numbers, and the trick I used
in this post.