Tries for Polynomials
One of my favourite Haskell papers is McIlroy’s wonderful “Power Series, Power Serious” (1999). The paper is about power series, which are infinite sums that behave like (infinite) polynomials. For example, cos x can be represented by the following power series: cos x = 1 - x²/2! + x⁴/4! - ⋯
A power series is characterised fully by its coefficients, meaning that we can represent one as an infinite stream of rational numbers. In Haskell, we often use lazy lists to represent streams, so we can encode a power series with the following type:
type PowerSeries = [Rational]
In this encoding, we can write cos x as the following:
cos :: PowerSeries
cos = zipWith (*) (cycle [1,0,-1,0]) (scanl (/) 1 [1..])
>>> cos
[1,0,-1/2,0,1/24,0,-1/720,...
We can also build sin x:
sin = zipWith (*) (cycle [0,1,0,-1]) (scanl (/) 1 [1..])
While it can be difficult and unintuitive to work with infinite series like the ones above, happily we can define all of the normal numeric operations on power series as (lazy) list-manipulation programs:
instance Num PowerSeries where
(x:xs) + (y:ys) = (x+y) : (xs + ys)
(x:xs) * ys = map (x*) ys + (0 : xs * ys)
negate = map negate
fromInteger n = fromInteger n : repeat 0
(If you try to put this code into a Haskell interpreter you’ll get all sorts of warnings; I’ll put the full code for this post below, with all of the imports and pragmas you need to get it to work.)
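To see the instance in action, here is a self-contained sketch (the abs and signum placeholders are mine, and are not meaningful operations on power series) that checks the identity sin² + cos² = 1 on the first few coefficients:

```haskell
{-# LANGUAGE TypeSynonymInstances, FlexibleInstances #-}
import Prelude hiding (sin, cos)

type PowerSeries = [Rational]

instance Num PowerSeries where
  (x:xs) + (y:ys) = (x + y) : (xs + ys)
  (x:xs) * ys     = map (x *) ys + (0 : xs * ys)
  negate          = map negate
  abs             = map abs     -- placeholder, not meaningful for series
  signum          = map signum  -- likewise a placeholder
  fromInteger n   = fromInteger n : repeat 0

cos, sin :: PowerSeries
cos = zipWith (*) (cycle [1,0,-1,0]) (scanl (/) 1 [1..])
sin = zipWith (*) (cycle [0,1,0,-1]) (scanl (/) 1 [1..])

main :: IO ()
main = print (take 5 (cos * cos + sin * sin))
```

The incomplete patterns in (+) and (*) are fine here, because every series involved is infinite; running this prints the series for the constant 1, as the identity demands.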
McIlroy (1999) goes through the various algorithms and numeric operations that can be implemented on this representation, but at this point I would like to diverge from the paper and turn our focus to finite polynomials. Like a power series, a finite polynomial can be represented by a list of coefficients:
type Polynomial = [Rational]
And, even though the underlying list is finite rather than infinite, the numeric operations work basically the same way as they do on power series. We just need to add clauses in each function to handle the empty list:
instance Num Polynomial where
[] + ys = ys
xs + [] = xs
(x:xs) + (y:ys) = (x+y) : (xs + ys)
[] * _ = []
(x:xs) * ys = map (x*) ys + (0 : xs * ys)
negate = map negate
fromInteger n = [fromInteger n]
The only semantic trickiness with this representation is that it’s important to quotient by trailing zeroes.
instance Eq Polynomial where
[] == ys = all (0==) ys
xs == [] = all (0==) xs
(x:xs) == (y:ys) = (x == y) && (xs == ys)
Evaluation and Horner’s Rule
The definition of a power series above suggests that we should implement evaluation using exponentiation and indices:
eval :: Polynomial -> Rational -> Rational
eval p x = sum (zipWith (\a i -> a * x^i) p [0..])
And this does in fact give us the correct answer. Consider the polynomial 4 + 2x + 5x² - x³:
poly = [4,2,5,-1] -- 4 + 2x + 5x² - x³
eval poly x = eval [4,2,5,-1] x
= sum (zipWith (\a i -> a * x ^ i) [4,2,5,-1] [0..])
= 4*x^0 + 2*x^1 + 5*x^2 + (-1)*x^3
= 4 + 2*x + 5*x^2 - x^3
However, this evaluation algorithm is unsatisfactory in one respect: it performs a lot of multiplication. In numeric programs, we generally want to minimise the number of multiplications performed, since multiplication is a relatively expensive operation (when compared to addition or subtraction). In the example above, it takes six multiplications to compute the result: one for 2x, two for 5x², and three for x³. In general, for a polynomial of degree n, the above implementation of eval will perform n(n+1)/2 multiplications.
There is, however, a trick that can bring the number of multiplications down to n: Horner’s rule. The basic idea is to rewrite the expanded polynomial into a factorised form: 4 + x(2 + x(5 - x)). If we evaluate this expression directly, we will only have to perform three multiplications (and we don’t even have to perform any extra additions as compensation). While Horner’s rule is really quite a simple trick, the generalised pattern is surprisingly powerful (Gibbons 2011). Indeed, the representation I develop in this post is basically a data-structure encoding of Horner’s rule.
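To make the saving concrete, here is a small sketch (the helper names naive and horner are mine) that evaluates 4 + 2x + 5x² - x³ at x = 3 both ways:

```haskell
-- Evaluate 4 + 2x + 5x^2 - x^3 at a point, naively and with Horner's rule.
naive, horner :: [Rational] -> Rational -> Rational
naive  p x = sum (zipWith (\a i -> a * x ^ i) p [0 :: Integer ..])
horner p x = foldr (\a acc -> a + x * acc) 0 p

main :: IO ()
main = print (naive [4,2,5,-1] 3, horner [4,2,5,-1] 3)
```

Both produce 28, but horner performs only three multiplications to naive’s six.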
Before getting there, however, let’s return to our list-based
polynomial, and look at using Horner’s rule to implement eval. Interestingly, the list-based
representation has kind of already performed our factorisation for us.
As a result, Horner’s rule evaluation is actually more natural to
implement than the expanded version above.
eval :: Polynomial -> Rational -> Rational
eval xs x = foldr (\a p -> a + x * p) 0 xs
Multiple Variables
A cool trick with this representation is that if you want to support multiple variables you can smuggle them in through the coefficients. A polynomial in two variables is the same as a polynomial with coefficients drawn from another polynomial.
type TwoVar = [Polynomial]
To save us having to write a separate Num instance
for TwoVar, we can
instead generalise the Num instance
on Polynomial
above:
instance Num a => Num [a] where
The rest of the instance is the same. Now, we can write 5 ^ 2 :: Polynomial
or 6 :: TwoVar
and it will just work.
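As a quick sanity check of the generalised instance (a self-contained sketch; the abs and signum placeholders are mine), squaring x + 1 yields the coefficients of 1 + 2x + x²:

```haskell
-- The generalised Num instance on lists, plus placeholder abs/signum
-- so the sketch compiles cleanly.
instance Num a => Num [a] where
  []     + ys     = ys
  xs     + []     = xs
  (x:xs) + (y:ys) = (x + y) : (xs + ys)
  []     * _      = []
  (x:xs) * ys     = map (x *) ys + (0 : xs * ys)
  negate          = map negate
  abs             = map abs     -- placeholder
  signum          = map signum  -- placeholder
  fromInteger n   = [fromInteger n]

x :: [Rational]
x = [0,1]  -- the polynomial "x"

main :: IO ()
main = print ((x + 1) ^ 2)  -- coefficients of 1 + 2x + x^2
```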
We also have to generalise the type of eval slightly:
eval :: Num a => [a] -> a -> a
but again, the implementation remains the same.
With this machinery, we can now write and evaluate polynomials in 2 variables:
eval2 :: TwoVar -> Rational -> Rational -> Rational
eval2 p x y = eval (eval p [x]) y
var :: Num a => [a]
var = [0,1]
x = var
y = [var]
poly = 2 * x ^ 2 - y ^ 3 + 4
>>> poly
[[4,0,0,-1],[0],[2]]
>>> eval2 poly 2 3
-15
We can even use some typeclass shenanigans to build a generalised evaluator that works with any fixed number of variables.
Implementation of an Evaluator for Polynomials in Arbitrary Variables
instance Num n => Num (e -> n) where
fromInteger = const . fromInteger
(f + g) x = f x + g x
(f * g) x = f x * g x
abs = (abs .)
signum = (signum .)
negate = (negate .)
class Num r => Poly p r | p -> r, r -> p where
evalN :: p -> r
instance Poly Integer Integer where
evalN = id
instance Poly p r => Poly [p] (Integer -> r) where
evalN xs x = foldr (\a s -> evalN a + fromInteger x * s) 0 xs
>>> evalN poly 2 3
-15
z = [[var]]
>>> evalN (poly + z) 2 3 1
-14
Sums of Products
While the above representation is elegant, it is inefficient, and perhaps a little unintuitive. In most implementations I have seen, variables are represented simply with a type for names, rather than the kind of implicit de Bruijn indices used above. One natural representation uses a list of terms:
newtype Poly v c = Poly { terms :: [([v], c)] }
Here, a value of type Poly v c is a
polynomial with coefficients drawn from c and variables from v. It is a list of monomials, where
the outer list represents a sum, and each monomial represents a product
of variables with a single coefficient.
data Var = X | Y | Z
poly :: Poly Var Integer
poly = Poly [([X,Y,Y],5),([Z],3),([Y,Z],2)]
-- 5xy² + 3z + 2yz
Note that this representation requires some normalisation:
norm :: (Ord v, Num c, Eq c) => [([v],c)] -> [([v],c)]
norm = Map.toList . Map.fromListWith (+) . filter ((/=0).snd)
instance (Num c, Ord v, Eq c) => Eq (Poly v c) where
(==) = (==) `on` norm . terms
And we have the following Num
instance:
instance Num c => Num (Poly v c) where
fromInteger n = Poly [([],fromInteger n)]
Poly xs + Poly ys = Poly (xs ++ ys)
xs * ys = Poly [ (xv ++ yv, xc * yc) | (xv,xc) <- terms xs, (yv,yc) <- terms ys ]
negate = Poly . map (fmap negate) . terms
This representation perhaps maps more closely to the description of
multivariate polynomials that many of us will have encountered in
secondary school: it’s straightforward to see how a polynomial like 2xy + y² - 3 corresponds to the value Poly [([X,Y],2),([Y,Y],1),([],-3)].
The previous representation (TwoVar) would
represent the same expression as the enigmatic [[-3,0,1],[0,2]].
However, there are some wrinkles to this type that are worth noting. First we can see that multiplication is not commutative (even after normalisation).
x = Poly [([X],1)]
y = Poly [([Y],1)]
x * y == Poly [([X,Y],1)]
y * x == Poly [([Y,X],1)]
x * y /= y * x
This is in contrast to TwoVar, where both xy and yx would be represented as [[0,0],[0,1]].
Conceptually, polynomials are a kind of free structure: they
represent the normalised and quotiented syntax of an algebraic theory.
The fact that Poly above
doesn’t have commutative multiplication just tells us that the
underlying algebraic theory in question here is noncommutative
rings, rather than commutative rings.
The second thing to note about this type is actually two related
observations about inefficiency. Because I didn’t implement
normalisation on any of the numeric operations, we might expect the size
of the underlying list of Poly to blow
up:
poly = (1 + x) * (3 + 4) * (y + 2)
>>> terms poly
[([Y],3),([],6),([Y],4),([],8),([X,Y],3),([X],6),([X,Y],4),([X],8)]
>>> norm (terms poly)
[([],14),([X],14),([X,Y],7),([Y],7)]
And indeed it does, as you can see above. To counteract this, we can represent our polynomial as a mapping from monics (strings of variables) to coefficients:
newtype Poly v c = Poly { terms :: Map [v] c }
Num instance for Map-based polynomial
instance (Ord v, Num c) => Num (Poly v c) where
fromInteger n = Poly (Map.singleton [] (fromInteger n))
Poly xs + Poly ys = Poly (Map.unionWith (+) xs ys)
xs * ys = Poly (Map.fromListWith (+) [ (xv ++ yv, xc * yc)
| (xv,xc) <- Map.toList (terms xs)
, (yv,yc) <- Map.toList (terms ys) ])
negate = Poly . fmap negate . terms
Indeed, this mapping is probably more efficient in practice. However,
it is still pretty inefficient to store a Map [a] v:
Haskell’s Map is a
binary search tree (but this applies to basically all mapping
structures), so search is always going to have to perform comparisons on
the keys. When those keys are lists, that comparison takes time
proportional to the length of each list. This is wasted effort that
could be cached with a cleverer data structure.
This also brings the second observation about inefficiency into focus: we have lost our neat evaluation with Horner’s rule.
eval :: Num c => Poly v c -> (v -> c) -> c
eval (Poly mp) v = Map.foldrWithKey (\vs c s -> foldr ((*) . v) c vs + s) 0 mp
We’re back to performing a multiplication for every variable occurrence in each term.
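Putting the Map-based pieces together, here is a runnable sketch of the representation, its Num instance, and the eval above (the abs/signum placeholders and the concrete variable assignment are mine):

```haskell
import qualified Data.Map as Map
import Data.Map (Map)

newtype Poly v c = Poly { terms :: Map [v] c }

instance (Ord v, Num c) => Num (Poly v c) where
  fromInteger n     = Poly (Map.singleton [] (fromInteger n))
  Poly xs + Poly ys = Poly (Map.unionWith (+) xs ys)
  xs * ys           = Poly (Map.fromListWith (+)
    [ (xv ++ yv, xc * yc)
    | (xv,xc) <- Map.toList (terms xs)
    , (yv,yc) <- Map.toList (terms ys) ])
  negate = Poly . fmap negate . terms
  abs    = Poly . fmap abs    . terms  -- placeholder
  signum = Poly . fmap signum . terms  -- placeholder

data Var = X | Y deriving (Eq, Ord, Show)

x, y :: Poly Var Integer
x = Poly (Map.singleton [X] 1)
y = Poly (Map.singleton [Y] 1)

eval :: Num c => Poly v c -> (v -> c) -> c
eval (Poly mp) v = Map.foldrWithKey (\vs c s -> foldr ((*) . v) c vs + s) 0 mp

main :: IO ()
main = do
  let p = (1 + x) * (3 + 4) * (y + 2)
  print (Map.toList (terms p))  -- duplicate monomials collapsed by fromListWith
  print (eval p (\v -> case v of X -> 2; Y -> 3))
```

The first print shows the duplicates already collapsed; the second evaluates the polynomial at x = 2, y = 3, giving 105.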
Both of these inefficiencies are actually the same pattern, and can be solved with a general form of Horner’s rule. We need to cache prefixes: the data structure that does that best is a trie.
A Trie
Horner’s rule saved us from performing redundant multiplications by factoring out common terms to the left. That was simple to implement in the single-variable case, but it can still apply for multiple variables. Take an expression like (2 + 3x - 5y)², and multiply it out to 4 + 12x + 9x² - 15xy - 20y - 15yx + 25y². We can still factor this expression to remove common prefixes, like so: 4 + x(12 + 9x - 15y) + y(-20 - 15x + 25y)
The difference between this factorisation and the list-based polynomial we started with is that the list, viewed as a tree, only had one child per node. Here, we have a child for each leading term. In terms of the data structure: where a list has a single tail in the cons case,
data List a = Nil
| Cons a (List a)
The multivariate version of the same thing will be a tree
data Tree a = Nil
| Cons a [Tree a]
Or, more specifically, a trie, where the subtree mapping is based on variables.
data Poly v c = c :<+ Map v (Poly v c)
A polynomial is a constant coefficient c plus the sum of variables drawn from
v each multiplied by another
polynomial. The polynomial above is represented with this type as the
following:
4 :<+ {(X,12 :<+ {(X,9 :<+ {}),(Y,(-15) :<+ {})}),(Y,(-20) :<+ {(X,(-15) :<+ {}),(Y,25 :<+ {})})}
This trie type (with some improvements I’ll describe below) is the focus of this post; I think it’s a cool data structure for representing polynomials.
The numeric functions on Tries
Let’s first write evaluation:
eval :: Num c => (v -> c) -> Poly v c -> c
eval f (c :<+ vs) = c + Map.foldrWithKey (\v p s -> f v * eval f p + s) 0 vs
Notice that we have retrieved Horner’s rule: the evaluation of each term only performs a single multiplication; we don’t have to repeat multiplications for terms that share prefixes any more.
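Here is the trie eval in a self-contained sketch, using a simple non-normalising version of the Num instance (the post gives the full instance just below; the abs/signum placeholders are mine):

```haskell
{-# LANGUAGE DeriveFunctor #-}
import qualified Data.Map as Map
import Data.Map (Map)

data Poly v c = c :<+ Map v (Poly v c) deriving Functor

instance (Ord v, Num c) => Num (Poly v c) where
  fromInteger n = fromInteger n :<+ Map.empty
  (n :<+ ns) + (m :<+ ms) = (n + m) :<+ Map.unionWith (+) ns ms
  (n :<+ ns) * ms = fmap (n *) ms + (0 :<+ fmap (* ms) ns)
  negate = fmap negate
  abs    = fmap abs     -- placeholder
  signum = fmap signum  -- placeholder

data Var = X | Y deriving (Eq, Ord, Show)

var :: Num c => v -> Poly v c
var v = 0 :<+ Map.singleton v (1 :<+ Map.empty)

-- Horner-style evaluation: one multiplication per edge of the trie.
eval :: Num c => (v -> c) -> Poly v c -> c
eval f (c :<+ vs) = c + Map.foldrWithKey (\v p s -> f v * eval f p + s) 0 vs

main :: IO ()
main = print (eval assign ((2 + 3 * var X - 5 * var Y) ^ 2))
  where assign X = 2 :: Integer
        assign Y = 3
```

Evaluating (2 + 3x - 5y)² at x = 2, y = 3 gives (2 + 6 - 15)² = 49.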
(For those concerned with performance, it might be worth swapping out foldrWithKey for a strict variant. Also, this is somewhat unrelated but a bit of a pet peeve of mine: this is not a place where foldl' is the best option! foldl' is not a panacea!)
The numeric operations on this data structure can be implemented as follows:
deriving instance Functor (Poly v)
instance (Ord v, Num c, Eq c) => Num (Poly v c) where
fromInteger n = fromInteger n :<+ Map.empty
(n :<+ ns) + (m :<+ ms) = (n + m) :<+ Map.unionWith (+) ns ms
(n :<+ ns) * ms = fmap (n*) ms + (0 :<+ fmap (*ms) ns)
negate = fmap negate
It’s worth taking a moment to note how efficient these operations are
(for a pointer-ridden high-level language like Haskell, that is). We
don’t have to compare any strings; we can use Data.Map’s
efficient unionWith on single
variables; and multiplication doesn’t have to expand out any Cartesian
product.
I will note that we do have to perform a little bit of normalisation for the derived Eq instance to be correct: we have to remove terms that multiply to zero. Pruning dead branches like this is a pretty standard procedure on tries; in polynomial terms, it just means we have to get rid of entries in the map that evaluate to zero (so 4 :<+ {(X,0 :<+ {})} should be pruned to 4 :<+ {}). This can be done without really changing the efficiency of the operations above, but it does make them slightly more verbose.
Normalising Num instance
For this version, we will rely on the extremely efficient custom merge operations in containers.
0 <+? ns | Map.null ns = Nothing
n <+? ns = Just (n :<+ ns)
instance (Ord v, Num c, Eq c) => Num (Poly v c) where
fromInteger n = fromInteger n :<+ Map.empty
a + b = fromMaybe 0 (add a b)
where
add (n :<+ ns) (m :<+ ms) =
(n + m) <+?
Map.merge
Map.preserveMissing
Map.preserveMissing
(Map.zipWithMaybeMatched (const add))
ns ms
_ * (0 :<+ ms) | Map.null ms = 0 :<+ Map.empty
(0 :<+ ns) * ms = 0 :<+ fmap (*ms) ns
(n :<+ ns) * ms = fmap (n*) ms + (0 :<+ fmap (*ms) ns)
negate = fmap negate
abs = fmap abs
signum (n :<+ _) = signum n :<+ Map.empty
Anyways, when we have all of the above instances, we can manipulate polynomials using the API you might expect, and the normalisation behaviour happens automatically.
data Var = X | Y deriving (Eq, Ord, Show)
var :: Num c => v -> Poly v c
var v = 0 :<+ Map.singleton v (1 :<+ Map.empty)
x,y :: Poly Var Integer
x = var X
y = var Y
poly = (2 + 3 * x - 5 * y) ^ 2
>>> poly
4 + Y*(-20 + Y*25 + X*(-15)) + X*(12 + Y*(-15) + X*9)
Lenses and Division
Lenses in Haskell are very cool, and personally I think one of the best demonstrations of their power is tries. A few years ago, when I was still on Twitter, I posted an implementation of a trie that fit in a tweet (gist link).
Tweet Trie
{-# LANGUAGE RankNTypes #-}
import Control.Comonad.Cofree
import Control.Lens hiding ((:<))
import qualified Data.Map as Map
import Data.Map (Map)
import Prelude hiding (lookup)
import Data.Maybe (isJust)
import Test.QuickCheck
type Trie a b = Cofree (Map a) (Maybe b)
string :: Ord a => [a] -> Lens' (Trie a b) (Maybe b)
string =
foldr
(\x r -> _unwrap . at x . anon (Nothing :< mempty)
(\(v :< m) -> null v && null m) . r)
_extract
insert :: Ord a => [a] -> b -> Trie a b -> Trie a b
insert xs x = string xs .~ Just x
lookup :: Ord a => [a] -> Trie a b -> Maybe b
lookup = view . string
delete :: Ord a => [a] -> Trie a b -> Trie a b
delete xs = string xs .~ Nothing
Lenses are what allowed this very terse implementation. The original purpose of lenses was to facilitate deep access in nested records and data structures: a trie is effectively a nested map, so it’s no great surprise that lenses are a good fit.
It turns out that lenses are also useful for manipulating polynomial tries. At first, it might be difficult to see why: in the trie implementation above, a lens was used to build getters and setters for a mapping from strings to payloads. But what does that translate to in the context of a polynomial? What does it mean to “look up” a string of variables in some expression like 2x² + y?
It turns out that lookup corresponds to division. For example, dividing the polynomial 2x² + y by the monic x² gives us a quotient of 2 and a remainder of y.
>>> divMod (2 * x ^ 2 + y) [X,X]
(2, y)
This is already quite similar to a lens: before the van Laarhoven encoding, lenses were usually thought of as functions that took a data structure and returned a pair of the “focus” of the lens and the “rest” of the structure. In polynomial terms, that “focus” is the quotient, and the “rest” is the remainder.
But that’s a little vague. Let’s construct the actual lenses here, in the van Laarhoven style:
constant :: Lens' (Poly v c) c
constant f (c :<+ vs) = fmap (:<+ vs) (f c)
vars :: Lens (Poly v c) (Poly v' c) (Map v (Poly v c)) (Map v' (Poly v' c))
vars f (c :<+ vs) = fmap (c :<+) (f vs)
isZero :: (Num c, Eq c) => Poly v c -> Bool
isZero (n :<+ ns) = (0 == n) && Map.null ns
factored :: (Ord v, Num c, Eq c) => [v] -> Lens' (Poly v c) (Poly v c)
factored = foldr (\v vs -> vars . at v . anon 0 isZero . vs) id
This last lens does indeed give us an interface that looks like division:
>>> view (factored [X,X]) (2*x^2 + y)
2
>>> set (factored [X,X]) 0 (2*x^2 + y)
Y
If we want to define an actual division function, we can define it in
terms of factored, in a fun
example of the kind of golfy code that lens enables.
divMod :: (Ord v, Num c, Eq c) => Poly v c -> [v] -> (Poly v c,Poly v c)
divMod p vs = factored vs (,0) p
>>> (2*x^2 + y) `divMod` [X,X]
(2,Y)
Gröbner Bases
While the interface above lets us do some basic computer algebra, to do any serious work with polynomials we will have to at some point compute Gröbner bases. A Gröbner basis is… somewhat hard to define, actually. I’ll quote an explainer on the topic by Sturmfels (2005):
A Gröbner basis is a set of multivariate polynomials that has desirable algorithmic properties
Basically, in several algorithms over polynomials (division, Gaussian elimination, etc.) it becomes necessary at some point to compute this thing called a Gröbner Basis.
There is a lot of published literature on computing Gröbner bases in different settings. However, the trie polynomial I have built above is fundamentally noncommutative, and the literature on computing Gröbner bases for noncommutative rings is comparatively smaller. I have been following Xiu’s thesis (2012) for this project. It outlines a noncommutative version of Buchberger’s algorithm, and a few optimisations that I was able to implement.
One slightly annoying aspect of these algorithms is that they tend to use monomials as a primitive. In other words, instead of working with the polynomial directly, the algorithms tend to describe operations with the assumption that your representation is basically a list of monomials. In particular, the algorithms will frequently extract the “leading” monomial, and it becomes important for performance that the polynomial representation can provide that leading monomial quickly. Unfortunately, extraction of the leading monomial is slightly awkward on the trie representation (certainly less natural than the implementation on a list-based representation), so we will need to do some work to implement it.
Monomial Orderings
The first important concept to implement for Gröbner bases is an admissible monomial ordering. This is a total order on strings of variables that is “admissible”, meaning that it respects concatenation on both sides and is also a well-ordering: any strictly descending chain is finite.
These constraints rule out the usual lexicographic ordering on strings. Instead, we’ll go with graded lexicographic. This means we first compare strings for length, and only in the case where they’re equal do we move to the normal lexicographic comparison.
grlex :: Ord a => [a] -> [a] -> Ordering
grlex xs ys
| length xs < length ys = LT
| length xs > length ys = GT
| otherwise = compare xs ys
We can improve the efficiency of the above function somewhat by using
one of my favourite monoids: the monoid instance on Ordering.
grlex :: Ord a => [a] -> [a] -> Ordering
grlex = go EQ
where
go !a [] [] = a
go !a [] (_:_) = LT
go !a (_:_) [] = GT
go !a (x:xs) (y:ys) = go (a <> compare x y) xs ys
This version performs just one pass through each list, and does the correct comparison without additionally calculating the length. It’s also nonstrict: if one of the lists passed is infinite, this comparison will still terminate.
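A few quick checks of its behaviour (the examples are mine):

```haskell
{-# LANGUAGE BangPatterns #-}
-- One-pass graded lexicographic comparison, as above.
grlex :: Ord a => [a] -> [a] -> Ordering
grlex = go EQ
  where
    go !a []     []     = a
    go !a []     (_:_)  = LT
    go !a (_:_)  []     = GT
    go !a (x:xs) (y:ys) = go (a <> compare x y) xs ys

main :: IO ()
main = print [ grlex "xy" "z"             -- GT: degree dominates
             , grlex "xz" "yy"            -- LT: same degree, so lexicographic
             , grlex "ab" (repeat 'a') ]  -- LT: terminates on infinite input
```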
Another admissible order we could use is reverse grlex,
which basically amounts to reversing the lists before the comparison.
The trie structure means that we’re basically forced to use grlex, but I will include an
implementation of grevlex here
because I think it’s cute.
Implementations of grevlex
grevlex :: Ord a => [a] -> [a] -> Ordering
grevlex [] [] = EQ
grevlex (_:_) [] = GT
grevlex [] (_:_) = LT
grevlex (x:xs) (y:ys) = grevlex xs ys <> compare x y
-- This version is tail-recursive, but it also might unnecessarily compare
-- elements. However, that should be cheaper than building up the list of
-- comparisons.
grevlex :: Ord a => [a] -> [a] -> Ordering
grevlex = go EQ
where
go !a [] [] = a
go !a (_:_) [] = GT
go !a [] (_:_) = LT
go !a (x:xs) (y:ys) = go (compare x y <> a) xs ys
Enumerating Monomials
The problem with all the admissible monomial orderings is that they need to see the entire monomial before they can decide whether it’s ordered before or after another. This is at odds with the trie, which tends to prefer computations that can be described in terms of prefix/suffix decompositions.
To demonstrate the problem, let’s take a look at an algorithm that enumerates the monomials of a polynomial in lexicographic order:
monos :: (Num c, Eq c) => Poly v c -> [([v],c)]
monos p = search [] p []
where
cons vs 0 ms = ms
cons vs c ms = (reverse vs,c) : ms
search sv (n :<+ ns) ms = cons sv n (Map.foldrWithKey (search . (:sv)) ms ns)
>>> monos ((2 + 3*x - 5*y) ^ 2)
[([],4),([X],12),([X,X],9),([X,Y],-15),([Y],-20),([Y,X],-15),([Y,Y],25)]
Notice that the function search emits the monomial (reverse sv, n)
straight away (if n /= 0),
when it encounters it: for a proper admissible monomial ordering, it
would instead want to first emit monomials of higher degree; that is,
those monomials in the map ns.
However, we can’t just flip the order of consing in search: notice that even if we
reversed the output, we still wouldn’t get an admissible monomial
ordering (the singleton list [Y] should be
grouped with the other singleton lists). The problem is that monos is performing a
depth-first search. What we need is breadth-first.
I happen to be a little obsessed with breadth-first search, so I probably spent too much time on this particular implementation, but I do always get excited when I see a breadth-first traversal pop up in the wild.
For this case, I started with the levels function.
levels :: (Num c, Eq c) => Poly v c -> [[([v],c)]]
levels p = search [] p []
where
cons _ 0 ms = ms
cons vs c ms = (reverse vs,c) : ms
search sv (n :<+ ns) [] = cons sv n [] : Map.foldrWithKey (search . (:sv)) [] ns
search sv (n :<+ ns) (q:qs) = cons sv n q : Map.foldrWithKey (search . (:sv)) qs ns
>>> levels ((2 + 3*x - 5*y) ^ 2)
[[([],4)],[([X],12),([Y],-20)],[([X,X],9),([X,Y],-15),([Y,X],-15),([Y,Y],25)]]
I have written about levels before (see also Gibbons 2015; Jones and Gibbons 1993).
I think it’s a good fit here because it lets us build the prefix
string for each monomial in a natural way (that prefix string is the
sv that’s passed to search).
However, one flaw of this function is that it produces a list of lists: one inner list for each degree of polynomial. The output that I actually want, however, is the concatenation of the whole thing.
In reality, this isn’t actually a flaw: we can just call concat and
move on. I had a feeling, though, that there was probably some annoying
circular program that would let us avoid the second traversal to
concatenate the inner lists. Inspired by Geraint Jones’ cyclic
breadth-first traversal (1993), I finally arrived at the
following solution:
data Knots a
= Knot
{ tied :: !Bool
, yank :: [a]
, ends :: Knots a }
tighten :: Knots a -> Knots a
tighten ~(Knot t y e) = Knot False (if t then y else []) (tighten e)
monos :: (Eq c, Num c) => Poly v c -> [([v],c)]
monos p = y
where
Knot _ y e = tie [] p (tighten e)
cons sv 0 ms = ms
cons sv c ms = (reverse sv, c) : ms
tie sv (n :<+ m) (Knot _ ms ps) = Knot True (cons sv n ms) (Map.foldrWithKey (tie . (:sv)) ps m)
>>> monos ((2 + 3 * x - 5 * y) ^ 2)
[([],4),([X],12),([Y],-20),([X,X],9),([X,Y],-15),([Y,X],-15),([Y,Y],25)]
While this does order the output according to grlex, it’s ordered from smallest to largest, which is the reverse of what we want. And yes, while we could just reverse the output, I didn’t write the circular abomination above to throw away the single-pass traversal at such a small hurdle. Any (list-based) algorithm written in a fold-like fashion can usually be reversed by swapping out right-folds for left.
pull :: Knots a -> [a]
pull (Knot True _ e) = pull e
pull (Knot False y _) = y
monosDesc :: (Eq c, Num c) => Poly v c -> [([v],c)]
monosDesc p = pull r
where
r = tie [] p (Knot False [] (tighten r))
cons sv 0 ms = ms
cons sv c ms = (reverse sv, c) : ms
tie sv (n :<+ m) (Knot _ ms ps) = Knot True (cons sv n ms) (Map.foldlWithKey (\a v p -> tie (v:sv) p a) ps m)
Efficiently Popping the Leading Monomial
Unfortunately, as fun as monosDesc is, it doesn’t really do
what we need it to for most of the Gröbner basis algorithms. While it is
pretty efficient if we want all of the monomials of a
polynomial, usually we just want the first one. And sadly,
while monosDesc is linear
overall, it’s not lazy in the right way, meaning that we have to pay
that full linear cost even if we only inspect the first element of the
list it produces.
The solution here will require us to use a new data structure in
place of the Map that we
have currently. To avoid traversing the whole tree to find the largest
monomial, we need to cache the depth of each subterm so that we can just
descend into the subterm which contains the monomial of the highest
degree. But we don’t want to just swap out our Map v (Poly v c)
for a Map v (Word, Poly v c):
that solution would require us to walk over every entry in the map to
find the largest Word. While it
would be an improvement in practical terms, it would still incur an
cost to find the leading monomial.
Instead, we need the map itself to be able to efficiently provide the entry with the largest degree. We need our map to simultaneously act as a priority queue.
Luckily, the combination of these two structures has been researched
before: Hinze (2001)
wrote about “priority search trees”, a data structure that allows for
lookup and insertion based on some ordered key, and separately allows
for a
popMin operation, based on some
separate priority. The psqueues package provides a few
implementations of this technique. The API isn’t quite as extensive as,
say, containers, so some functions will
be slightly less efficient (we don’t get a nice general merge function, for example), but we
can basically drop in the OrdPSQ as a
replacement for Map.
type SubTerms v c = OrdPSQ (Down v) (Down Word) (Poly v c)
data Poly v c = c :<+ SubTerms v c
I’m using the Down
wrapper here because I want a max heap, rather than a
min-heap. I’m using that wrapper on both the keys and priorities because
OrdPSQ
breaks priority ties according to the keys, and I also want greater keys
returned first, to follow the grlex ordering.
The priority here is the depth of the tree. It tells us the length of the longest monomial contained:
depth :: Poly v c -> Word
depth (_ :<+ ns) = maybe 0 (\(_,Down p,_) -> succ p) (Map.findMin ns)
This operation is O(1), since finding the minimum entry in an OrdPSQ is O(1).
I’ll also use the following isomorphism, for the lensy things:
entry :: (Num c, Eq c) => Iso' (Maybe (Down Word, Poly v c)) (Poly v c)
entry = iso (maybe (0 :<+ Map.empty) snd) (\p -> if isZero p then Nothing else Just (Down (depth p), p))
This lets us chain together lenses that index into an OrdPSQ.
factored :: (Ord v, Num c, Eq c) => [v] -> Lens' (Poly v c) (Poly v c)
factored = foldr (\v ls -> vars . at (Down v) . entry . ls) idFinally, we can implement a function that pops the leading monomial from a polynomial, efficiently:
leading :: (Num c, Eq c, Ord v) => Poly v c -> Maybe (([v],c),Poly v c)
leading p | isZero p = Nothing
leading (n :<+ ns) = Just (retrie (Map.alterMin step ns))
where
retrie ((r,n'),ns') = (r, n' :<+ ns')
step Nothing = ((([],n),0),Nothing)
step (Just (Down v, _, p)) = (((v:vs,c),n), subTrie)
where
Just ((vs,c),p') = leading p
subTrie | isZero p' = Nothing
| otherwise = Just (Down v, Down (depth p'), p')
And it matches the earlier enumeration that we built:
prop_leadingMonos :: Poly Var Word -> Property
prop_leadingMonos p = monosDesc p === unfoldr leading p
Uses
I think this is an interesting data structure, and representation of polynomials. However, I am not very familiar with the computer algebra literature, so I can’t yet tell how this kind of representation relates to the other systems out there. Furthermore, most of the algorithms I have read seem to work implicitly with “leading monomials” etc., leading to the following kind of implementation of division:
divModPrefM :: (Fractional c, Eq c, Ord v) => Poly v c -> ([v],c) -> (Poly v c, Poly v c)
divModPrefM p (vs, i) = factored vs ((, 0) . fmap (/i)) p
divModPref :: (Fractional c, Eq c, Ord v) => Poly v c -> Poly v c -> (Poly v c, Poly v c)
divModPref num divisor = case leading divisor of
Nothing -> error "Divide by zero"
Just (lt, rest) -> go 0 num
where
go !quot !rem = case divModPrefM rem lt of
(0, _) -> (quot, rem)
(q, rem') -> go (quot + q) (rem' - rest * q)
I feel that this doesn’t make use of the benefits of the trie-based representation. I have implemented Buchberger’s algorithm (with most of the improvements from Xiu 2012), but I have yet to really research in depth what competitively fast systems do these days (Heisinger and Hofstadler 2025; Cohen and Knopper 2026; Levandovskyy, Schönemann, and Zeid 2020). I’m also interested in seeing what kinds of applications there are for this stuff: I started this project with Weyl algebras in mind, but after looking into it a little more it seems clear that a trie is not a good fit for Weyl algebras.
I would actually be interested to hear if anyone has any pointers to work that has a similar approach to polynomials, or on the kinds of things that people use these noncommutative polynomials for. I find most of the descriptions of these algorithms difficult to parse (since they’re usually written by and for mathematicians rather than computer scientists, and almost never for functional programmers), so I am sure I’m missing some major projects.
Gists
Listed Polynomial with arbitrary variables