Deques, Queues, and Lists in Swift with Indirect
Recursive enums have finally arrived. Woo! The first thing to do with these is to make a recursive list:
public enum List<Element> {
case Nil
indirect case Cons(head: Element, tail: List<Element>)
}The head stores the element,
and tail is a reference to the
rest of the list. As you can imagine, getting at the head is pretty easy, while accessing
elements further along is more difficult. There’s a common pattern for
dealing with these recursive structures: if you have a function that
performs some transformation on a list, it will take the head, perform that transformation on it,
and then call itself recursively on the tail. If it’s given an empty list, it
returns an empty list. For instance, here’s the map function,
defined in Haskell:
map _ [] = []
map f (x:xs) = f x : map f xsThe two lines are analogous to a switch statement in Swift. The
parameters for map are a
transformation function and a list. So, the first line has _ (wildcard) for the function, and
[] (empty) for the list, meaning
it will match any function and an empty list. It returns an empty
list.
The second line matches a function (which it assigns the name f) and then decomposes the list it’s
given into a head (x) and tail
(xs). It then calls f on the head, and prepends (the : operator is
prepends, also called “cons” by convention) the result to itself called
recursively on the tail.
With switch statements and the indirect keyword, we’re getting pretty
close to that level of brevity (terseness?) in Swift:
extension List {
public func map<T>(@noescape transform: Element -> T) -> List<T> {
switch self {
case .Nil: return .Nil
case let .Cons(head, tail): return
.Cons(head: transform(head), tail: tail.map(transform))
}
}
}We can define our own “cons”, to clean it up a little. We’re not
allowed to use :, so I went
with |>, which is,
in my mind, reasonably representative of “cons”.
infix operator |> {
associativity right
precedence 100
}
public func |> <T>(lhs: T, rhs: List<T>) -> List<T> {
return .Cons(head: lhs, tail: rhs)
}
extension List {
public func map<T>(@noescape transform: Element -> T) -> List<T> {
switch self {
case .Nil: return .Nil
case let .Cons(head, tail):
return transform(head) |> tail.map(transform)
}
}
}Pretty soon you can start doing some elegant and exciting things with
lists. The recursive pattern is very well suited to
higher-order functions and other FP staples. Take, for instance, the
reduce function:
extension List {
public func reduce<T>(initial: T, @noescape combine: (T, Element) -> T) -> T {
switch self {
case .Nil: return initial
case let .Cons(h, t):
return t.reduce(combine(initial, h), combine: combine)
}
}
}Or a transposing function:
func transpose<T>(mat: List<List<T>>) -> List<List<T>> {
switch mat {
case let .Cons(x, xs) where x.isEmpty: return transpose(xs)
case let .Cons(.Cons(x, xs), xss):
return (x |> xss.flatMap{$0.first}) |>
transpose(xs |> xss.map{$0.tail})
default: return .Nil
}
}
let jo: List<List<Int>> = [[1, 2, 3], [1, 2, 3], [1, 2, 3]]
transpose(jo) // [[1, 1, 1], [2, 2, 2], [3, 3, 3]]You can do foldr, which is like
reduce, but works in reverse:
extension List {
func foldr<T>(initial: T, @noescape combine: (element: Element, accumulator: T) -> T) -> T {
switch self {
case .Nil: return initial
case let .Cons(x, xs):
return combine(
element: x,
accumulator: xs.foldr(initial, combine: combine)
)
}
}
}Using foldr, you can get all of
the non-empty subsequences of a list:
extension List {
var subsequences: List<List<Element>> {
switch self {
case .Nil: return .Nil
case let .Cons(x, xs):
return [x] |> xs.subsequences.foldr([]) {
(ys, r) in ys |> (x |> ys) |> r
}
}
}
}
let jo: List = [1, 2, 3]
jo.subsequences // [[1], [2], [1, 2], [1, 3], [2, 3], [1, 2, 3]](these examples are all translated from the Haskell standard library) Lists are extremely fun, and some functions you would have found yourself writing on 10-15 lines can be got into 2-3. To get a better feel for playing around with lists, it’s useful to have them conform to some protocols that make them easier to work with in a playground.
For instance, making a list currently looks like this:
let jo: List = 1 |> 2 |> 3 |> .NilWhich is fine, and better than:
let jo: List = .Cons(head: 1, tail: .Cons(head: 2, tail: .Cons(head: 3, tail: .Nil)))but still not fantastic. The obvious next step is making List ArrayLiteralConvertible, but there’s a
small catch. We don’t have an append function for lists (yet). So we
can’t, off the bat, do something like this:
extension List : ArrayLiteralConvertible {
public init(arrayLiteral: Element...) {
var ret: List<Element> = .Nil
for el in arrayLiteral { ret.append(el) }
self = ret
}
}And nor do I think we’d want to. Operations on the end of lists are slow: you have to walk along the entire list every time.
We could reverse the sequence we want to turn into a list,
and prepend as we go. But… that’s inefficient too. Sure, Arrays are fast
to reverse, but other sequences aren’t. For those that can’t be reversed
lazily, you’re storing an extra sequence in memory unnecessarily.
But there’s something that we can use: generators. In Swift,
generators are like super-imperative, crazy-unsafe recursive lists. When
you can the next()
method on a generator, you get the “head” back. Crucially, though:
the generator is left with the tail. Making use of this fact
too often will lead to bugs, but if we wrap it up in private, it’s a
perfect fit:
extension List {
private init<G : GeneratorType where G.Element == Element>(var gen: G) {
if let head = gen.next() {
self = head |> List(gen: gen)
} else {
self = .Nil
}
}
}The potential bug here is kind of interesting. If, instead of an
infix operator for cons, we’d had a method on List that did
the same thing:
extension List {
public func prepended(with: Element) -> List<Element> {
return .Cons(head: with, tail: self)
}
}We’d be able to curry that function in a map(),
and get an init function that’s
very pretty:
extension List {
private init<G : GeneratorType where G.Element == Element>(var g: G) {
self = g.next().map(List(g: g).prepended) ?? .Nil
}
}But it won’t run. Since the recursive call to the function is
curried, it’s resolved before the g.next()
part. Which means that, regardless of whether g returns nil or not, the call will be made,
causing an infinite loop of sadness. To fix it, you have to make the
order of operations clear: do not make a recursive call if
g.next()
returns nil.
extension List {
private init<G : GeneratorType where G.Element == Element>(var gen: G) {
if let head = gen.next() {
self = head |> List(gen: gen)
} else {
self = .Nil
}
}
public init<S : SequenceType where S.Generator.Element == Element>(_ seq: S) {
self = List(gen: seq.generate())
}
}
extension List : ArrayLiteralConvertible {
public init(arrayLiteral: Element...) {
self = List(arrayLiteral.generate())
}
}This all makes it easy to initialise a list. Being able to see the list and its contents is also important. Currently, we’ve got this mess:
![]()
When what we really want is a comma-separated list of the contents.
We also probably want some demarcation at either end, so it’s easier to
recognise nested lists. I’m not sure what the best demarcation would be:
ideally it should be different to an Array’s square brackets, but not
confusing either. I went with [: and :] in the end,
though I’m not terribly happy about it:
![]()
To get that printout on the right-hand-side of your playground, you
need to make your type CustomDebugStringConvertible. There’s
one one interesting problem with this: how do you know the contents of
your list are printable? You can’t extend your struct to have
conditional conformance, like this:
extension List (where Element : CustomDebugStringConvertible) : CustomDebugStringConvertible {...However, you can’t just get a string representation of something that
doesn’t have one. Luckily, String has an
initialiser that takes anything. It uses runtime reflection to
do so. Here’s what the extension ends up looking like:
extension List : CustomDebugStringConvertible {
public var debugDescription: String {
return"[:" + ", ".join(map{String(reflecting: $0)}) + ":]"
}
}To use the join()
function, of course, List needs to
conform to SequenceType. We’ll
need some generator that swaps out the current List struct on
each iteration, and returns the head. You could just use anyGenerator but, since it’s a class,
it’s significantly slower than defining a new struct.
public struct ListGenerator<Element> : GeneratorType, SequenceType {
private var list: List<Element>
public mutating func next() -> Element? {
switch list {
case .Nil: return nil
case let .Cons(head, tail):
list = tail
return head
}
}
public func generate() -> ListGenerator { return self }
}
extension List : SequenceType {
public func generate() -> ListGenerator<Element> {
return ListGenerator(list: self)
}
}And you’ve got a SequenceType
that’s normal-looking and easy to work with.
Laziness
I’m not sure if this is entirely relevant here, but I do
like laziness, so I thought I’d make a version of List that was
lazy. It turns out it’s easy to do: in fact, it was possible before
indirect enums. So, starting with
the standard List
definition:
public enum LazyList<Element> {
case Nil
indirect case Cons(head: Element, tail: LazyList<Element>)
}Let’s make it lazy. The main idea would be to defer the resolution of
tail. What we really want is for
tail to be a function that returns a list, rather than a list
itself.
public enum LazyList<Element> {
case Nil
case Cons(head: Element, tail: () -> LazyList<Element>)
}This is the reason that indirect isn’t needed: because tail
isn’t a list, all that’s stored in the enum is the reference to the
function. This is what indirect
does automatically, or what the Box struct did
manually.
There are some more wrinkles with laziness. For instance, our old infix operator won’t work:
public func |> <T>(lhs: T, rhs: LazyList<T>) -> LazyList<T> {
return .Cons(head: lhs, tail: rhs)
}Again, because tail is meant to be a function that returns a list, not a list itself. This would work, but not in the way we intend it:
public func |> <T>(lhs: T, rhs: LazyList<T>) -> LazyList<T> {
return .Cons(head: lhs, tail: {rhs})
}Whatever’s to the right-hand-side of the operator will get resolved, and then put into the closure, which we don’t want. For instance, this:
func printAndGiveList() -> LazyList<Int> {
print(2)
return .Nil
}
2 |> 1 |> printAndGiveList()Will give you a “LazyList”, but
2 gets printed, meaning that it’s not really behaving
lazily.
@autoclosure to the rescue!
This is a little annotation you put before your parameters that can let
you decide when to evaluate the argument.
public func |> <T>(lhs: T, @autoclosure(escaping) rhs: () -> LazyList<T>) -> LazyList<T> {
return .Cons(head: lhs, tail: rhs)
}The escaping in the brackets is
needed to signify that the closure will last longer than the lifetime of
the scope it is declared in. If you test this new version with the printAndGiveList()
function, you’ll see that 2 does not get printed. In fact, the
behaviour of this operator lets us use a lot of the same code from the
strict list, without the strictness. (The generator
initialiser, for instance: the same code, if used to initialise a lazy
list, will work. In fact, if the underlying sequence that the generator
comes from is lazy, that laziness is maintained in the lazy
list. That’s pretty cool.)
There’s an interesting point to be made, here. The usual definition for a lazy programming language is one in which functions do not evaluate their arguments until they need to. In contrast, eager languages evaluate function arguments before the body of the function. This kind of makes it seem that you could treat Swift as a totally lazy language…
At any rate, this new-and-improved operator works exactly as we want
it. It’s properly lazy. The rest is easy: every time tail was used in List, replace it
with tail().
The Deque
Lists are useful. They let you operate on their first element in time, which makes a lot of sense, since you often find yourself starting there.
They’ve got some disadvantages, though: for one, to get to the nth element, you have to walk along n elements in the list. So while operations of the start are fast, operations on the end are painfully slow. And forget about efficient indexing.
This is where a Deque comes in. When you need to operate on two ends of a collection, a Deque is what you want to be using. Removal of the first and last element, prepending, and appending are all .
It’s made up of two lists: one for the front half, and one, in reverse, for the back half. With that information we’ve enough to get a definition down:
public struct Deque<Element> {
private var front, back: List<Element>
}You’ve got to do similar things that you did to the list to get an
easy-to-work-with struct. CustomDebugStringConvertible, ArrayLiteralConvertible, etc. It’s not
tremendously interesting, so here it is:
extension Deque : CustomDebugStringConvertible {
public var debugDescription: String {
return
", ".join(front.map{String(reflecting: $0)}) +
" | " +
", ".join(back.reverse().map{String(reflecting: $0)})
}
}
extension Deque {
public init(array: [Element]) {
let half = array.endIndex / 2
front = List(array[0..<half])
back = List(array[half..<array.endIndex].reverse())
}
}
extension Deque : ArrayLiteralConvertible {
public init(arrayLiteral: Element...) {
self.init(array: arrayLiteral)
}
}
extension Deque {
public init<S : SequenceType where S.Generator.Element == Element>(_ seq: S) {
self.init(array: Array(seq))
}
}The debug output puts a | between the
two lists:
![]()
This makes it clear how the performance characteristics come about: because the second half is a reversed list, all of the operations on the end of the Deque are operations on the beginning of a list. And that’s where lists are fast.
But there’s an obvious issue. Say we take that list, and start removing the first element from it:
let a = an.tail // 2, 3 | 4, 5, 6
let b = a.tail // 3 | 4, 5, 6
let c = b.tail // | 4, 5, 6
let d = c.tail // ?????The front will end up being empty. The solution to this is the second
important element to a Deque. It needs an invariant: if its number of
elements is greater than one, neither the front list nor the back will
be empty. When the invariant gets violated, it needs to fix it. We can
check that the invariant has been upheld with a switch statement:
extension Deque {
private mutating func check() {
switch (front, back) {
case (.Nil, let .Cons(head, tail)) where !tail.isEmpty: fix()
case (let .Cons(head, tail), .Nil) where !tail.isEmpty: fix()
default:
return
}
}
}The first case is the front is empty, and the back has more than one element, and the second case is the back is empty, and the front has more than one element. To fix it, just chop off the tail of the non-empty list, reverse it, and assign it to the empty list:
extension Deque {
private mutating func check() {
switch (front, back) {
case (.Nil, let .Cons(head, tail)) where !tail.isEmpty:
(front, back) = (tail.reverse(), [head])
case (let .Cons(head, tail), .Nil) where !tail.isEmpty:
(back, front) = (tail.reverse(), [head])
default:
return
}
}
}Now, wherever we have a mutating method that may cause a violation of
the invariant, this check is
called. One particularly cool way to do this is by using didSet:
public struct Deque<Element> {
private var front: List<Element> { didSet { check() } }
private var back : List<Element> { didSet { check() } }
}This will call check()
whenever either list is mutated, ensuring you can’t forget. If a
new Deque is initialised, though, it won’t be called. I don’t
trust myself to remember the check()
on every init, so we can put it into the initialiser:
private init(_ front: List<Element>, _ back: List<Element>) {
(self.front, self.back) = (front, back)
check()
}This is the only initialiser so far, so it’s the only one I’m allowed
to call. However, there may be some cases where I know that the
front and back are balanced. So I want a separate initialiser for those,
for efficiency’s sake. But it’s got to be called init no matter what, so how can I
specify that I want to use the non-checking initialiser, over the
checking one? I could have a function called something like initialiseFromBalanced that returns a
Deque, but I don’t like that. You could use labelled arguments. Erica
Sadun has a cool post on using them with subscripts, and here’s what
it would look like with init:
extension Deque {
private init(balancedFront: List<Element>, balancedBack: List<Element>) {
(front, back) = (balancedFront, balancedBack)
}
}So now we have a default initialiser that automatically balances the Deque, and a specialised one that takes two lists already balanced.
There is an extra function on lists in the check()
function: reverse().
There are a load of different ways to do it. If you’re in the mood for
golf:
let joanne: List = [1, 2, 3, 4, 5, 6]
joanne.reduce(.Nil) { $1 |> $0 } // 6, 5, 4, 3, 2, 1Or, if you’d like to keep it recursive:
extension List {
private func reverse(other: List<Element>) -> List<Element> {
switch self {
case .Nil: return other
case let .Cons(head, tail): return tail.reverse(head |> other)
}
}
public func reverse() -> List<Element> {
return reverse(.Nil)
}
}Obviously, you want to avoid this operation as much as possible. We’ll have to bear that in mind when we’re adding other functions.
So what kind of operations do we want on Deques? Well, removeFirst()
and removeLast()
would be a start:
extension Deque {
public mutating func removeFirst() -> Element {
return front.removeFirst()
}
public mutating func removeLast() -> Element {
return back.removeFirst()
}
}And the function on lists:
extension List {
public mutating func removeFirst() -> Element {
switch self {
case .Nil: fatalError("Cannot call removeFirst() on an empty list")
case let .Cons(head, tail):
self = tail
return head
}
}
}The other functions are easy enough to figure out: dropFirst(),
dropLast(),
etc. And, since it conforms to SequenceType, it gets all of the
sequence methods from the standard library, as well. However, those
methods are designed for other kinds of sequences - Arrays, String.CharacterViews,
etc. There are much more efficient ways to do most of them.
reverse, for instance, is just
this:
extension Deque {
public func reverse() -> Deque<Element> {
return Deque(balancedFront: back, balancedBack: front)
}
}(Since reverse can’t change the number of elements in either list, we
can use the initialiser that takes a balanced front and back.) Other
methods like map(),
filter(),
etc., will just give you back an array. If we wanted to keep the Deque,
we’d have to convert it back, which involves reversing, which is
expensive. So we should do our own methods for those:
extension Deque {
public func map<T>(@noescape transform: Element -> T) -> Deque<T> {
return Deque<T>(
balancedFront: front.map(transform),
balancedBack : back .map(transform)
)
}
}
extension Deque {
public func filter(@noescape includeElement: Element -> Bool) -> Deque<Element> {
return Deque(front.filter(includeElement), back.filter(includeElement))
}
}filter()
changes the number of elements in each list, which could cause violation
of the invariant. So we use the unlabelled initialiser, which
automatically check()s.
Notice that we don’t have to do any reversing here. This is a huge
efficiency gain, but you’ve got to bear in mind that we’re assuming the
order of execution of the closures for filter and map don’t matter. This isn’t always the
case. Take this function, which is supposed to skip two elements of a
sequence:
var i = 0
[Int](1...10).filter { _ in i++ % 3 == 0 } // [1, 4, 7, 10]It won’t work for a Deque:
Deque(1...10).filter { _ in i++ % 3 == 0 } // 1, 4 | 6, 9There’s been talk of a @pure
attribute. The idea is this: put it before your function or closure
name, and the compiler will verify that it has no side effects. It can
only use its arguments as variables, or call other @pure functions. It would be very useful
here, as it wouldn’t allow the i
to be used by filter. Without it,
you’ll probably just have to mention in the docs that the order of
execution is not knowable.
For completeness’ sake, there are also flatMap()s
for the Deque, implemented in a similar fashion to the functions
above:
extension Deque {
public func flatMap<T>(@noescape transform: Element -> Deque<T>) -> Deque<T> {
return Deque<T>(
front.flatMap{List(transform($0))},
back .flatMap{List(transform($0).reverse())}
)
}
public func flatMap<T>(@noescape transform: Element -> T?) -> Deque<T> {
return Deque<T>(
front.flatMap(transform),
back .flatMap(transform)
)
}
}All of this code is available as a playground, here. These two structs are also implemented a little more fully in SwiftSequence.
Since the only real constitutive part of the Deque is a list, it’s
probably possible to implement it lazily, by just substituting in LazyLists. Or, if you were feeling
adventurous, you could have one of the lists lazy, and one strict. This
isn’t as crazy as it sounds: reverse()
can only be performed eagerly, since the entire list needs to
be walked to get to the last element. So the front and back lists have
different functions (slightly). Also, because of the lazy initialisation
of LazyList, swapping between lazy
and strict needn’t be very expensive. I’ll leave it up to someone else
to try, though.