Deques, Queues, and Lists in Swift with Indirect
Recursive enums have finally arrived. Woo! The first thing to do with these is to make a recursive list:
List<Element> {
public enum case Nil
case Cons(head: Element, tail: List<Element>)
indirect }
The head
stores the element,
and tail
is a reference to the
rest of the list. As you can imagine, getting at the head
is pretty easy, while accessing
elements further along is more difficult. There’s a common pattern for
dealing with these recursive structures: if you have a function that
performs some transformation on a list, it will take the head
, perform that transformation on it,
and then call itself recursively on the tail
. If it’s given an empty list, it
returns an empty list. For instance, here’s the map
function,
defined in Haskell:
map _ [] = []
map f (x:xs) = f x : map f xs
The two lines are analogous to a switch statement in Swift. The
parameters for map
are a
transformation function and a list. So, the first line has _
(wildcard) for the function, and
[]
(empty) for the list, meaning
it will match any function and an empty list. It returns an empty
list.
The second line matches a function (which it assigns the name f
) and then decomposes the list it’s
given into a head (x
) and tail
(xs
). It then calls f
on the head, and prepends (the :
operator is
prepends, also called “cons” by convention) the result to itself called
recursively on the tail.
With switch statements and the indirect
keyword, we’re getting pretty
close to that level of brevity (terseness?) in Swift:
List {
extension <T>(@noescape transform: Element -> T) -> List<T> {
public func map{
switch self case .Nil: return .Nil
case let .Cons(head, tail): return
.Cons(head: transform(head), tail: tail.map(transform))
}
}
}
We can define our own “cons”, to clean it up a little. We’re not
allowed to use :
, so I went
with |>
, which is,
in my mind, reasonably representative of “cons”.
|> {
infix operator
associativity right100
precedence }
|> <T>(lhs: T, rhs: List<T>) -> List<T> {
public func return .Cons(head: lhs, tail: rhs)
}
List {
extension <T>(@noescape transform: Element -> T) -> List<T> {
public func map{
switch self case .Nil: return .Nil
case let .Cons(head, tail):
return transform(head) |> tail.map(transform)
}
}
}
Pretty soon you can start doing some elegant and exciting things with
lists. The recursive pattern is very well suited to
higher-order functions and other FP staples. Take, for instance, the
reduce
function:
List {
extension <T>(initial: T, @noescape combine: (T, Element) -> T) -> T {
public func reduce{
switch self case .Nil: return initial
case let .Cons(h, t):
return t.reduce(combine(initial, h), combine: combine)
}
}
}
Or a transposing function:
<T>(mat: List<List<T>>) -> List<List<T>> {
func transpose{
switch mat case let .Cons(x, xs) where x.isEmpty: return transpose(xs)
case let .Cons(.Cons(x, xs), xss):
return (x |> xss.flatMap{$0.first}) |>
transpose(xs |> xss.map{$0.tail})
: return .Nil
default}
}
: List<List<Int>> = [[1, 2, 3], [1, 2, 3], [1, 2, 3]]
let jotranspose(jo) // [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
You can do foldr
, which is like
reduce
, but works in reverse:
List {
extension <T>(initial: T, @noescape combine: (element: Element, accumulator: T) -> T) -> T {
func foldr{
switch self case .Nil: return initial
case let .Cons(x, xs):
return combine(
: x,
element: xs.foldr(initial, combine: combine)
accumulator)
}
}
}
Using foldr
, you can get all of
the non-empty subsequences of a list:
List {
extension var subsequences: List<List<Element>> {
{
switch self case .Nil: return .Nil
case let .Cons(x, xs):
return [x] |> xs.subsequences.foldr([]) {
(ys, r) in ys |> (x |> ys) |> r
}
}
}
}
: List = [1, 2, 3]
let jo.subsequences // [[1], [2], [1, 2], [1, 3], [2, 3], [1, 2, 3]] jo
(these examples are all translated from the Haskell standard library) Lists are extremely fun, and some functions you would have found yourself writing on 10-15 lines can be got into 2-3. To get a better feel for playing around with lists, it’s useful to have them conform to some protocols that make them easier to work with in a playground.
For instance, making a list currently looks like this:
: List = 1 |> 2 |> 3 |> .Nil let jo
Which is fine, and better than:
: List = .Cons(head: 1, tail: .Cons(head: 2, tail: .Cons(head: 3, tail: .Nil))) let jo
but still not fantastic. The obvious next step is making List
ArrayLiteralConvertible
, but there’s a
small catch. We don’t have an append
function for lists (yet). So we
can’t, off the bat, do something like this:
List : ArrayLiteralConvertible {
extension init(arrayLiteral: Element...) {
public var ret: List<Element> = .Nil
for el in arrayLiteral { ret.append(el) }
= ret
self }
}
And nor do I think we’d want to. Operations on the end of lists are slow: you have to walk along the entire list every time.
We could reverse the sequence we want to turn into a list,
and prepend as we go. But… that’s inefficient too. Sure, Array
s are fast
to reverse, but other sequences aren’t. For those that can’t be reversed
lazily, you’re storing an extra sequence in memory unnecessarily.
But there’s something that we can use: generators. In Swift,
generators are like super-imperative, crazy-unsafe recursive lists. When
you can the next()
method on a generator, you get the “head” back. Crucially, though:
the generator is left with the tail. Making use of this fact
too often will lead to bugs, but if we wrap it up in private
, it’s a
perfect fit:
List {
extension private init<G : GeneratorType where G.Element == Element>(var gen: G) {
if let head = gen.next() {
= head |> List(gen: gen)
self } else {
= .Nil
self }
}
}
The potential bug here is kind of interesting. If, instead of an
infix operator for cons, we’d had a method on List
that did
the same thing:
List {
extension prepended(with: Element) -> List<Element> {
public func return .Cons(head: with, tail: self)
}
}
We’d be able to curry that function in a map()
,
and get an init
function that’s
very pretty:
List {
extension private init<G : GeneratorType where G.Element == Element>(var g: G) {
= g.next().map(List(g: g).prepended) ?? .Nil
self }
}
But it won’t run. Since the recursive call to the function is
curried, it’s resolved before the g.next()
part. Which means that, regardless of whether g
returns nil
or not, the call will be made,
causing an infinite loop of sadness. To fix it, you have to make the
order of operations clear: do not make a recursive call if
g.next()
returns nil
.
List {
extension private init<G : GeneratorType where G.Element == Element>(var gen: G) {
if let head = gen.next() {
= head |> List(gen: gen)
self } else {
= .Nil
self }
}
<S : SequenceType where S.Generator.Element == Element>(_ seq: S) {
public init= List(gen: seq.generate())
self }
}
List : ArrayLiteralConvertible {
extension init(arrayLiteral: Element...) {
public = List(arrayLiteral.generate())
self }
}
This all makes it easy to initialise a list. Being able to see the list and its contents is also important. Currently, we’ve got this mess:
When what we really want is a comma-separated list of the contents.
We also probably want some demarcation at either end, so it’s easier to
recognise nested lists. I’m not sure what the best demarcation would be:
ideally it should be different to an Array’s square brackets, but not
confusing either. I went with [:
and :]
in the end,
though I’m not terribly happy about it:
To get that printout on the right-hand-side of your playground, you
need to make your type CustomDebugStringConvertible
. There’s
one one interesting problem with this: how do you know the contents of
your list are printable? You can’t extend your struct to have
conditional conformance, like this:
List (where Element : CustomDebugStringConvertible) : CustomDebugStringConvertible {... extension
However, you can’t just get a string representation of something that
doesn’t have one. Luckily, String
has an
initialiser that takes anything. It uses runtime reflection to
do so. Here’s what the extension ends up looking like:
List : CustomDebugStringConvertible {
extension var debugDescription: String {
public "[:" + ", ".join(map{String(reflecting: $0)}) + ":]"
return}
}
To use the join()
function, of course, List
needs to
conform to SequenceType
. We’ll
need some generator that swaps out the current List
struct on
each iteration, and returns the head. You could just use anyGenerator
but, since it’s a class,
it’s significantly slower than defining a new struct.
<Element> : GeneratorType, SequenceType {
public struct ListGeneratorprivate var list: List<Element>
next() -> Element? {
public mutating func {
switch list case .Nil: return nil
case let .Cons(head, tail):
= tail
list return head
}
}
generate() -> ListGenerator { return self }
public func }
List : SequenceType {
extension generate() -> ListGenerator<Element> {
public func return ListGenerator(list: self)
}
}
And you’ve got a SequenceType
that’s normal-looking and easy to work with.
Laziness
I’m not sure if this is entirely relevant here, but I do
like laziness, so I thought I’d make a version of List
that was
lazy. It turns out it’s easy to do: in fact, it was possible before
indirect
enums. So, starting with
the standard List
definition:
<Element> {
public enum LazyListcase Nil
case Cons(head: Element, tail: LazyList<Element>)
indirect }
Let’s make it lazy. The main idea would be to defer the resolution of
tail
. What we really want is for
tail to be a function that returns a list, rather than a list
itself.
<Element> {
public enum LazyListcase Nil
case Cons(head: Element, tail: () -> LazyList<Element>)
}
This is the reason that indirect
isn’t needed: because tail
isn’t a list, all that’s stored in the enum is the reference to the
function. This is what indirect
does automatically, or what the Box
struct did
manually.
There are some more wrinkles with laziness. For instance, our old infix operator won’t work:
|> <T>(lhs: T, rhs: LazyList<T>) -> LazyList<T> {
public func return .Cons(head: lhs, tail: rhs)
}
Again, because tail is meant to be a function that returns a list, not a list itself. This would work, but not in the way we intend it:
|> <T>(lhs: T, rhs: LazyList<T>) -> LazyList<T> {
public func return .Cons(head: lhs, tail: {rhs})
}
Whatever’s to the right-hand-side of the operator will get resolved, and then put into the closure, which we don’t want. For instance, this:
printAndGiveList() -> LazyList<Int> {
func print(2)
return .Nil
}
2 |> 1 |> printAndGiveList()
Will give you a “LazyList
”, but
2 gets printed, meaning that it’s not really behaving
lazily.
@autoclosure
to the rescue!
This is a little annotation you put before your parameters that can let
you decide when to evaluate the argument.
|> <T>(lhs: T, @autoclosure(escaping) rhs: () -> LazyList<T>) -> LazyList<T> {
public func return .Cons(head: lhs, tail: rhs)
}
The escaping
in the brackets is
needed to signify that the closure will last longer than the lifetime of
the scope it is declared in. If you test this new version with the printAndGiveList()
function, you’ll see that 2 does not get printed. In fact, the
behaviour of this operator lets us use a lot of the same code from the
strict list, without the strictness. (The generator
initialiser, for instance: the same code, if used to initialise a lazy
list, will work. In fact, if the underlying sequence that the generator
comes from is lazy, that laziness is maintained in the lazy
list. That’s pretty cool.)
There’s an interesting point to be made, here. The usual definition for a lazy programming language is one in which functions do not evaluate their arguments until they need to. In contrast, eager languages evaluate function arguments before the body of the function. This kind of makes it seem that you could treat Swift as a totally lazy language…
At any rate, this new-and-improved operator works exactly as we want
it. It’s properly lazy. The rest is easy: every time tail
was used in List
, replace it
with tail()
.
The Deque
Lists are useful. They let you operate on their first element in time, which makes a lot of sense, since you often find yourself starting there.
They’ve got some disadvantages, though: for one, to get to the nth element, you have to walk along n elements in the list. So while operations of the start are fast, operations on the end are painfully slow. And forget about efficient indexing.
This is where a Deque comes in. When you need to operate on two ends of a collection, a Deque is what you want to be using. Removal of the first and last element, prepending, and appending are all .
It’s made up of two lists: one for the front half, and one, in reverse, for the back half. With that information we’ve enough to get a definition down:
Deque<Element> {
public struct private var front, back: List<Element>
}
You’ve got to do similar things that you did to the list to get an
easy-to-work-with struct. CustomDebugStringConvertible
, ArrayLiteralConvertible
, etc. It’s not
tremendously interesting, so here it is:
Deque : CustomDebugStringConvertible {
extension var debugDescription: String {
public return
", ".join(front.map{String(reflecting: $0)}) +
" | " +
", ".join(back.reverse().map{String(reflecting: $0)})
}
}
Deque {
extension init(array: [Element]) {
public = array.endIndex / 2
let half = List(array[0..<half])
front = List(array[half..<array.endIndex].reverse())
back }
}
Deque : ArrayLiteralConvertible {
extension init(arrayLiteral: Element...) {
public .init(array: arrayLiteral)
self}
}
Deque {
extension <S : SequenceType where S.Generator.Element == Element>(_ seq: S) {
public init.init(array: Array(seq))
self}
}
The debug output puts a |
between the
two lists:
This makes it clear how the performance characteristics come about: because the second half is a reversed list, all of the operations on the end of the Deque are operations on the beginning of a list. And that’s where lists are fast.
But there’s an obvious issue. Say we take that list, and start removing the first element from it:
= an.tail // 2, 3 | 4, 5, 6
let a = a.tail // 3 | 4, 5, 6
let b = b.tail // | 4, 5, 6
let c = c.tail // ????? let d
The front will end up being empty. The solution to this is the second
important element to a Deque. It needs an invariant: if its number of
elements is greater than one, neither the front list nor the back will
be empty. When the invariant gets violated, it needs to fix it. We can
check that the invariant has been upheld with a switch
statement:
Deque {
extension private mutating func check() {
switch (front, back) {
case (.Nil, let .Cons(head, tail)) where !tail.isEmpty: fix()
case (let .Cons(head, tail), .Nil) where !tail.isEmpty: fix()
:
defaultreturn
}
}
}
The first case is the front is empty, and the back has more than one element, and the second case is the back is empty, and the front has more than one element. To fix it, just chop off the tail of the non-empty list, reverse it, and assign it to the empty list:
Deque {
extension private mutating func check() {
switch (front, back) {
case (.Nil, let .Cons(head, tail)) where !tail.isEmpty:
(front, back) = (tail.reverse(), [head])
case (let .Cons(head, tail), .Nil) where !tail.isEmpty:
(back, front) = (tail.reverse(), [head])
:
defaultreturn
}
}
}
Now, wherever we have a mutating method that may cause a violation of
the invariant, this check
is
called. One particularly cool way to do this is by using didSet
:
Deque<Element> {
public struct private var front: List<Element> { didSet { check() } }
private var back : List<Element> { didSet { check() } }
}
This will call check()
whenever either list is mutated, ensuring you can’t forget. If a
new Deque is initialised, though, it won’t be called. I don’t
trust myself to remember the check()
on every init, so we can put it into the initialiser:
private init(_ front: List<Element>, _ back: List<Element>) {
(self.front, self.back) = (front, back)
check()
}
This is the only initialiser so far, so it’s the only one I’m allowed
to call. However, there may be some cases where I know that the
front and back are balanced. So I want a separate initialiser for those,
for efficiency’s sake. But it’s got to be called init
no matter what, so how can I
specify that I want to use the non-checking initialiser, over the
checking one? I could have a function called something like initialiseFromBalanced
that returns a
Deque, but I don’t like that. You could use labelled arguments. Erica
Sadun has a cool post on using them with subscripts, and here’s what
it would look like with init
:
Deque {
extension private init(balancedFront: List<Element>, balancedBack: List<Element>) {
(front, back) = (balancedFront, balancedBack)
}
}
So now we have a default initialiser that automatically balances the Deque, and a specialised one that takes two lists already balanced.
There is an extra function on lists in the check()
function: reverse()
.
There are a load of different ways to do it. If you’re in the mood for
golf:
: List = [1, 2, 3, 4, 5, 6]
let joanne.reduce(.Nil) { $1 |> $0 } // 6, 5, 4, 3, 2, 1 joanne
Or, if you’d like to keep it recursive:
List {
extension private func reverse(other: List<Element>) -> List<Element> {
{
switch self case .Nil: return other
case let .Cons(head, tail): return tail.reverse(head |> other)
}
}
reverse() -> List<Element> {
public func return reverse(.Nil)
}
}
Obviously, you want to avoid this operation as much as possible. We’ll have to bear that in mind when we’re adding other functions.
So what kind of operations do we want on Deques? Well, removeFirst()
and removeLast()
would be a start:
Deque {
extension removeFirst() -> Element {
public mutating func return front.removeFirst()
}
removeLast() -> Element {
public mutating func return back.removeFirst()
}
}
And the function on lists:
List {
extension removeFirst() -> Element {
public mutating func {
switch self case .Nil: fatalError("Cannot call removeFirst() on an empty list")
case let .Cons(head, tail):
= tail
self return head
}
}
}
The other functions are easy enough to figure out: dropFirst()
,
dropLast()
,
etc. And, since it conforms to SequenceType
, it gets all of the
sequence methods from the standard library, as well. However, those
methods are designed for other kinds of sequences - Array
s, String.CharacterView
s,
etc. There are much more efficient ways to do most of them.
reverse
, for instance, is just
this:
Deque {
extension reverse() -> Deque<Element> {
public func return Deque(balancedFront: back, balancedBack: front)
}
}
(Since reverse can’t change the number of elements in either list, we
can use the initialiser that takes a balanced front and back.) Other
methods like map()
,
filter()
,
etc., will just give you back an array. If we wanted to keep the Deque,
we’d have to convert it back, which involves reversing, which is
expensive. So we should do our own methods for those:
Deque {
extension <T>(@noescape transform: Element -> T) -> Deque<T> {
public func mapreturn Deque<T>(
: front.map(transform),
balancedFront: back .map(transform)
balancedBack )
}
}
Deque {
extension filter(@noescape includeElement: Element -> Bool) -> Deque<Element> {
public func return Deque(front.filter(includeElement), back.filter(includeElement))
}
}
filter()
changes the number of elements in each list, which could cause violation
of the invariant. So we use the unlabelled initialiser, which
automatically check()
s.
Notice that we don’t have to do any reversing here. This is a huge
efficiency gain, but you’ve got to bear in mind that we’re assuming the
order of execution of the closures for filter
and map
don’t matter. This isn’t always the
case. Take this function, which is supposed to skip two elements of a
sequence:
var i = 0
[Int](1...10).filter { _ in i++ % 3 == 0 } // [1, 4, 7, 10]
It won’t work for a Deque:
Deque(1...10).filter { _ in i++ % 3 == 0 } // 1, 4 | 6, 9
There’s been talk of a @pure
attribute. The idea is this: put it before your function or closure
name, and the compiler will verify that it has no side effects. It can
only use its arguments as variables, or call other @pure
functions. It would be very useful
here, as it wouldn’t allow the i
to be used by filter
. Without it,
you’ll probably just have to mention in the docs that the order of
execution is not knowable.
For completeness’ sake, there are also flatMap()
s
for the Deque, implemented in a similar fashion to the functions
above:
Deque {
extension <T>(@noescape transform: Element -> Deque<T>) -> Deque<T> {
public func flatMapreturn Deque<T>(
.flatMap{List(transform($0))},
front.flatMap{List(transform($0).reverse())}
back )
}
<T>(@noescape transform: Element -> T?) -> Deque<T> {
public func flatMapreturn Deque<T>(
.flatMap(transform),
front.flatMap(transform)
back )
}
}
All of this code is available as a playground, here. These two structs are also implemented a little more fully in SwiftSequence.
Since the only real constitutive part of the Deque is a list, it’s
probably possible to implement it lazily, by just substituting in LazyList
s. Or, if you were feeling
adventurous, you could have one of the lists lazy, and one strict. This
isn’t as crazy as it sounds: reverse()
can only be performed eagerly, since the entire list needs to
be walked to get to the last element. So the front and back lists have
different functions (slightly). Also, because of the lazy initialisation
of LazyList
, swapping between lazy
and strict needn’t be very expensive. I’ll leave it up to someone else
to try, though.