We describe programs in Guile’s CPS language as being a kind of “soup” because all continuations in the program are mixed into the same “pot”, so to speak, without explicit markers as to what function or scope a continuation is in. A program in CPS is a map from continuation labels to continuation values. As discussed in the introduction, a continuation label is an integer. No label may be negative.
As a matter of convention, label 0 should map to the $kfun
continuation of the entry to the program, which should be a function of
no arguments. The body of a function consists of the labelled
continuations that are reachable from the function entry. A program can
refer to other functions, either via $fun
and $rec
in
higher-order CPS, or via $const-fun
, $callk
, and allocated
closures in first-order CPS. The program logically contains all
continuations of all functions reachable from the entry function. A
compiler pass may leave unreachable continuations in a program;
subsequent compiler passes should ensure that their transformations and
analyses only take reachable continuations into account. It’s OK though
if transformation runs over all continuations if including the
unreachable continuations has no effect on the transformations on the
live continuations.
The “soup” itself is implemented as an intmap, a functional
array-mapped trie specialized for integer keys. Intmaps associate
integers with values of any kind. Currently intmaps are a private data
structure only used by the CPS phase of the compiler. To work with
intmaps, load the (language cps intmap)
module:
(use-modules (language cps intmap))
Intmaps are functional data structures, so there is no constructor as such: one can simply start with the empty intmap and add entries to it.
(intmap? empty-intmap) ⇒ #t (define x (intmap-add empty-intmap 42 "hi")) (intmap? x) ⇒ #t (intmap-ref x 42) ⇒ "hi" (intmap-ref x 43) ⇒ error: 43 not present (intmap-ref x 43 (lambda (k) "yo!")) ⇒ "yo" (intmap-add x 42 "hej") ⇒ error: 42 already present
intmap-ref
and intmap-add
are the core of the intmap
interface. There is also intmap-replace
, which replaces the
value associated with a given key, requiring that the key was present
already, and intmap-remove
, which removes a key from an intmap.
Intmaps have a tree-like structure that is well-suited to set operations
such as union and intersection, so there are also the binary
intmap-union
and intmap-intersect
procedures. If the
result is equivalent to either argument, that argument is returned
as-is; in that way, one can detect whether the set operation produced a
new result simply by checking with eq?
. This makes intmaps
useful when computing fixed points.
If a key is present in both intmaps and the associated values are not
the same in the sense of eq?
, the resulting value is determined
by a “meet” procedure, which is the optional last argument to
intmap-union
, intmap-intersect
, and also to
intmap-add
, intmap-replace
, and similar functions. The
meet procedure will be called with the two values and should return the
intersected or unioned value in some domain-specific way. If no meet
procedure is given, the default meet procedure will raise an error.
To traverse over the set of values in an intmap, there are the
intmap-next
and intmap-prev
procedures. For example, if
intmap x has one entry mapping 42 to some value, we would have:
(intmap-next x) ⇒ 42 (intmap-next x 0) ⇒ 42 (intmap-next x 42) ⇒ 42 (intmap-next x 43) ⇒ #f (intmap-prev x) ⇒ 42 (intmap-prev x 42) ⇒ 42 (intmap-prev x 41) ⇒ #f
There is also the intmap-fold
procedure, which folds over keys
and values in the intmap from lowest to highest value, and
intmap-fold-right
which does so in the opposite direction. These
procedures may take up to 3 seed values. The number of values that the
fold procedure returns is the number of seed values.
(define q (intmap-add (intmap-add empty-intmap 1 2) 3 4)) (intmap-fold acons q '()) ⇒ ((3 . 4) (1 . 2)) (intmap-fold-right acons q '()) ⇒ ((1 . 2) (3 . 4))
When an entry in an intmap is updated (removed, added, or changed), a new intmap is created that shares structure with the original intmap. This operation ensures that the result of existing computations is not affected by future computations: no mutation is ever visible to user code. This is a great property in a compiler data structure, as it lets us hold a copy of a program before a transformation and use it while we build a post-transformation program. Updating an intmap is O(log n) in the size of the intmap.
However, the O(log n) allocation costs are sometimes too much,
especially in cases when we know that we can just update the intmap in
place. As an example, say we have an intmap mapping the integers 1 to
100 to the integers 42 to 141. Let’s say that we want to transform this
map by adding 1 to each value. There is already an efficient
intmap-map
procedure in the (language cps utils)
module,
but if we didn’t know about that we might do:
(define (intmap-increment map) (let lp ((k 0) (map map)) (let ((k (intmap-next map k))) (if k (let ((v (intmap-ref map k))) (lp (1+ k) (intmap-replace map k (1+ v)))) map))))
Observe that the intermediate values created by intmap-replace
are completely invisible to the program – only the last result of
intmap-replace
value is needed. The rest might as well share
state with the last one, and we could update in place. Guile allows
this kind of interface via transient intmaps, inspired by
Clojure’s transient interface (http://clojure.org/transients).
The in-place intmap-add!
and intmap-replace!
procedures
return transient intmaps. If one of these in-place procedures is called
on a normal persistent intmap, a new transient intmap is created. This
is an O(1) operation. In all other respects the interface is like their
persistent counterparts, intmap-add
and intmap-replace
.
If an in-place procedure is called on a transient intmap, the intmap is
mutated in-place and the same value is returned.
If a persistent operation like intmap-add
is called on a
transient intmap, the transient’s mutable substructure is then marked as
persistent, and intmap-add
then runs on a new persistent intmap
sharing structure but not state with the original transient. Mutating a
transient will cause enough copying to ensure that it can make its
change, but if part of its substructure is already “owned” by it, no
more copying is needed.
We can use transients to make intmap-increment
more efficient.
The two changed elements have been marked like this.
(define (intmap-increment map) (let lp ((k 0) (map map)) (let ((k (intmap-next map k))) (if k (let ((v (intmap-ref map k))) (lp (1+ k) (intmap-replace! map k (1+ v)))) (persistent-intmap map)))))
Be sure to tag the result as persistent using the
persistent-intmap
procedure to prevent the mutability from
leaking to other parts of the program. For added paranoia, you could
call persistent-intmap
on the incoming map, to ensure that if it
were already transient, that the mutations in the body of
intmap-increment
wouldn’t affect the incoming value.
In summary, programs in CPS are intmaps whose values are continuations.
See the source code of (language cps utils)
for a number of
useful facilities for working with CPS values.