Tree Intermediate Language (Tree-IL) is a structured intermediate language that is close in expressive power to Scheme. It is an expanded, pre-analyzed Scheme.
Tree-IL is “structured” in the sense that its representation is
based on records, not S-expressions. This gives a rigidity to the
language that ensures that compiling to a lower-level language only
requires a limited set of transformations. For example, the Tree-IL
type <const>
is a record type with two fields, src
and
exp
. Instances of this type are created via make-const
.
Fields of this type are accessed via the const-src
and
const-exp
procedures. There is also a predicate, const?
.
See Records, for more information on records.
All Tree-IL types have a src
slot, which holds source location
information for the expression. This information, if present, will be
residualized into the compiled object code, allowing backtraces to
show source information. The format of src
is the same as that
returned by Guile’s source-properties
function. See Source Properties, for more information.
Although Tree-IL objects are represented internally using records,
there is also an equivalent S-expression external representation for
each kind of Tree-IL. For example, the S-expression representation
of #<const src: #f exp: 3>
expression would be:
(const 3)
Users may program with this format directly at the REPL:
scheme@(guile-user)> ,language tree-il Happy hacking with Tree Intermediate Language! To switch back, type `,L scheme'. tree-il@(guile-user)> (call (primitive +) (const 32) (const 10)) ⇒ 42
The src
fields are left out of the external representation.
One may create Tree-IL objects from their external representations via
calling parse-tree-il
, the reader for Tree-IL. If any source
information is attached to the input S-expression, it will be
propagated to the resulting Tree-IL expressions. This is probably the
easiest way to compile to Tree-IL: just make the appropriate external
representations in S-expression format, and let parse-tree-il
take care of the rest.
An empty expression. In practice, equivalent to Scheme’s (if #f
#f)
.
A reference to a “primitive”. A primitive is a procedure that, when
compiled, may be open-coded. For example, cons
is usually
recognized as a primitive, so that it compiles down to a single
instruction.
Compilation of Tree-IL usually begins with a pass that resolves some
<module-ref>
and <toplevel-ref>
expressions to
<primitive-ref>
expressions. The actual compilation pass has
special cases for calls to certain primitives, like apply
or
cons
.
A reference to a lexically-bound variable. The name is the original name of the variable in the source program. gensym is a unique identifier for this variable.
Sets a lexically-bound variable.
A reference to a variable in a specific module. mod should be
the name of the module, e.g. (guile-user)
.
If public? is true, the variable named name will be looked
up in mod’s public interface, and serialized with @
;
otherwise it will be looked up among the module’s private bindings,
and is serialized with @@
.
Sets a variable in a specific module.
References a variable from the current procedure’s module.
Sets a variable in the current procedure’s module.
Defines a new top-level variable in the current procedure’s module.
A conditional. Note that else is not optional.
A procedure call.
A call to a primitive. Equivalent to (call (primitive name)
. args)
. This construct is often more convenient to generate and
analyze than <call>
.
As part of the compilation process, instances of (call (primitive
name) . args)
are transformed into primcalls.
A sequence. The semantics is that head is evaluated first, and any resulting values are ignored. Then tail is evaluated, in tail position.
A closure. meta is an association list of properties for the
procedure. body is a single Tree-IL expression of type
<lambda-case>
. As the <lambda-case>
clause can chain to
an alternate clause, this makes Tree-IL’s <lambda>
have the
expressiveness of Scheme’s case-lambda
.
One clause of a case-lambda
. A lambda
expression in
Scheme is treated as a case-lambda
with one clause.
req is a list of the procedure’s required arguments, as symbols.
opt is a list of the optional arguments, or #f
if there
are no optional arguments. rest is the name of the rest
argument, or #f
.
kw is a list of the form, (allow-other-keys?
(keyword name var) ...)
, where keyword is the
keyword corresponding to the argument named name, and whose
corresponding gensym is var, or #f
if there are no keyword
arguments. inits are tree-il expressions corresponding to all of
the optional and keyword arguments, evaluated to bind variables whose
value is not supplied by the procedure caller. Each init
expression is evaluated in the lexical context of previously bound
variables, from left to right.
gensyms is a list of gensyms corresponding to all arguments: first all of the required arguments, then the optional arguments if any, then the rest argument if any, then all of the keyword arguments.
body is the body of the clause. If the procedure is called with
an appropriate number of arguments, body is evaluated in tail
position. Otherwise, if there is an alternate, it should be a
<lambda-case>
expression, representing the next clause to try.
If there is no alternate, a wrong-number-of-arguments error is
signaled.
Lexical binding, like Scheme’s let
. names are the original
binding names, gensyms are gensyms corresponding to the
names, and vals are Tree-IL expressions for the values.
exp is a single Tree-IL expression.
A version of <let>
that creates recursive bindings, like
Scheme’s letrec
, or letrec*
if in-order? is true.
A dynamic prompt. Instates a prompt named tag, an expression,
during the dynamic extent of the execution of body, also an
expression. If an abort occurs to this prompt, control will be passed
to handler, also an expression, which should be a procedure. The
first argument to the handler procedure will be the captured
continuation, followed by all of the values passed to the abort. If
escape-only? is true, the handler should be a <lambda>
with
a single <lambda-case>
body expression with no optional or
keyword arguments, and no alternate, and whose first argument is
unreferenced. See Prompts, for more information.
An abort to the nearest prompt with the name tag, an expression.
args should be a list of expressions to pass to the prompt’s
handler, and tail should be an expression that will evaluate to
a list of additional arguments. An abort will save the partial
continuation, which may later be reinstated, resulting in the
<abort>
expression evaluating to some number of values.
There are two Tree-IL constructs that are not normally produced by higher-level compilers, but instead are generated during the source-to-source optimization and analysis passes that the Tree-IL compiler does. Users should not generate these expressions directly, unless they feel very clever, as the default analysis pass will generate them as necessary.
Like Scheme’s receive
– binds the values returned by
evaluating exp
to the lambda
-like bindings described by
gensyms. That is to say, gensyms may be an improper list.
<let-values>
is an optimization of a <call>
to the
primitive, call-with-values
.
Like <letrec>
, but only for vals that are unset
lambda
expressions.
fix
is an optimization of letrec
(and let
).
Tree-IL is a convenient compilation target from source languages. It can be convenient as a medium for optimization, though CPS is usually better. The strength of Tree-IL is that it does not fix order of evaluation, so it makes some code motion a bit easier.
Optimization passes performed on Tree-IL currently include: