There are currently about 150 instructions in Guile’s virtual machine. These instructions represent atomic units of a program’s execution. Ideally, they perform one task without conditional branches, then dispatch to the next instruction in the stream.
Instructions themselves are composed of one or more 32-bit units. The low 8 bits of the first word indicate the opcode, and the rest of the instruction describes the operands. There are a number of different ways operands can be encoded.
sn
An unsigned n-bit integer, indicating the sp-relative index of a local variable.
fn
An unsigned n-bit integer, indicating the fp-relative index of a local variable. Used when a continuation accepts a variable number of values, to shuffle received values into known locations in the frame.
cn
An unsigned n-bit integer, indicating a constant value.
l24
An offset from the current ip, in 32-bit units, as a signed 24-bit value. Indicates a bytecode address, for a relative jump.
zi16
i16
i32
An immediate Scheme value (see Immediate Objects), encoded directly in 16 or 32 bits. zi16 is sign-extended; the others are zero-extended.
a32
b32
An immediate Scheme value, encoded as a pair of 32-bit words. a32 and b32 values always go together on the same opcode, and indicate the high and low bits, respectively. Normally only used on 64-bit systems.
n32
A statically allocated non-immediate. The address of the non-immediate is encoded as a signed 32-bit integer, and indicates a relative offset in 32-bit units. Think of it as SCM x = ip + offset.
r32
An indirect Scheme value, like n32 but indirected. Think of it as SCM *x = ip + offset.
l32
lo32
An ip-relative address, as a signed 32-bit integer. Could indicate a bytecode address, as in make-closure, or a non-immediate address, as with static-patch!.
l32 and lo32 are the same from the perspective of the virtual machine. The difference is that an assembler might want to allow an lo32 address to be specified as a label and then some number of words offset from that label, for example when patching a field of a statically allocated object.
v32:x8-l24
Almost all VM instructions have a fixed size. The jtable instruction, used to perform optimized case branches, is an exception: it uses a v32 trailing word to indicate the number of additional words in the instruction, which themselves are encoded as x8-l24 values.
b1
A boolean value: 1 for true, otherwise 0.
xn
An ignored sequence of n bits.
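To make the l24 encoding concrete, here is a minimal C sketch of how a decoder might pull the opcode and a signed 24-bit offset out of one instruction word and turn the offset into a jump target. The function names are hypothetical and are not taken from Guile's sources; the sketch also assumes the l24 operand fills the upper 24 bits of the word, as it does in an x8-l24 layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the opcode lives in the low 8 bits of the
   first word of an instruction. */
static inline uint8_t word_opcode(uint32_t word)
{
  return (uint8_t)(word & 0xffu);
}

/* Hypothetical helper: extract an l24 operand, assumed to occupy the
   upper 24 bits of the word, and sign-extend it to 32 bits. */
static inline int32_t word_l24(uint32_t word)
{
  uint32_t raw = word >> 8;                 /* 24-bit field, 0..0xffffff */
  /* Flip the sign bit, then subtract it back: a portable sign extension. */
  return (int32_t)(raw ^ 0x800000u) - 0x800000;
}

/* The offset counts 32-bit units, so plain pointer arithmetic on a
   uint32_t pointer yields the target address. */
static inline const uint32_t *jump_target(const uint32_t *ip, uint32_t word)
{
  assert(ip != 0);
  return ip + word_l24(word);
}
```

For instance, a word whose upper 24 bits are all ones decodes to an offset of -1, which would move ip back by one 32-bit unit.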
An instruction is specified by giving its name, then describing its operands. The operands are packed by 32-bit words, with earlier operands occupying the lower bits.
For example, consider the following instruction specification:
call f24:proc x8:_ c24:nlocals
The first word in the instruction will start with the 8-bit value corresponding to the call opcode in the low bits, followed by proc as a 24-bit value. The second word starts with 8 dead bits, followed by nlocals as a 24-bit immediate value.
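The packing rule for this example can be sketched in C as follows. This is only an illustration of the layout described above, not Guile's actual decoder: the struct and function names are hypothetical, and no real opcode number is assumed.

```c
#include <stdint.h>

/* Hypothetical container for the fields of a two-word instruction laid
   out as: f24:proc x8:_ c24:nlocals. */
struct call_operands
{
  uint8_t opcode;    /* low 8 bits of the first word */
  uint32_t proc;     /* f24: upper 24 bits of the first word */
  uint32_t nlocals;  /* c24: upper 24 bits of the second word */
};

/* Decode the two words starting at ip.  Earlier operands occupy the
   lower bits of each word, so each field is found by masking or
   shifting. */
static struct call_operands decode_call(const uint32_t *ip)
{
  struct call_operands ops;
  ops.opcode = (uint8_t)(ip[0] & 0xffu);
  ops.proc = ip[0] >> 8;
  ops.nlocals = ip[1] >> 8;   /* the low 8 bits (x8) are ignored */
  return ops;
}
```

Because the x8 field is dead, whatever bits an assembler leaves in the low byte of the second word have no effect on the decoded operands.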
For instructions with operands that encode references to the stack, the interpretation of those stack values is up to the instruction itself. Most instructions expect their operands to be tagged SCM values (scm representation), but some instructions expect unboxed integers (u64 and s64 representations) or floating-point numbers (f64 representation). It is assumed that the bits for a u64 value are the same as those for an s64 value, and that s64 values are stored in two’s complement.
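The assumption about u64 and s64 sharing the same bits can be illustrated with a short C sketch. The helper names are hypothetical; the point is only that a 64-bit stack slot can be read as either representation without changing its bits.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reinterpret the bits of a u64 slot as an s64.
   memcpy copies the bit pattern unchanged; int64_t is required by the
   C standard to use two's complement. */
static inline int64_t u64_bits_as_s64(uint64_t bits)
{
  int64_t out;
  memcpy(&out, &bits, sizeof out);
  return out;
}

/* Hypothetical helper: the reverse direction.  Converting a signed
   value to uint64_t is defined modulo 2^64, which yields the same bit
   pattern as the two's complement representation. */
static inline uint64_t s64_bits_as_u64(int64_t value)
{
  return (uint64_t)value;
}
```

Under these assumptions, a slot holding all one bits reads as the largest u64 value or as the s64 value -1, depending on which representation the instruction expects.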
Instructions have static types: they must receive their operands in the format they expect. It’s up to the compiler to ensure this is the case.
Unless otherwise mentioned, all operands and results are in the scm representation.