As mentioned before, Guile compiles all code to bytecode, and that bytecode is contained in ELF images. See Object File Format, for more on Guile’s use of ELF.
To produce a bytecode image, Guile provides an assembler and a linker.
The assembler, defined in the (system vm assembler) module, has a relatively straightforward imperative interface. It provides a make-assembler function to instantiate an assembler and a set of emit-inst procedures to emit instructions of each kind.
The emit-inst procedures are actually generated at compile-time from a machine-readable description of the VM. With a few exceptions for certain operand types, each operand of an emit procedure corresponds to an operand of the corresponding instruction.
Consider allocate-words, from Memory Access Instructions. It is documented as:
allocate-words s12:dst s12:nwords
Therefore the emit procedure has the form:
emit-allocate-words asm dst nwords
All emit procedures take the assembler as their first argument, and return no useful values.
The argument types depend on the operand types. See Instruction Set. Most are integers within a restricted range, though labels are generally expressed as opaque symbols. Besides the emitters that correspond to instructions, there are a few additional helpers defined in the assembler module.
Define a label at the current program point.
Associate source with the current program point.
Macro-instructions to implement compilation-unit caches. A single cache cell corresponding to key will be allocated for the compilation unit.
Load the Scheme datum constant into dst.
Delimit the bounds of a procedure, with the given label and the metadata properties.
Load a procedure with the given label into local dst. This macro-instruction should only be used with procedures without free variables – procedures that are not closures.
Delimit a clause of a procedure.
The linker is a complicated beast. Hackers interested in how it works would do well to read Ian Lance Taylor’s series of articles on linkers. Searching the internet should find them easily. From the user’s perspective, there is only one knob to control: whether the resulting image will be written out to a file or not. If the user passes #:to-file? #t as part of the compiler options (see The Scheme Compiler), the linker will align the resulting segments on page boundaries, and otherwise not.
Link an ELF image, and return the bytevector. If page-aligned? is true, Guile will align the segments with different permissions on page-sized boundaries, in order to maximize code sharing between different processes. Otherwise, padding is minimized, to minimize address space consumption.
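As a rough illustration of how the assembler and linker fit together, the following sketch emits a procedure that returns a constant, links it without page alignment, and loads the result with load-thunk-from-memory from (system vm loader). It assumes Guile 3.0. The operands given to emit-begin-standard-arity (a closure flag, the list of required arguments, the number of locals, and an alternate clause label) and the slot numbering are assumptions based on that version and may differ in others. The names make-constant-thunk and answer are hypothetical.

```scheme
(use-modules (system vm assembler)
             (system vm loader))

;; Assemble a procedure that returns VAL, link it into an in-memory ELF
;; image, and load it.  Operand conventions assume Guile 3.0.
(define (make-constant-thunk val)
  (let ((asm (make-assembler)))
    (emit-begin-program asm 'answer '((name . answer)))
    ;; Assumed operands: has-closure? req nlocals alternate.
    (emit-begin-standard-arity asm #t '() 1 #f)
    (emit-load-constant asm 0 val)   ; store VAL in slot 0
    (emit-return-values asm)         ; return the contents of the frame
    (emit-end-arity asm)
    (emit-end-program asm)
    ;; No file is involved, so skip page alignment to minimize padding.
    (load-thunk-from-memory (link-assembly asm #:page-aligned? #f))))

(define answer (make-constant-thunk 42))
(display (answer))
(newline)
```

Passing #:page-aligned? #f here follows the trade-off described above: the image lives only in this process, so there is nothing to gain from sharing pages.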
To write an image to disk, just use put-bytevector from (ice-9 binary-ports).
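A minimal sketch of writing an image out by hand might look as follows. It obtains the image from compile with #:to 'bytecode rather than driving the linker directly, and passes #:to-file? #t through #:opts so that the segments are page-aligned. The helpers write-image and read-image and the path under /tmp are hypothetical; the final check only confirms that the bytes on disk match the image and begin with the ELF magic number.

```scheme
(use-modules (system base compile)
             (ice-9 binary-ports)
             (rnrs bytevectors))

;; Compile an expression to a page-aligned bytecode image.
(define image
  (compile '(+ 1 2) #:to 'bytecode #:opts '(#:to-file? #t)))

;; Write IMAGE to the file named PATH, in binary mode.
(define (write-image image path)
  (let ((port (open-file path "wb")))
    (put-bytevector port image)
    (close-port port)))

;; Read the whole file named PATH back as a bytevector.
(define (read-image path)
  (let* ((port (open-file path "rb"))
         (bv (get-bytevector-all port)))
    (close-port port)
    bv))

;; Hypothetical output location.
(define image-path "/tmp/guile-image-example.go")

(write-image image image-path)
(define image-on-disk (read-image image-path))

;; An ELF image starts with the byte #x7f followed by "ELF".
(define (elf-magic? bv)
  (and (>= (bytevector-length bv) 4)
       (= (bytevector-u8-ref bv 0) #x7f)
       (= (bytevector-u8-ref bv 1) (char->integer #\E))
       (= (bytevector-u8-ref bv 2) (char->integer #\L))
       (= (bytevector-u8-ref bv 3) (char->integer #\F))))
```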
Compiling object code to the fake language, value, is performed by loading the object code into a program, then executing that thunk with respect to the compilation environment. Normally the environment propagates through the compiler transparently, but users may specify the compilation environment manually as well, as a module. Procedures to load images can be found in the (system vm loader) module:
(use-modules (system vm loader))
Load object code from a file named file. The file will be mapped into memory via mmap, so this is a very fast operation.
Load object code from a bytevector. The data will be copied out of the bytevector in order to ensure proper alignment of embedded Scheme values.
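The two loading paths can be sketched side by side, assuming the procedures described here are load-thunk-from-memory and load-thunk-from-file as exported by (system vm loader) in Guile 3.0. The images come from compile with #:to 'bytecode; the file variant requests page alignment via #:to-file? #t. The path under /tmp and the variable names are hypothetical.

```scheme
(use-modules (system base compile)
             (system vm loader)
             (ice-9 binary-ports))

;; Load from memory: the image is copied, so it need not be page-aligned.
(define memory-image (compile '(+ 1 2) #:to 'bytecode))
(define memory-thunk (load-thunk-from-memory memory-image))

;; Load from a file: ask the linker for page-aligned segments, write the
;; image out, then map it back in.
(define file-image
  (compile '(* 6 7) #:to 'bytecode #:opts '(#:to-file? #t)))

;; Hypothetical output location.
(define example-path "/tmp/guile-loader-example.go")

(let ((port (open-file example-path "wb")))
  (put-bytevector port file-image)
  (close-port port))

(define file-thunk (load-thunk-from-file example-path))

;; Each loaded image is a thunk; calling it evaluates the expression.
(format #t "from memory: ~a\n" (memory-thunk))
(format #t "from file: ~a\n" (file-thunk))
```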
Additionally there are procedures to find the ELF image for a given pointer, or to list all mapped ELF images:
Given the integer value ptr, find and return the ELF image that contains that pointer, as a bytevector. If no image is found, return #f. This routine is mostly used by debuggers and other introspective tools.
Return all mapped ELF images, as a list of bytevectors.
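For illustration, the sketch below lists the mapped images and then looks one of them up again by an address inside it. It assumes the two procedures are exported from (system vm loader) as all-mapped-elf-images and find-mapped-elf-image, and it uses bytevector->pointer and pointer-address from (system foreign) to obtain an integer address. The helper elf-image? and the 16-byte offset are hypothetical choices.

```scheme
(use-modules (system vm loader)
             (system foreign)
             (rnrs bytevectors))

(define images (all-mapped-elf-images))

;; Does BV start with the ELF magic number, #x7f followed by "ELF"?
(define (elf-image? bv)
  (and (>= (bytevector-length bv) 4)
       (= (bytevector-u8-ref bv 0) #x7f)
       (= (bytevector-u8-ref bv 1) (char->integer #\E))
       (= (bytevector-u8-ref bv 2) (char->integer #\L))
       (= (bytevector-u8-ref bv 3) (char->integer #\F))))

(format #t "~a mapped ELF images\n" (length images))

;; Look up the image containing an address a few bytes past the start
;; of the first mapped image, if there is one.
(define found
  (and (pair? images)
       (let* ((image (car images))
              (base (pointer-address (bytevector->pointer image))))
         (find-mapped-elf-image (+ base 16)))))

(when found
  (format #t "found an image of ~a bytes\n" (bytevector-length found)))
```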