As explained above, the SCM type can represent all Scheme values.
Some values fit entirely into a SCM value (such as small integers),
but other values require additional storage in the heap (such as
strings and vectors). This additional storage is managed
automatically by Guile. You don’t need to explicitly deallocate it
when a SCM value is no longer used.
Two things must be guaranteed so that Guile is able to manage the storage automatically: it must know about all blocks of memory that have ever been allocated for Scheme values, and it must know about all Scheme values that are still being used. Given this knowledge, Guile can periodically free all blocks that have been allocated but are not used by any active Scheme values. This activity is called garbage collection.
Guile’s garbage collector will automatically discover references to
SCM objects that originate in global variables, static data
sections, function arguments or local variables on the C and Scheme
stacks, and values in machine registers. Other references to SCM
objects, such as those in other random data structures in the C heap
that contain fields of type SCM, can be made visible to the
garbage collector by calling the functions scm_gc_protect_object or
scm_permanent_object. Collectively, these values form the “root
set” of garbage collection; any value on the heap that is referenced
directly or indirectly by a member of the root set is preserved, and all
other objects are eligible for reclamation.
In Guile, garbage collection has two logical phases: the mark
phase, in which the collector discovers the set of all live objects,
and the sweep phase, in which the collector reclaims the resources
associated with dead objects. The mark phase pauses the program and
traces all SCM object references, starting with the root set.
The sweep phase actually runs concurrently with the main program,
incrementally reclaiming memory as needed by allocation.
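The two phases can be illustrated with a toy collector over a fixed array of objects. This is not Guile’s implementation: every name here (heap, toy_alloc, toy_collect, toy_is_live) is invented for the sketch, and the sweep runs eagerly in one pass rather than concurrently and incrementally as described above.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_OBJECTS 8
#define MAX_REFS 2

/* Toy heap object: up to two references to other objects, stored as
   indices into the heap array (-1 means no reference).  */
struct object
{
  bool allocated;
  bool marked;
  int refs[MAX_REFS];
};

static struct object heap[MAX_OBJECTS];

/* Allocate an object in the first free slot; return its index or -1.  */
int
toy_alloc (int ref0, int ref1)
{
  for (int i = 0; i < MAX_OBJECTS; i++)
    if (!heap[i].allocated)
      {
        heap[i].allocated = true;
        heap[i].marked = false;
        heap[i].refs[0] = ref0;
        heap[i].refs[1] = ref1;
        return i;
      }
  return -1;
}

/* Mark phase: visit everything reachable from object I.  The marked
   flag also stops the recursion on cycles.  */
static void
mark (int i)
{
  if (i < 0 || i >= MAX_OBJECTS || !heap[i].allocated || heap[i].marked)
    return;
  heap[i].marked = true;
  for (int r = 0; r < MAX_REFS; r++)
    mark (heap[i].refs[r]);
}

/* Mark from the roots, then sweep: free every allocated object that
   was not marked.  Returns the number of objects reclaimed.  */
int
toy_collect (const int *roots, size_t n_roots)
{
  int freed = 0;
  for (size_t i = 0; i < n_roots; i++)
    mark (roots[i]);
  for (int i = 0; i < MAX_OBJECTS; i++)
    {
      if (heap[i].allocated && !heap[i].marked)
        {
          heap[i].allocated = false;
          freed++;
        }
      heap[i].marked = false;   /* reset for the next collection */
    }
  return freed;
}

bool
toy_is_live (int i)
{
  return heap[i].allocated;
}
```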
In the mark phase, the garbage collector traces the Scheme stack and
heap precisely. Because the Scheme stack and heap are managed by
Guile, Guile can know precisely where in those data structures it might
find references to other heap objects. This is not the case,
unfortunately, for pointers on the C stack and static data segment.
Instead of requiring the user to inform Guile about all variables in C
that might point to heap objects, Guile traces the C stack and static
data segment conservatively. That is to say, Guile just treats
every word on the C stack and every C global variable as a potential
reference into the Scheme heap. Any value that looks like a pointer to a GC-managed
object is treated as such, whether it actually is a reference or not.
Thus, scanning the C stack and static data segment is guaranteed to find
all actual references, but it might also find words that only
accidentally look like references. These “false positives” might keep
SCM objects alive that would otherwise be considered dead. While
this might waste memory, keeping an object around longer than it
strictly needs to is harmless. This is why this technique is called
“conservative garbage collection”. In practice, the wasted memory
seems to be no problem, as the static C root set is almost always finite
and small, given that the Scheme stack is separate from the C stack.
The stack of every thread is scanned in this way. The scan also covers the CPU registers and all other memory locations where local variables or function parameters might show up.
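The idea of conservative scanning can be sketched with a toy heap of fixed-size cells. This is an illustration under simplifying assumptions, not the collector Guile uses: struct toy_heap, looks_like_reference and scan_conservatively are made-up names, and only words that point exactly at the start of a cell are accepted, whereas a real collector may apply different rules.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy heap: a contiguous address range divided into equal cells.  */
struct toy_heap
{
  uintptr_t start;
  uintptr_t end;        /* one past the last byte */
  size_t cell_size;
};

/* A word "looks like" a reference if it falls inside the heap and
   points at the start of a cell.  Whether it really is a pointer or
   just an integer with the same bit pattern cannot be determined.  */
bool
looks_like_reference (const struct toy_heap *h, uintptr_t word)
{
  return word >= h->start
         && word < h->end
         && (word - h->start) % h->cell_size == 0;
}

/* Treat every word as a potential reference and mark the cell it
   seems to point to.  Returns the number of newly marked cells.  */
size_t
scan_conservatively (const struct toy_heap *h, const uintptr_t *words,
                     size_t n_words, bool *marked)
{
  size_t hits = 0;
  for (size_t i = 0; i < n_words; i++)
    if (looks_like_reference (h, words[i]))
      {
        size_t cell = (words[i] - h->start) / h->cell_size;
        if (!marked[cell])
          {
            marked[cell] = true;
            hits++;
          }
      }
  return hits;
}
```

An integer that happens to equal the address of a cell marks that cell just as a genuine pointer would, which is the kind of false positive the text describes.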
The consequence of the conservative scanning is that you can just
declare local variables and function parameters of type SCM and
be sure that the garbage collector will not free the corresponding
objects.
However, a local variable or function parameter is only protected as
long as it is really on the stack (or in some register). As an
optimization, the C compiler might reuse its location for some other
value and the SCM object would no longer be protected. Normally,
this leads to exactly the right behavior: the compiler will only
overwrite a reference when it is no longer needed and thus the object
becomes unprotected precisely when the reference disappears, just as
wanted.
There are situations, however, where a SCM object needs to be
around longer than its reference from a local variable or function
parameter. This happens, for example, when you retrieve some pointer
from a foreign object and work with that pointer directly. The
reference to the SCM foreign object might be dead after the
pointer has been retrieved, but the pointer itself (and the memory
pointed to) is still in use and thus the foreign object must be
protected. The compiler does not know about this connection and might
overwrite the SCM reference too early.
To get around this problem, you can use scm_remember_upto_here_1
and its cousins. These functions keep the compiler from overwriting the
reference up to the point of the call. See Foreign Object Memory Management.
Note that Guile does not scan
the C heap for references, so a reference to a SCM object from a
memory segment allocated with malloc will have to use some other
means to keep the SCM object alive. See Function related to Garbage Collection.