See: Description
Interface | Description |
---|---|
Array<E> |
General interface to arrays of arbitrary dimension.
|
AttributePredicate |
A predicate that (only) matches a ATTRIBUTE_VALUE.
|
AVector<E> | |
CharSeq |
A sequence where each element is a character (Unicode code point).
|
Consumable |
An object that can send its contents to a Consumer.
|
Consumer |
A Consumer is something that will accept data (output),
and do something with it.
|
ElementPredicate |
A predicate that (only) matches a ELEMENT_VALUE.
|
GVector<E> |
A "generalized vector" - a randomly-acessible sequence.
|
IntSequence | |
ItemPredicate |
A predicate (or type) on an item in a sequence.
|
NodePredicate |
A predicate that (only) matches only "nodes" in the XML sense.
|
PositionConsumer |
An object that can be "fed" a TreePosition, and will do something with it.
|
Sequence<E> |
A Sequence is an ordered list of elements.
|
XConsumer |
A Consumer extended with XML-specific methods.
|
Class | Description |
---|---|
AbstractCharVector<E> | |
AbstractSequence<E> |
A collection of default methods for implementing sequence-like classes.
|
Arrays | |
BitVector |
Simple adjustable-length vector of Boolean values.
|
Blob |
Binary data which may represent text or other information.
|
ByteVector<E> |
Simple adjustable-length vector of signed or unsigned 8-bit integers (bytes).
|
CharBuffer |
Editable character sequence using a buffer-gap implementation and
self-adjusting position.
|
CharVector |
Simple adjustable-length vector whose elements are 16-bit chars.
|
ComposedArray<E> |
Indexing "composes" with a set of indexing arrays.
|
ComposedArray.AsSequence<E> |
Same as ComposedArray but also implements AVector.
|
ConsumerWriter |
A Writer that wraps (filters) a Consumer.
|
Convert |
A class that encapsulates conversions between primitive and Object.
|
EmptyList |
This singleton class represents an empty list.
|
EofClass | |
ExtPosition<E,ESEQ extends AbstractSequence<E>> |
A SeqPosition for sequences that need more than a Pos int for a position.
|
ExtSequence<E> |
Abstract helper class for Sequences that use an ExtPosition.
|
F32Vector |
Simple adjustable-length vector of 32-bit floats.
|
F64Vector |
Simple adjustable-length vector of 64-bit doubles.
|
FilterConsumer |
A Consumer that wraps some other Consumer.
|
FlattenedArray<E> |
View an array as a vector, with the former's elements in row-major order.
|
FString |
Simple adjustable-length vector whose elements are 32-bit code points
Used for the Scheme string type.
|
FVector<E> |
Simple adjustable-length vector of objects.
|
GeneralArray<E> |
A class to handle general multi-dimensional arrays.
|
GeneralArray1<E> | |
ImmutablePair | |
IndirectIndexedSeq<E> |
Wrap a List (or an indexed selection of it) as a Sequence.
|
IntVector<E> |
Simple adjustable-length vector of signed or unsigned 32-bit integers (ints).
|
IString |
A string implementation with contant-time codepoint indexing.
|
IString.SubString | |
LList |
Semi-abstract class for traditions Lisp-style lists.
|
LongVector<E> |
Simple adjustable-length vector of signed or unsigned 64-bit integers (longs).
|
Pair |
A "pair" object, as used in Lisp-like languages.
|
PairWithPosition |
A
Pair with the file name and position it was read from. |
PositionManager | |
PrimIntegerVector<E> | |
PrintConsumer |
A Consumer that extends a PrintWriter.
|
Range<E> | |
Range.IntRange | |
RAPair | |
RAPair.Node | |
S16Vector |
Simple adjustable-length vector of signed 16-bit integers (shorts).
|
S32Vector |
Simple adjustable-length vector of signed 32-bit integers (ints).
|
S64Vector |
Simple adjustable-length vector of signed 64-bit integers (longs).
|
S8Vector |
Simple adjustable-length vector of signed 8-bit integers (bytes).
|
SeqPosition<E,ESEQ extends AbstractSequence<E>> |
A position in a sequence (list).
|
Sequences | |
Sequences.CharacterIterator |
Iterator subclass to iterate of CharSequences.
|
ShortVector<E> |
Simple adjustable-length vector of signed or unsigned 16-bit integers (shorts).
|
SimpleVector<E> |
A generic simple vector.
|
StableManager |
Implements a stable sequence with sticky positions.
|
Strings |
Various static utility methods for general strings (CharSeqs).
|
SubSequence<E> |
A sequence consisting of a sub-range of the elements of a base sequence.
|
TransformedArray<E> |
Indexes are mapped.
|
TreeList |
A compact representation of a tree, that is a nested list structure.
|
TreePosition |
A position that can also go down and up in a tree.
|
U16Vector |
Simple adjustable-length vector of unsigned 16-bit integers (shorts).
|
U32Vector |
Simple adjustable-length vector of unsigned 32-bit integers (ints).
|
U64Vector |
Simple adjustable-length vector of unsigned 64-bit integers (longs).
|
U8Vector |
Simple adjustable-length vector of unsigned 8-bit integers (bytes).
|
UnescapedData |
Used for text that is supposed to be written out verbatim.
|
VoidConsumer |
A Consumer that does nothing.
|
Contains utility classes and interfaces for sequences (lists), arrays, and trees.
Features:
gnu.lists
; instead you implement suitable methods
in the sequence class itself. This approach may seem strange, but it has
important performance advantages, especially in applications
where you want to go up and down nested sequences (i.e. in trees, such
as XML elements), or remember sets of positions (such as XML node-sets).TreeList
class can compactly represent
a nested heterogenous tree. One intended usage is as a very efficient
representation for XML data.Consumer
interface is an abstraction of "data sinks" -
objects that can receive data and do something with them. It is
similar to SAX's DocumentHandler but at a more abstract (and sometimes
more efficient) level.
A position points between two elements in its "containing" sequence, or it points before the first element (if any), or after the last. Sometimes, we loosely say that the position refers to or points to an element of the sequence; in that case we mean the position is immediately before the element.
An iterator is an object that has a current position,
but that can be moved so it gets a new position.
What needs to be emphasized is that all
Sequence
implementations.
use the same SeqPosition
class to implement positions and iteration.
In other "collections frameworks" each sequence class has its corresponding
iterator class, but in this framework you instead add the methods
that handle iteration to the sequence class itself, extending
the AbstractSequence
class.
A consequence of this is that if you have an unused
SeqPosition
object, you can
use it to iterate over any Sequence
you choose.
This potentially saves memory allocation, which gains you the most
when iterating over a nested sequence structure, like a tree of XML data.
We do this by requiring that any position be representable using a pair
of an int
and an Object
.
In other words, the state of an iterator is
restricted to be an (int
, Object
) pair.
Of course that is all you could need, since if you need more
state you can always create a helper object,
and store the extra state there. The assumption we make though is that for
most Sequence
s, an (int
, Object
)
would be enough, without creating any new objects (beyond,
sometimes, the SeqPosition
itself).
The int
part of a position-state is normally
named ipos
, and the Object
part of a
position-state is named xpos
.
Neither of these values have any meaning in themselves, they only have
meaning as interpreted by the specific Sequence
.
For example, ipos
might be the offset of the position in
the sequence, but it might also some "magic cookie".
If a Sequence
supports insertions and deletions,
and you want positions to be automatically adjusted, then a position
has to be a magic cookie rather than an offset.
(A typical implementation is for such a position to be an index
into a table of positions managed by the Sequence
itself.)
So a complete position is actually a (int
, Object
,
AbstractSequence
) triple, where the AbstractSequence
component is the object that interprets the (int
,
Object
) pair.
Normally, the AbstractSequence
part of a position triple is the
Sequence
"containing" the position, but it can be any
AbstractSequence
that implements the various methods that provide the semantics
for the (int
, Object
) pair.
The PositionContainer
interface is a more abstract version of SeqPosition
, in
that it can represents more than one position.
For example, instead of an array of separately allocated
SeqPosition
objects, you might use some data structure
that implements PositionContainer
,
which is abstractly a list of (int
, Object
,
Sequence
)-triples.
You typically use an iteration framework it process the elements
of a sequence in such a way that the data consumer is active
and in control, while the sequence itself (the data producer) is a
passive object. The Consumer
works the other way round: The data producer is active and in control,
while the data consumer is passive, consuming elements when requested
by the data producer. The Consumer
is a more abstract
variant of SAX's
ContentHandler
interface for processing "events";
however Consumer
is intended for general value sequences,
and is less tied to XML.
int iter = 0; // standard start position for (;;) { iter = list.nextPos(iter); if (iter == 0) break; Object value = list.getPosPrevious(iter); ... use value ...; }
The position encoding for sequences implemented using an array is simple.
Assume the next index (as returned by nextIndex
) is i.
If the position is a before/backwards position, then position value
is (
i<<1)
.
If the position is an after/forwards position, then position value
is ((
i<<1))|1
.
But what each each item in the buffer has variable size?
One example is a UTF-8 string. Another example is a buffer of text
where each "item" is a line.
Then we have the choice: Should the position value encode the
index (making nextIndex
and get
cheap), or
should it encode the offset into the buffer (making sequential access cheap)?
Since sequential access is preferred, we do the latter.
Thus for a before/backwards position, if the buffer offset for
item i is pi, then the position value
is (
pi << 1)
.
However, there is a complication when moving forwards using
a ListIterator
since using set
or remove
affects the previous item. It may be much more expensive to
calculate the buffer position of the previous item than the next item.
(Given pi it may be cheap to
calculate pi+1 but expensive to
calculate pi-1.)
Therefore, we suggest using
(((
pi-1+1)<<1))|1
,
where p-1 is defined as -1.
The addition of 1 allows us to handle the case of i=0 without
conflicting with the use of -1 to mean end-of-sequence.
Plus it makes the previous case of a simple array a special case.