Newlines are written according to the stream's EXT:ENCODING
, see the
function STREAM-EXTERNAL-FORMAT
and the description of EXT:ENCODING
s,
in particular, line terminators.
The default behavior is as follows:
When reading from a file, CR/LF is converted to #\Newline
(the usual convention on DOS), and CR not followed by LF is
converted to #\Newline as well (the usual conversion on MacOS, also used
by some programs on Win32).
If you do not want this, i.e., if you really want to distinguish
LF, CR and CR/LF, you have to resort to
binary input (function READ-BYTE
).
Justification. Unicode Newline Guidelines say: “Even if you know which characters represents NLF on your particular platform, on input and in interpretation, treat CR, LF, CRLF, and NEL the same. Only on output do you need to distinguish between them.”
Rationale. In CLISP, #\Newline is identical to #\Linefeed
(which is specifically permitted by the [ANSI CL standard] in
[sec_13-1-7] “Character Names”).
Consider a file containing exactly this string:
(
Suppose we open it with CONCATENATE
'STRING
"foo" (STRING
#\Linefeed)
"bar" (STRING
#\Return) (STRING
#\Linefeed))(
.
What should OPEN
"foo" :EXTERNAL-FORMAT
:DOS
)READ-LINE
return?
Right now, it returns "foo"
(the second READ-LINE
returns "bar"
and reaches end-of-stream
).
If our i/o were “faithful”, READ-LINE
would have
returned the string (
, i.e., a string with an embedded #\Newline
between "foo"
and "bar" (because a single #\Linefeed is not a
#\Newline in the specified CONCATENATE
'STRING
"foo" (STRING
#\Linefeed) "bar"):EXTERNAL-FORMAT
, it will not make READ-LINE
return,
but it is a CLISP #\Newline!) Even though the specification for
READ-LINE
does not explicitly forbids newlines inside the returned
string, such behavior would be quite surprising, to say the least.
Moreover, this line (with an embedded #\Newline) would be written as two
lines (when writing to a STREAM
with :EXTERNAL-FORMAT
of :DOS
), because
the embedded #\Newline would be written as CR+LF.
These notes document CLISP version 2.49 | Last modified: 2010-07-07 |