List of Examples
REGEXP:MATCH
REGEXP:REGEXP-QUOTE
The “REGEXP” module implements POSIX regular expression matching by calling the standard C system facilities. The syntax of these regular expressions is described in many places, such as your local <regex.h> manual and the Emacs info pages.
This module is present in the base linking set by default.
When this module is present, *FEATURES* contains the symbol :REGEXP.
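Because the feature symbol lives in *FEATURES*, code can test for the module before using it. The following is a minimal sketch, not part of the module itself: REGEXP-AVAILABLE-P is a hypothetical helper name introduced only for illustration.

```lisp
;; Hypothetical helper, not part of the REGEXP module:
;; report at run time whether the :REGEXP feature is present.
(defun regexp-available-p ()
  (and (member :regexp *features*) t))

;; Read-time guard: the form after #+regexp is read only when
;; :REGEXP is in *FEATURES*; otherwise the #-regexp form is read.
#+regexp (format t "REGEXP module is available.~%")
#-regexp (format t "REGEXP module is not linked in.~%")
```

The read-time form is the usual choice when a file must still load in an image without the module; the run-time predicate suits code that only needs to choose a strategy.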
Regular Expression API
(REGEXP:MATCH pattern string &KEY (:START 0) :END :EXTENDED :IGNORE-CASE :NEWLINE :NOSUB :NOTBOL :NOTEOL)
This macro returns as its first value a REGEXP:MATCH structure containing the indices of the start and end of the first match for the regular expression pattern in string, or no values if there is no match. Additionally, a REGEXP:MATCH structure is returned for every matched "\(...\)" group in pattern, in the order in which the open parentheses appear in pattern. If start is non-NIL, the search starts at that index in string. If end is non-NIL, only (SUBSEQ string start end) is considered.
Example 33.1. REGEXP:MATCH
(REGEXP:MATCH "quick" "The quick brown fox jumped quickly.")
⇒ #S(REGEXP:MATCH :START 4 :END 9)
(REGEXP:MATCH "quick" "The quick brown fox jumped quickly." :start 8)
⇒ #S(REGEXP:MATCH :START 27 :END 32)
(REGEXP:MATCH "quick" "The quick brown fox jumped quickly." :start 8 :end 30)
⇒ NIL
(REGEXP:MATCH "\\([a-z]*\\)[0-9]*\\(bar\\)" "foo12bar")
⇒ #S(REGEXP:MATCH :START 0 :END 8) ;
⇒ #S(REGEXP:MATCH :START 0 :END 3) ;
⇒ #S(REGEXP:MATCH :START 5 :END 8)
(REGEXP:MATCH-START match)
(REGEXP:MATCH-END match)
Return the start and end indices of match; SETF-able.
(REGEXP:MATCH-STRING string match)
Return the substring of string corresponding to the given pair of start and end indices of match. The result is shared with string. If you want a fresh STRING, use COPY-SEQ or COERCE to SIMPLE-STRING.
(REGEXP:REGEXP-QUOTE string &OPTIONAL extended)
This function returns a regular expression STRING that matches exactly string and nothing else. This allows you to request an exact string match when calling a function that wants a regular expression. One use of REGEXP:REGEXP-QUOTE is to combine an exact string match with context described as a regular expression. When extended is non-NIL, also quote #\+ and #\?.
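The accessors and REGEXP:REGEXP-QUOTE are typically used together with REGEXP:MATCH. The sketch below assumes CLISP with the REGEXP module present; MATCHED-STRINGS and EXACT-PREFIX-PATTERN are hypothetical helper names chosen for illustration, not functions of the module.

```lisp
;; Sketch only; assumes CLISP with the REGEXP module present.
;; MATCHED-STRINGS and EXACT-PREFIX-PATTERN are hypothetical helpers.

(defun matched-strings (pattern string)
  "Return fresh strings for the whole match and each group, or NIL if no match."
  (let ((matches (multiple-value-list (regexp:match pattern string))))
    (when (first matches)
      (mapcar (lambda (m)
                ;; Guard against an entry that is not a match structure
                ;; (for instance a group that did not take part).
                ;; COPY-SEQ detaches the result from STRING.
                (and m (copy-seq (regexp:match-string string m))))
              matches))))

(defun exact-prefix-pattern (literal)
  "Build a pattern: LITERAL taken verbatim at the start, then optional digits."
  (concatenate 'string "^" (regexp:regexp-quote literal) "[0-9]*$"))
```

With the indices shown in Example 33.1, (matched-strings "\\([a-z]*\\)[0-9]*\\(bar\\)" "foo12bar") yields ("foo12bar" "foo" "bar"). The pattern built by (exact-prefix-pattern "a.b") matches "a.b42" but not "axb42", because the dot is quoted rather than treated as a wildcard.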
(REGEXP:REGEXP-COMPILE string &KEY :EXTENDED :IGNORE-CASE :NEWLINE :NOSUB)
Compile the regular expression string into an object suitable for REGEXP:REGEXP-EXEC.
(REGEXP:REGEXP-EXEC pattern string &KEY :RETURN-TYPE :BOOLEAN (:START 0) :END :NOTBOL :NOTEOL)
Execute the pattern, which must be a compiled regular expression returned by REGEXP:REGEXP-COMPILE, against the appropriate portion of the string.
Returns REGEXP:MATCH structures as multiple values (one for each subexpression which successfully matched and one for the whole pattern), unless :BOOLEAN is non-NIL, in which case T is returned as an indicator of success and nothing is allocated.
If :RETURN-TYPE is LIST (or VECTOR), the REGEXP:MATCH structures are returned as a LIST (or a VECTOR) instead. If there are more than MULTIPLE-VALUES-LIMIT REGEXP:MATCH structures to return, a LIST is returned instead of multiple values.
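When the same pattern is applied to many strings, compiling it once and reusing the compiled object avoids repeated compilation. A minimal sketch, assuming CLISP with the REGEXP module present; FIRST-NUMBERS is a hypothetical helper name, and only the default return convention is used here.

```lisp
;; Sketch only; assumes CLISP with the REGEXP module present.
;; FIRST-NUMBERS is a hypothetical helper name.
(defun first-numbers (lines)
  "Return the first run of digits found in each line that has one."
  ;; Compile once, outside the loop, and reuse the compiled object.
  (let ((re (regexp:regexp-compile "[0-9][0-9]*")))
    (loop for line in lines
          ;; Without a match there is no match structure, so M is NIL.
          for m = (regexp:regexp-exec re line)
          when m
            collect (regexp:match-string line m))))
```

For example, (first-numbers '("abc 123" "none" "x9y")) collects "123" and "9". The collected strings share structure with the input lines, so copy them with COPY-SEQ if the lines will be modified later.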
(REGEXP:REGEXP-SPLIT pattern string &KEY (:START 0) :END :EXTENDED :IGNORE-CASE :NEWLINE :NOSUB :NOTBOL :NOTEOL)
Return a list of substrings of string (all sharing the structure with string) separated by pattern (a regular expression STRING or a return value of REGEXP:REGEXP-COMPILE).
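Splitting a colon-separated record is a typical use. The sketch below assumes CLISP with the REGEXP module present; PASSWD-SHELL is a hypothetical helper name, and the sample line is made up for illustration.

```lisp
;; Sketch only; assumes CLISP with the REGEXP module present.
;; PASSWD-SHELL is a hypothetical helper name.
(defun passwd-shell (line)
  "Return a fresh copy of the seventh colon-separated field of LINE, or NIL."
  (let ((field (seventh (regexp:regexp-split ":" line))))
    ;; The pieces share structure with LINE, so copy before keeping one.
    (and field (copy-seq field))))

;; Example call with a made-up passwd-style line:
(passwd-shell "root:x:0:0:root:/root:/bin/bash")
```

The call above returns "/bin/bash"; a line with fewer than seven fields gives NIL because SEVENTH of a shorter list is NIL.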
(REGEXP:WITH-LOOP-SPLIT (variable stream pattern &KEY (:START 0) :END :EXTENDED :IGNORE-CASE :NEWLINE :NOSUB :NOTBOL :NOTEOL) &BODY body)
Read lines from stream, split them with REGEXP:REGEXP-SPLIT on pattern, and bind the resulting list to variable.
:EXTENDED :IGNORE-CASE :NEWLINE :NOSUB
These options apply when a pattern is compiled. See <regex.h> for their meaning.
:NOTBOL :NOTEOL
These options apply when a pattern is executed. See <regex.h> for their meaning.
REGEXP:REGEXP-MATCHER
A valid value for CUSTOM:*APROPOS-MATCHER*. This will work only when your LOCALE is CHARSET:UTF-8, because CLISP uses CHARSET:UTF-8 internally and POSIX constrains <regex.h> to use the current LOCALE.
Example 33.3. Count unix shell users
The following code computes the number of people who use a particular shell:
#!/usr/local/bin/clisp -C
(DEFPACKAGE "REGEXP-TEST" (:use "LISP" "REGEXP"))
(IN-PACKAGE "REGEXP-TEST")
(let ((h (make-hash-table :test #'equal :size 10)) (n 0))
  ;; count the users of each shell (the seventh field of /etc/passwd)
  (with-open-file (f "/etc/passwd")
    (with-loop-split (s f ":")
      (incf (gethash (seventh s) h 0))))
  ;; print one numbered line per shell
  (with-hash-table-iterator (i h)
    (loop
      (multiple-value-bind (r k v) (i)
        (unless r (return))
        (format t "[~d] ~s~30t== ~5:d~%" (incf n) k v)))))
For comparison, almost the same computation (apart from the nicer output formatting) can be done with the following Perl script, which takes the file name and the separator as its two arguments:
#!/usr/local/bin/perl -w
use diagnostics;
use strict;
my $IN = $ARGV[0];
open(INF,"< $IN") || die "$0: cannot read file [$IN]: $!\n";
my %hash;
while (<INF>) {
  chop;
  my @all = split($ARGV[1]);
  my $shell = ($#all >= 6 ? $all[6] : "");
  if ($hash{$shell}) { $hash{$shell} ++; }
  else { $hash{$shell} = 1; }
}
my $ii = 0;
for my $kk (keys(%hash)) {
  print "[",++$ii,"] \"",$kk,"\" -- ",$hash{$kk},"\n";
}
close(INF);
These notes document CLISP version 2.49 | Last modified: 2010-07-07 |