
This section is not finished documentation, but rather a collection of pointers towards some of the interesting, non-standard features of Rx.
Rx supports some unusual regexp syntax.
[[:cut N:]] sets pmatch[0].final_tag to N and causes the
matching to stop instantly. If N is 0, the overall match fails,
otherwise it succeeds.
[[:(:]] ... [[:):]] is just like \( ... \) except that in
the first case, no pmatch entries are changed, and the subexpression is
not counted in the numbering of parenthesized subexpressions.
[[:(:]] ... [[:):]] can be used when you do not need to know
where a subexpression matched but are only using parentheses to affect
the parsing of the regexp.
There are two reasons to use [[:(:]] ... [[:):]]:
1. regexec will run faster.
2. Currently, only 9 backreferencable subexpressions are supported:
\1 .. \9. Using [[:(:]] ... [[:):]] is a way to conserve
backreferencable subexpression names in an expression with many
parentheses.
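As a rough illustration of the numbering cost described above, the following sketch uses only the standard POSIX <regex.h> interface, not any Rx-specific call. The helper names count_subexpressions and group_start are hypothetical. The [[:(:]] ... [[:):]] form appears only in a comment, because a regcomp that is not Rx is not expected to accept it.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: number of counted \( ... \) groups in a POSIX
   basic regexp, or -1 if the pattern does not compile. */
int count_subexpressions(const char *pattern)
{
    regex_t re;
    int n;

    assert(pattern != NULL);
    if (regcomp(&re, pattern, 0) != 0)
        return -1;
    n = (int) re.re_nsub;   /* every \( ... \) pair is counted here */
    regfree(&re);
    return n;
}

/* Hypothetical helper: start offset of group `group` (0 = whole match)
   in the first match of `pattern` in `text`, or -1 on any failure. */
long group_start(const char *pattern, const char *text, size_t group)
{
    regex_t re;
    regmatch_t pm[10];      /* slot 0 plus \1 .. \9 */
    long start = -1;

    if (group >= 10 || regcomp(&re, pattern, 0) != 0)
        return -1;
    if (regexec(&re, text, 10, pm, 0) == 0)
        start = (long) pm[group].rm_so;
    regfree(&re);
    return start;
}

/* With the pattern \(ab\)*\(c\), the first pair of parentheses is there
   only for grouping, yet it occupies slot 1 and pushes the interesting
   group to slot 2.  Under Rx the same pattern could presumably be
   written [[:(:]]ab[[:):]]*\(c\), leaving only one counted group. */
```

Calling count_subexpressions on \(ab\)*\(c\) reports two counted groups, and group_start with group 2 locates the c; the grouping-only parentheses are the ones that [[:(:]] ... [[:):]] is meant to keep out of the count.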
regncomp and regnexec are non-standard generalizations of
regcomp and regexec.
Two mysterious parameters can be used to trade off performance against memory use.
At compile-time they are RX_DEFAULT_DFA_CACHE_SIZE and
RX_DEFAULT_NFA_DELAY.
If you want to mess with these (I generally don't advise it), I suggest experimenting with your particular application and memory situation. Frob them by powers of two and measure the results on what you expect to be typical regexp workloads.
You can also set those parameters at run-time (before calling any regexp functions) by tweaking the corresponding variables:
rx_default_cache->bytes_allowed
and
rx_basic_unfaniverse_delay
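The powers-of-two experiment suggested above could be organized along these lines. This is only a sketch: sweep_powers_of_two, demo_apply, demo_workload, struct trial and struct demo_ctx are hypothetical names, the workload uses the standard POSIX <regex.h> calls, and demo_apply merely records the candidate value. In a program linked against Rx, that hook is where one would presumably assign rx_default_cache->bytes_allowed. Since the variables are meant to be set before any regexp function is called, a real experiment may need a fresh process per candidate value.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <time.h>

typedef void (*apply_fn)(size_t value, void *ctx);   /* set the tunable */
typedef void (*workload_fn)(void *ctx);               /* run the regexps */

struct trial { size_t value; double seconds; };
struct demo_ctx { size_t cache_bytes; unsigned long matches; };

/* Try lo, 2*lo, 4*lo, ... up to hi, timing the workload for each value.
   Returns the number of trials written to out (at most max_out). */
size_t sweep_powers_of_two(size_t lo, size_t hi, apply_fn apply,
                           workload_fn run, void *ctx,
                           struct trial *out, size_t max_out)
{
    size_t n = 0;
    size_t v;

    assert(apply != NULL && run != NULL && out != NULL);
    if (lo == 0)
        return 0;                       /* doubling zero never advances */
    for (v = lo; v <= hi && n < max_out; v *= 2) {
        clock_t start;

        apply(v, ctx);
        start = clock();
        run(ctx);
        out[n].value = v;
        out[n].seconds = (double) (clock() - start) / CLOCKS_PER_SEC;
        n++;
        if (v > hi / 2)
            break;                      /* next doubling would exceed hi */
    }
    return n;
}

/* Stand-in for assigning the real tunable: it only records the value. */
void demo_apply(size_t value, void *ctx)
{
    ((struct demo_ctx *) ctx)->cache_bytes = value;
}

/* Stand-in workload: match one basic regexp against a few strings. */
void demo_workload(void *ctx)
{
    struct demo_ctx *d = ctx;
    static const char *const inputs[] = { "abcd", "ad", "abx", "xxabbcdxx" };
    regex_t re;
    int round;
    size_t i;

    if (regcomp(&re, "a[bc]*d", 0) != 0)
        return;
    for (round = 0; round < 1000; round++)
        for (i = 0; i < sizeof inputs / sizeof inputs[0]; i++)
            if (regexec(&re, inputs[i], 0, NULL, 0) == 0)
                d->matches++;
    regfree(&re);
}
```

Sweeping from 1024 to 8192, for example, yields four trials (1024, 2048, 4096, 8192). The timings only mean something when the workload resembles the regexps the application will really run.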
rx_make_solutions, rx_next_solution, and
rx_free_solutions are a lower-level alternative to the POSIX
functions. Using those functions, you can compare a compiled regexp to
a string that is not contiguous in memory or even a string that is not
entirely in memory at any one time.
The code in rxposix.c shows how those functions are used.
If you are only interested in pure regular expressions (no pmatch data,
no backreferences, and no counted subexpressions), you can parse a
regexp using rx_parse, convert it to an NFA using rx_unfa,
and run the DFA using rx_init_system, rx_advance_to_final,
and rx_terminate_system. The DFA Scheme primitives in
rgx.c may provide some guidance.
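To make the idea of running a DFA over input that arrives in pieces more concrete, here is a minimal sketch. It does not use the Rx API at all: the DFA for the pure regexp ab*c is built by hand, and dfa_init, dfa_advance and dfa_accepts are hypothetical names that only loosely mirror an initialize / advance / finish life cycle. The point is that the only thing carried between calls is the current state, which is why the text need not be contiguous or fully in memory.

```c
#include <assert.h>
#include <stddef.h>

/* States of a hand-built DFA for ab*c, matched against the whole input. */
enum { ST_START, ST_AFTER_A, ST_ACCEPT, ST_DEAD };

struct dfa_run { int state; };

/* Begin a new match attempt. */
void dfa_init(struct dfa_run *run)
{
    assert(run != NULL);
    run->state = ST_START;
}

/* One transition of the DFA on a single character. */
static int dfa_step(int state, char ch)
{
    switch (state) {
    case ST_START:
        return ch == 'a' ? ST_AFTER_A : ST_DEAD;
    case ST_AFTER_A:
        if (ch == 'b')
            return ST_AFTER_A;
        return ch == 'c' ? ST_ACCEPT : ST_DEAD;
    default:
        return ST_DEAD;     /* input after the final c, or already dead */
    }
}

/* Feed one fragment of the input; may be called any number of times. */
void dfa_advance(struct dfa_run *run, const char *chunk, size_t len)
{
    size_t i;

    for (i = 0; i < len && run->state != ST_DEAD; i++)
        run->state = dfa_step(run->state, chunk[i]);
}

/* After the last fragment: did the whole input match ab*c? */
int dfa_accepts(const struct dfa_run *run)
{
    return run->state == ST_ACCEPT;
}
```

Feeding the fragments "ab", "bb" and "c" one after another ends in the accepting state, exactly as feeding "abbbc" in one call would.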