Commit d8743861 authored by Per Cederqvist

Initial revision

parent a433d01a
Richard Stallman -- original version and continuing revisions of
regex.c and regex.h, and original version of the documentation.
Karl Berry and Kathryn Hargreaves -- extensive modifications to above,
and all test files.
Jim Blandy -- original version of re_set_registers, revisions to regex.c.
Joe Arceneaux, David MacKenzie, Mike Haertel, Charles Hannum, and
probably others -- revisions to regex.c.
Thu Sep 17 19:47:16 1992 Karl Berry (
* Version 0.11.
Wed Sep 16 08:17:10 1992 Karl Berry (karl@hayley)
* regex.c (INIT_FAIL_STACK): rewrite as statements instead of a
complicated comma expr, to avoid compiler warnings (and also
(re_compile_fastmap, re_match_2): change callers.
* regex.c (POP_FAILURE_POINT): cast pop of regstart and regend
to avoid compiler warnings.
* regex.h (RE_NEWLINE_ORDINARY): remove this syntax bit, and
remove uses.
* regex.c (at_{beg,end}line_loc_p): go the last mile: remove
the RE_NEWLINE_ORDINARY case which made the ^ in \n^ be an anchor.
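The INIT_FAIL_STACK change above (statements instead of a comma expression) can be illustrated with a minimal sketch. The names `fail_stack_sketch`, `INIT_ALLOC`, `INIT_FAIL_STACK_SKETCH` and `use_fail_stack`, the use of malloc, and the -2 error value are assumptions made for illustration; they are not taken from regex.c.

```c
#include <stdlib.h>

/* Hypothetical failure stack, loosely modelled on the entry above.  */
typedef struct
{
  const char **stack;
  unsigned size;
  unsigned avail;
} fail_stack_sketch;

#define INIT_ALLOC 5

/* Written as statements inside do { ... } while (0), so the macro can
   contain an `if' and a `return' without a long comma expression.
   It must be used in a function that returns int.  */
#define INIT_FAIL_STACK_SKETCH(fs)                                      \
  do {                                                                  \
    (fs).stack = (const char **) malloc (INIT_ALLOC                     \
                                         * sizeof (const char *));     \
    if ((fs).stack == NULL)                                             \
      return -2;  /* assumed error value */                             \
    (fs).size = INIT_ALLOC;                                             \
    (fs).avail = 0;                                                     \
  } while (0)

/* A caller: the macro is used as a statement, not as an expression.  */
static int
use_fail_stack (void)
{
  fail_stack_sketch fs;
  int size;

  INIT_FAIL_STACK_SKETCH (fs);
  size = (int) fs.size;
  free (fs.stack);
  return size;
}
```

Because the macro is now a statement, callers can no longer embed it in a larger expression, which is presumably why the callers had to change.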
Tue Sep 15 09:55:29 1992 Karl Berry (karl@hayley)
* regex.c (at_begline_loc_p): new fn.
(at_endline_loc_p): simplify at_endline_op_p.
(regex_compile): in ^/$ cases, call the above.
* regex.c (POP_FAILURE_POINT): rewrite the fn as a macro again,
as lord's profiling indicates the function is 20% of the time.
(re_match_2): callers changed.
* (AC_MEMORY_H): remove, since we never use memcpy et al.
Mon Sep 14 17:49:27 1992 Karl Berry (karl@hayley)
* (makeargs): include MFLAGS.
Sun Sep 13 07:41:45 1992 Karl Berry (karl@hayley)
* regex.c (regex_compile): in \1..\9 case, make it always
invalid to use \<digit> if there is no preceding <digit>th subexpr.
* regex.h (RE_NO_MISSING_BK_REF): remove this syntax bit.
* regex.c (regex_compile): remove support for invalid empty groups.
* regex.h (RE_NO_EMPTY_GROUPS): remove this syntax bit.
* regex.c (FREE_VARIABLES) [!REGEX_MALLOC]: define as alloca (0),
to reclaim memory.
* regex.h (RE_SYNTAX_POSIX_SED): don't bother with this.
Sat Sep 12 13:37:21 1992 Karl Berry (karl@hayley)
* README: incorporate emacs.diff.
* regex.h (_RE_ARGS) [!__STDC__]: define as empty parens.
* add AC_ALLOCA.
* Put test files in subdir test, documentation in subdir doc.
Adjust and accordingly.
Thu Sep 10 10:29:11 1992 Karl Berry (karl@hayley)
* regex.h (RE_SYNTAX_{POSIX_,}SED): new definitions.
Wed Sep 9 06:27:09 1992 Karl Berry (karl@hayley)
* Version 0.10.
Tue Sep 8 07:32:30 1992 Karl Berry (karl@hayley)
* xregex.texinfo: put the day of month into the date.
* (realclean): remove Texinfo-generated files.
(distclean): remove empty sorted index files.
(clean): remove dvi files, etc.
* test for more Unix variants.
* fileregex.c: new file. (fileregex): new target.
* iregex.c (main): move variable decls to smallest scope.
* regex.c (FREE_VARIABLES): free reg_{,info_}dummy.
(re_match_2): check that the allocation for those two succeeded.
* regex.c (FREE_VAR): replace FREE_NONNULL with this.
(FREE_VARIABLES): call it.
(re_match_2) [REGEX_MALLOC]: initialize all our vars to NULL.
* tregress.c (do_match): generalize simple_match.
(SIMPLE_NONMATCH): new macro.
(SIMPLE_MATCH): change from routine.
* (regex.texinfo): make file readonly, so we don't
edit it by mistake.
* many files (re_default_syntax): rename to `re_syntax_options';
call re_set_syntax instead of assigning to the variable where
Mon Sep 7 10:12:16 1992 Karl Berry (karl@hayley)
* syntax.skel: don't use prototypes.
* {configure,Makefile}.in: new files.
* regex.c: include <string.h> `#if USG || STDC_HEADERS'; remove
obsolete test for `POSIX', and test for BSTRING.
Include <strings.h> if we are not USG or STDC_HEADERS.
Do not include <unistd.h>. What did we ever need that for?
* regex.h (RE_NO_EMPTY_ALTS): remove this.
(RE_SYNTAX_AWK): remove from here, too.
* regex.c (regex_compile): remove the check.
* xregex.texinfo (Alternation Operator): update.
* other.c (test_others): remove tests for this.
* regex.h (RE_DUP_MAX): undefine if already defined.
* regex.h: (RE_SYNTAX_POSIX*): redo to allow more operators, and
define new syntaxes with the minimal set.
* syntax.skel (main): used sscanf instead of scanf.
* regex.h (RE_SYNTAX_*GREP): new definitions from mike.
* regex.c (regex_compile): initialize the upper bound of
intervals at the beginning of the interval, not the end.
* regex.c (handle_bar): rename to `handle_alt', for consistency.
* regex.c ({store,insert}_{op1,op2}): new routines (except the last).
({STORE,INSERT}_JUMP{,2}): macros to replace the old routines,
which took arguments in different orders, and were generally weird.
* regex.c (PAT_PUSH*): rename to `BUF_PUSH*' -- we're not
appending info to the pattern!
Sun Sep 6 11:26:49 1992 Karl Berry (karl@hayley)
* regex.c (regex_compile): delete the variable
`following_left_brace', since we never use it.
* regex.c (print_compiled_pattern): don't print the fastmap if
it's null.
* regex.c (re_compile_fastmap): handle
`on_failure_keep_string_jump' like `on_failure_jump'.
* regex.c (re_match_2): in `charset{,_not}' case, cast the bit
count to unsigned, not unsigned char, in case we have a full
32-byte bit list.
* tregress.c (simple_match): remove.
(simple_test): rename as `simple_match'.
(simple_compile): print the error string if the compile failed.
* regex.c (DO_RANGE): rewrite as a function, `compile_range', so
we can debug it. Change pattern characters to unsigned char
*'s, and change the range variable to an unsigned.
(regex_compile): change calls.
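The `charset{,_not}' bit-count fix above can be sketched as follows. A full 32-byte bit list holds 32 * 8 = 256 bits, and 256 does not fit in an 8-bit unsigned char. The helper names `charset_member` and `charset_member_narrow` are hypothetical, and the sketch assumes 8-bit bytes; it is not the code from re_match_2.

```c
#define BYTEWIDTH 8

/* Hypothetical helper: nonzero if C is set in a bit list of NBYTES
   bytes.  The bound is kept as unsigned, so 32 * 8 == 256 survives.  */
static int
charset_member (const unsigned char *bitmap, unsigned nbytes,
                unsigned char c)
{
  unsigned nbits = nbytes * BYTEWIDTH;

  return c < nbits
         && (bitmap[c / BYTEWIDTH] & (1 << (c % BYTEWIDTH))) != 0;
}

/* The same test with the bound narrowed to unsigned char.  For a
   32-byte list the bound wraps from 256 to 0, so nothing matches.  */
static int
charset_member_narrow (const unsigned char *bitmap, unsigned nbytes,
                       unsigned char c)
{
  unsigned char nbits = (unsigned char) (nbytes * BYTEWIDTH);

  return c < nbits
         && (bitmap[c / BYTEWIDTH] & (1 << (c % BYTEWIDTH))) != 0;
}
```

With a 32-byte list that has the bit for 'a' set, the first helper reports a match and the narrowed one does not.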
Sat Sep 5 17:40:49 1992 Karl Berry (karl@hayley)
* regex.h (_RE_ARGS): new macro to put in argument lists (if
ANSI) or omit them (if K&R); don't declare routines twice.
* many files (obscure_syntax): rename to `re_default_syntax'.
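The _RE_ARGS idea above (argument lists under ANSI C, empty parentheses under K&R) can be sketched like this. The macro name `SKETCH_ARGS` and the function `sketch_add` are hypothetical stand-ins, chosen to avoid reusing a reserved identifier; the real macro in regex.h may differ in detail.

```c
/* With an ANSI compiler, keep the prototype's argument list;
   otherwise replace it with empty parentheses.  */
#if __STDC__
#define SKETCH_ARGS(args) args
#else
#define SKETCH_ARGS(args) ()
#endif

/* One declaration serves both kinds of compiler.  The doubled
   parentheses pass the whole argument list as a single macro
   argument.  */
extern int sketch_add SKETCH_ARGS ((int a, int b));

int
sketch_add (int a, int b)
{
  return a + b;
}
```

The point of the design is that each routine is declared once, instead of once with a prototype and once without.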
Fri Sep 4 09:06:53 1992 Karl Berry (karl@hayley)
* GNUmakefile (extraclean): new target.
(realclean): delete the info files.
Wed Sep 2 08:14:42 1992 Karl Berry (karl@hayley)
* regex.h: doc fix.
Sun Aug 23 06:53:15 1992 Karl Berry (karl@hayley)
* regex.[ch] (re_comp): no const in the return type (from djm).
Fri Aug 14 07:25:46 1992 Karl Berry (karl@hayley)
* regex.c (DO_RANGE): declare variables as unsigned chars, not
signed chars (from jimb).
Wed Jul 29 18:33:53 1992 Karl Berry (
* Version 0.9.
* GNUmakefile (distclean): do not remove regex.texinfo.
(realclean): remove it here.
* tregress.c (simple_test): initialize buf.buffer.
Sun Jul 26 08:59:38 1992 Karl Berry (karl@hayley)
* regex.c (push_dummy_failure): new opcode and corresponding
case in the various routines. Pushed at the end of
* regex.c (jump_past_next_alt): rename to `jump_past_alt', for
(no_pop_jump): rename to `jump'.
* regex.c (regex_compile) [DEBUG]: terminate printing of pattern
with a newline.
* NEWS: new file.
* tregress.c (simple_{compile,match,test}): routines to simplify all
these little tests.
* tregress.c: test for matching as much as possible.
Fri Jul 10 06:53:32 1992 Karl Berry (karl@hayley)
* Version 0.8.
Wed Jul 8 06:39:31 1992 Karl Berry (karl@hayley)
* regex.c (SIGN_EXTEND_CHAR): #undef any previous definition, as
ours should always work properly.
Mon Jul 6 07:10:50 1992 Karl Berry (karl@hayley)
* iregex.c (main) [DEBUG]: conditionalize the call to
* iregex.c (main): initialize buf.buffer to NULL.
* tregress.c (test_regress): likewise.
* regex.c (alloca) [sparc]: #if on HAVE_ALLOCA_H instead.
* tregress.c (test_regress): didn't have jla's test quite right.
Sat Jul 4 09:02:12 1992 Karl Berry (karl@hayley)
* regex.c (re_match_2): only REGEX_ALLOCATE all the register
vectors if the pattern actually has registers.
(match_end): new variable to avoid having to use best_regend[0].
* regex.c (IS_IN_FIRST_STRING): rename to FIRST_STRING_P.
* regex.c: doc fixes.
* tregress.c (test_regress): new fastmap test forwarded by rms.
* tregress.c (test_regress): initialize the fastmap field.
* tregress.c (test_regress): new test from jla that aborted
in re_search_2.
Fri Jul 3 09:10:05 1992 Karl Berry (karl@hayley)
* tregress.c (test_regress): add tests for translating charsets,
from kaoru.
* GNUmakefile (common): add alloca.o.
* alloca.c: new file, copied from bison.
* other.c (test_others): remove var `buf', since it's no longer used.
* Below changes from ro@TechFak.Uni-Bielefeld.DE.
* tregress.c (test_regress): initialize buf.allocated.
* regex.c (re_compile_fastmap): initialize `succeed_n_p'.
* GNUmakefile (regex): depend on $(common).
Wed Jul 1 07:12:46 1992 Karl Berry (karl@hayley)
* Version 0.7.
* regex.c: doc fixes.
Mon Jun 29 08:09:47 1992 Karl Berry (karl@fosse)
* regex.c (pop_failure_point): change string vars to
`const char *' from `unsigned char *'.
* regex.c: consolidate debugging stuff.
(print_partial_compiled_pattern): avoid enum clash.
Mon Jun 29 07:50:27 1992 Karl Berry (karl@hayley)
* xmalloc.c: new file.
* GNUmakefile (common): add it.
* iregex.c (print_regs): new routine (from jimb).
(main): call it.
Sat Jun 27 10:50:59 1992 Jim Blandy (
* xregex.c (re_match_2): When we have accepted a match and
restored d from best_regend[0], we need to set dend
appropriately as well.
Sun Jun 28 08:48:41 1992 Karl Berry (karl@hayley)
* tregress.c: rename from regress.c.
* regex.c (print_compiled_pattern): improve charset case to ease
Also, don't distinguish between Emacs and non-Emacs
{not,}wordchar opcodes.
* regex.c (print_fastmap): move here.
* test.c: from here.
* regex.c (print_{{partial,}compiled_pattern,double_string}):
rename from ..._printer. Change calls here and in test.c.
* regex.c: create from xregex.c and regexinc.c for once and for
all, and change the debug fns to be extern, instead of static.
* GNUmakefile: remove traces of xregex.c.
* test.c: put in externs, instead of including regexinc.c.
* xregex.c: move interactive main program and scanstring to iregex.c.
* iregex.c: new file.
* upcase.c, printchar.c: new files.
* various doc fixes and other cosmetic changes throughout.
* regexinc.c (compiled_pattern_printer): change variable name,
for consistency.
(partial_compiled_pattern_printer): print other info about the
compiled pattern, besides just the opcodes.
* xregex.c (regex_compile) [DEBUG]: print the compiled pattern
when we're done.
* xregex.c (re_compile_fastmap): in the duplicate case, set
`can_be_null' and return.
Also, set `bufp->can_be_null' according to a new variable,
Also, rewrite main while loop to not test `p != NULL', since
we never set it that way.
Also, eliminate special `can_be_null' value for the endline case.
(re_search_2): don't test for the special value.
* regex.h (struct re_pattern_buffer): remove the definition.
Sat Jun 27 15:00:40 1992 Karl Berry (karl@hayley)
* xregex.c (re_compile_fastmap): remove the `RE_' from
Also, assert the fastmap in the pattern buffer is non-null.
Also, reset `succeed_n_p' after we've
paid attention to it, instead of every time through the loop.
Also, in the `anychar' case, only clear fastmap['\n'] if the
syntax says to, and don't return prematurely.
Also, rearrange cases in some semblance of a rational order.
* regex.h (REG_RE_MATCH_NULL_AT_END): remove the `RE_' from the name.
* other.c: take bug reports from here.
* regress.c: new file for them.
* GNUmakefile (test): add it.
* main.c (main): new possible test.
* test.h (test_type): new value in enum.
Thu Jun 25 17:37:43 1992 Karl Berry (karl@hayley)
* xregex.c (scanstring) [test]: new function from jimb to allow some
(main) [test]: call it (on the string, not the pattern).
* xregex.c (main): make return type `int'.
Wed Jun 24 10:43:03 1992 Karl Berry (karl@hayley)
* xregex.c (pattern_offset_t): change to `int', for the benefit
of patterns which compile to more than 2^15 bytes.
* xregex.c (GET_BUFFER_SPACE): remove spurious braces.
* xregex.texinfo (Using Registers): put in a stub to ``document''
the new function.
* regex.h (re_set_registers) [!__STDC__]: declare.
* xregex.c (re_set_registers): declare K&R style (also move to a
different place in the file).
Mon Jun 8 18:03:28 1992 Jim Blandy (
* regex.h (RE_NREGS): Doc fix.
* xregex.c (re_set_registers): New function.
* regex.h (re_set_registers): Declaration for new function.
Fri Jun 5 06:55:18 1992 Karl Berry (karl@hayley)
* main.c (main): `return 0' instead of `exit (0)'. (From Paul Eggert)
* regexinc.c (SIGN_EXTEND_CHAR): cast to unsigned char.
(extract_number, EXTRACT_NUMBER): don't bother to cast here.
Tue Jun 2 07:37:53 1992 Karl Berry (karl@hayley)
* Version 0.6.
* Change copyrights to `1985, 89, ...'.
* regex.h (REG_RE_MATCH_NULL_AT_END): new macro.
* xregex.c (re_compile_fastmap): initialize `can_be_null' to
`p==pend', instead of in the test at the top of the loop (as
it was, it was always being set).
Also, set `can_be_null'=1 if we would jump to the end of the
pattern in the `on_failure_jump' cases.
(re_search_2): check if `can_be_null' is 1, not nonzero. This
was the original test in rms' regex; why did we change this?
* xregex.c (re_compile_fastmap): rename `is_a_succeed_n' to
Sat May 30 08:09:08 1992 Karl Berry (karl@hayley)
* xregex.c (re_compile_pattern): declare `regnum' as `unsigned',
not `regnum_t', for the benefit of those patterns with more
than 255 groups.
* xregex.c: rename `failure_stack' to `fail_stack', for brevity;
likewise for `match_nothing' to `match_null'.
* regexinc.c (REGEX_REALLOCATE): take both the new and old
sizes, and copy only the old bytes.
* xregex.c (DOUBLE_FAILURE_STACK): pass both old and new.
* This change from Thorsten Ohl.
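The REGEX_REALLOCATE change above (take both sizes, copy only the old bytes) can be sketched as below. The function name `grow_copy` is hypothetical and the sketch uses malloc rather than alloca so that it is portable. Copying the new size out of the old block would read past its end; copying the old size does not.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an allocate-and-copy "realloc".
   Allocates NEW_SIZE bytes and copies only the bytes that the old
   block really holds.  The old block is left untouched; the caller
   frees the result.  */
static void *
grow_copy (const void *old, size_t old_size, size_t new_size)
{
  void *fresh = malloc (new_size);

  if (fresh == NULL)
    return NULL;

  /* Never copy more than the old block contains.  */
  memcpy (fresh, old, old_size < new_size ? old_size : new_size);
  return fresh;
}
```

A caller that doubles a stack would pass the current size as `old_size` and twice that as `new_size`.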
Fri May 29 11:45:22 1992 Karl Berry (karl@hayley)
* regexinc.c (SIGN_EXTEND_CHAR): define as `(signed char) c'
instead of relying on __CHAR_UNSIGNED__, to work with
compilers other than GCC. From Per Bothner.
* main.c (main): change return type to `int'.
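The SIGN_EXTEND_CHAR definition above can be sketched as follows. The helper `extract_number_sketch` and its two-byte, low-byte-first layout are assumptions for illustration. The sketch also assumes the usual two's-complement behaviour when an out-of-range value is converted to signed char, and it multiplies by 256 rather than shifting, since left-shifting a negative value is undefined in ISO C.

```c
/* Drop any earlier definition, then define the macro with a plain
   cast, which does not depend on a GCC-specific predefined macro.  */
#undef SIGN_EXTEND_CHAR
#define SIGN_EXTEND_CHAR(c) ((signed char) (c))

/* Hypothetical helper: read a signed 16-bit number stored as two
   bytes, low byte first.  The high byte carries the sign.  */
static int
extract_number_sketch (const unsigned char *source)
{
  int number = source[0] & 0377;

  number += SIGN_EXTEND_CHAR (source[1]) * 256;
  return number;
}
```

On common platforms the bytes 0xFE 0xFF decode to -2, because the high byte 0xFF becomes -1 after the cast.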
Mon May 18 06:37:08 1992 Karl Berry (karl@hayley)
* regex.h (RE_SYNTAX_AWK): typo in RE_RE_UNMATCHED...
Fri May 15 10:44:46 1992 Karl Berry (karl@hayley)
* Version 0.5.
Sun May 3 13:54:00 1992 Karl Berry (karl@hayley)
* regex.h (struct re_pattern_buffer): now it's just `regs_allocated'.
* xregex.c (regexec, re_compile_pattern): set the field appropriately.
(re_match_2): and use it. bufp can't be const any more.
Fri May 1 15:43:09 1992 Karl Berry (karl@hayley)
* regexinc.c: unconditionally include <sys/types.h>, first.
* regex.h (struct re_pattern_buffer): rename
`caller_allocated_regs' to `regs_allocated_p'.
* xregex.c (re_compile_pattern): same change here.
(regexec): and here.
(re_match_2): reallocate registers if necessary.
Fri Apr 10 07:46:50 1992 Karl Berry (karl@hayley)
* regex.h (RE_SYNTAX{_POSIX,}_AWK): new definitions from Arnold.
Sun Mar 15 07:34:30 1992 Karl Berry (karl at hayley)
* GNUmakefile (dist): versionize regex.{c,h,texinfo}.
Tue Mar 10 07:05:38 1992 Karl Berry (karl at hayley)
* Version 0.4.
* xregex.c (PUSH_FAILURE_POINT): always increment the failure id.
(DEBUG_STATEMENT) [DEBUG]: execute the statement even if `debug'==0.
* xregex.c (pop_failure_point): if the saved string location is
null, keep the current value.
(re_match_2): at fail, test for a dummy failure point by
checking the restored pattern value, not string value.
(re_match_2): new case, `on_failure_keep_string_jump'.
(regex_compile): output this opcode in the .*\n case.
* regexinc.c (re_opcode_t): define the opcode.
(partial_compiled_pattern_printer): add the new case.
Mon Mar 9 09:09:27 1992 Karl Berry (karl at hayley)
* xregex.c (regex_compile): optimize .*\n to output an
unconditional jump to the ., instead of pushing failure points
each time through the loop.
* xregex.c (DOUBLE_FAILURE_STACK): compute the maximum size
ourselves (and correctly); change callers.
Sun Mar 8 17:07:46 1992 Karl Berry (karl at hayley)
* xregex.c (failure_stack_elt_t): change to `const char *', to
avoid warnings.
* regex.h (re_set_syntax): declare this.
* xregex.c (pop_failure_point) [DEBUG]: conditionally pass the
original strings and sizes; change callers.
Thu Mar 5 16:35:35 1992 Karl Berry (karl at
* xregex.c (regnum_t): new type for register/group numbers.
(compile_stack_elt_t, regex_compile): use it.
* xregex.c (regexec): declare len as `int' to match re_search.
* xregex.c (re_match_2): don't declare p1 twice.
* xregex.c: change `while (1)' to `for (;;)' to avoid silly
compiler warnings.
* regex.h [__STDC__]: use #if, not #ifdef.
* regexinc.c (REGEX_REALLOCATE): cast the result of alloca to
(char *), to avoid warnings.
* xregex.c (regerror): declare variable as const.
* xregex.c (re_compile_pattern, re_comp): define as returning a const
char *.
* regex.h (re_compile_pattern, re_comp): likewise.
Thu Mar 5 15:57:56 1992 Karl Berry (karl@hal)
* xregex.c (regcomp): declare `syntax' as unsigned.
* xregex.c (re_match_2): try to avoid compiler warnings about
unsigned comparisons.
* GNUmakefile (test-xlc): new target.
* regex.h (reg_errcode_t): remove trailing comma from definition.
* regexinc.c (re_opcode_t): likewise.
Thu Mar 5 06:56:07 1992 Karl Berry (karl at hayley)
* GNUmakefile (dist): add version numbers automatically.
(versionfiles): new variable.
(regex.{c,texinfo}): don't add version numbers here.
* regex.h: put in placeholder instead of the version number.
Fri Feb 28 07:11:33 1992 Karl Berry (karl at hayley)
* xregex.c (re_error_msg): declare const, since it is.
Sun Feb 23 05:41:57 1992 Karl Berry (karl at fosse)
* xregex.c (PAT_PUSH{,_2,_3}, ...): cast args to avoid warnings.
(regex_compile, regexec): return REG_NOERROR, instead
of 0, on success.
(boolean): define as char, and #define false and true.
* regexinc.c (STREQ): cast the result.
Sun Feb 23 07:45:38 1992 Karl Berry (karl at hayley)
* GNUmakefile (test-cc, test-hc, test-pcc): new targets.
* (extract_number, extract_number_and_incr) [DEBUG]:
only define if we are debugging.
* xregex.c [_AIX]: do #pragma alloca first if necessary.
* regexinc.c [_AIX]: remove the #pragma from here.
* regex.h (reg_syntax_t): declare as unsigned, and redo the enum
as #define's again. Some compilers do stupid things with enums.
Thu Feb 20 07:19:47 1992 Karl Berry (karl at hayley)
* Version 0.3.
* xregex.c, regex.h (newline_anchor_match_p): rename to
`newline_anchor'; dumb idea to change the name.
Tue Feb 18 07:09:02 1992 Karl Berry (karl at hayley)
* regexinc.c: go back to original, i.e., don't include
<string.h> or define strchr.
* xregex.c (regexec): don't bother with adding characters after
newlines to the fastmap; instead, just don't use a fastmap.
* xregex.c (regcomp): set the buffer and fastmap fields to zero.
* xregex.texinfo (GNU r.e. compiling): have to initialize more
than two fields.
* regex.h (struct re_pattern_buffer): rename `newline_anchor' to
`newline_anchor_match_p', as we're back to two cases.
* xregex.c (regcomp, re_compile_pattern, re_comp): change
(re_match_2): at begline and endline, POSIX is not a special
case anymore; just check newline_anchor_match_p.
Thu Feb 13 16:29:33 1992 Karl Berry (karl at hayley)
* xregex.c (*empty_string*): rename to *null_string*, for brevity.
Wed Feb 12 06:36:22 1992 Karl Berry (karl at hayley)
* xregex.c (re_compile_fastmap): at endline, don't set fastmap['\n'].
(re_match_2): rewrite the begline/endline cases to take account
of the new field newline_anchor.
Tue Feb 11 14:34:55 1992 Karl Berry (karl at hayley)
* regexinc.c [!USG etc.]: include <strings.h> and define strchr
as index.
* xregex.c (re_search_2): when searching backwards, declare `c'
as a char and use casts when using it as an array subscript.
* xregex.c (regcomp): if REG_NEWLINE, set
RE_HAT_LISTS_NOT_NEWLINE. Set the `newline_anchor' field
(regex_compile): compile [^...] as matching a \n according to
the syntax bit.
(regexec): if doing REG_NEWLINE stuff, compile a fastmap and add
characters after any \n's to the newline.
* regex.h (RE_HAT_LISTS_NOT_NEWLINE): new syntax bit.
(struct re_pattern_buffer): rename `posix_newline' to
`newline_anchor', define constants for its values.
Mon Feb 10 07:22:50 1992 Karl Berry (karl at hayley)
* xregex.c (re_compile_fastmap): combine the code at the top and
bottom of the loop, as it's essentially identical.
Sun Feb 9 10:02:19 1992 Karl Berry (karl at hayley)
* xregex.texinfo (POSIX Translate Tables): remove this, as it
doesn't match the spec.
* xregex.c (re_compile_fastmap): if we finish off a path, go
back to the top (to set can_be_null) instead of returning