summaryrefslogtreecommitdiff
path: root/posix/regex_internal.h
AgeCommit message (Collapse)Author
2009-01-08* posix/regcomp.c (re_compile_fastmap_iter): Use __mbrtowc.Ulrich Drepper
* posix/regex_internal.c (build_wcs_buffer, build_wcs_upper_buffer, re_string_skip_chars, re_string_reconstruct): Likewise. * posix/regex_internal.h [!_LIBC] (__mbrtowc): New #define.
2008-12-06* posix/regex_internal.h (build_wcs_upper_buffer):Ulrich Drepper
Return type is reg_error_t.
2007-08-26* posix/regex_internal.h: Prevent some declarations and definitionsUlrich Drepper
to be seen when used in tests.
2005-12-11* posix/regex_internal.h: Include <stdint.h> if available.Ulrich Drepper
2005-12-06 Paolo Bonzini <bonzini@gnu.org> * posix/regex_internal.h (SIZE_MAX): Provide a default definition.
2005-10-15[BZ #1221]Ulrich Drepper
* posix/regex_internal.h: Remove last traces of RE_NO_INTERNAL_PROTOTYPES.
2005-10-13[BZ #1248]Ulrich Drepper
2005-08-26 Paul Eggert <eggert@cs.ucla.edu> [BZ #1248] * posix/regex_internal.h (bitset_not, bitset_merge, bitset_not_merge, bitset_mask, re_string_allocate, re_string_construct, re_string_reconstruct, re_string_destruct, re_string_elem_size_at, re_string_char_size_at, re_string_wchar_at, re_string_peek_byte_case, re_string_fetch_byte_case, re_node_set_alloc, re_node_set_init_1, re_node_set_init_2, re_node_set_init_copy, re_node_set_add_intersect, re_node_set_init_union, re_node_set_merge, re_node_set_insert, re_node_set_insert_last, re_node_set_compare, re_node_set_contains, re_node_set_remove_at, re_dfa_add_node, re_acquire_state, re_acquire_state_context): Remove unnecessary forward decls. (re_string_char_size_at, re_string_wchar_at, re_string_elem_size_at): Put __attribute at function definition, now that the function decl has been removed. * posix/regex_internal.c (re_string_peek_byte_case, re_string_fetch_byte_case, re_node_set_compare, re_node_set_contains): Likewise.
2005-09-28[BZ #1302]Ulrich Drepper
2005-09-06 Paul Eggert <eggert@cs.ucla.edu> Ulrich Drepper <drepper@redhat.com> [BZ #1302] Change bitset word type from unsigned int to unsigned long int, as this has better performance on typical 64-bit hosts. Change bitset type name to bitset_t. * posix/regcomp.c (build_equiv_class, build_charclass): (build_range_exp, build_collating_symbol): Prefer bitset_t to re_bitset_ptr_t in prototypes, when the actual argument is a bitset. This is merely a style issue, but it makes it clearer that an entire array is expected. (re_compile_fastmap_iter, init_dfa, init_word_char, optimize_subexps, lower_subexp): Adjust for new bitset_t definition. (lower_subexp, parse_bracket_exp, built_charclass_op): Likewise. * posix/regex_internal.h (bitset_set, bitset_clear, bitset_contain, bitset_not, bitset_merge, bitset_set_all, bitset_mask): Likewise. * posix/regexec.c (check_dst_limits_calc_pos_1, check_subexp_matching_top, build_trtable, group_nodes_into_DFAstates): Likewise. * posix/regcomp.c (utf8_sb_map): Don't assume initializer == 0xffffffff. * posix/regex_internal.h (BITSET_WORD_BITS): Renamed from UINT_BITS. All uses changed. (BITSET_WORDS): Renamed from BITSET_UINTS. All uses changed. (bitset_word_t): New type, replacing 'unsigned int' for bitset uses. All uses changed. (BITSET_WORD_MAX): New macro. (bitset_set, bitset_clear, bitset_contain, bitset_empty, (bitset_set_all, bitset_copy): Adjust for bitset_t change. (bitset_empty, bitset_copy): Prefer sizeof (bitset_t) to multiplying it out ourselves. (bitset_not_merge): Remove; unused. (bitset_contain): Return bool, not unsigned int with one bit on. All callers changed. * posix/regexec.c (build_trtable): Don't assume bitset_t has no stricter alignment than re_node_set; do this by defining a new internal type struct dests_alloc and using it to allocate memory.
2005-09-23[BZ #281]Ulrich Drepper
* posix/regex.h: Define RE_TRANSLATE_TYPE as unsigned char *. * posix/regcomp.c: Remove unnecessary uses of unsigned RE_TRANSLATE_TYPE. * posix/regex_internal.h: Likewise. * posix/regex_internal.c: Likewise. * posix/regexexec.c: Likewise. Based on a patch by Stepan Kasal <kasal@ucw.cz>.
2005-09-07(update_cur_sifted_state): Likewise.Ulrich Drepper
(re_search_internal): Likewise. (prune_impossible_nodes): Likewise. (acquire_init_state_context): Likewise. (proceed_next_node): Likewise. (set_regs): Likewise. (free_fail_stack_return): Likewise. (check_subexp_limits): Likewise. (sub_epsilon_src_nodes): Likewise. (add_epsilon_src_nodes): Likewise. (merge_state_array): Likewise. (update_regs): Likewise. (build_trtable): Likewise. (sift_states_backward): Mark MCTX parameter as const. (build_sifted_states): Likewise. (update_cur_sifted_state): Likewise. (sift_states_mkref): Likewise. (check_dst_limits_calc_pos_1): Likewise. * posix/regex_internal.h (re_match_context_t): Make dfa a const pointer.
2005-09-07* posix/regex_internal.c (re_string_reconstruct): Avoid callingUlrich Drepper
mbrtowc for very simple UTF-8 case. 2005-09-01 Paul Eggert <eggert@cs.ucla.edu> * posix/regex_internal.c (build_wcs_upper_buffer): Fix portability bugs in int versus size_t comparisons. 2005-09-06 Ulrich Drepper <drepper@redhat.com> * posix/regex_internal.c (re_acquire_state): Make DFA pointer arg a pointer-to-const. (re_acquire_state_context): Likewise. * posix/regex_internal.h: Adjust prototypes. 2005-08-31 Jim Meyering <jim@meyering.net> * posix/regcomp.c (search_duplicated_node): Make first pointer arg a pointer-to-const. * posix/regex_internal.c (create_ci_newstate, create_cd_newstate, register_state): Likewise. * posix/regexec.c (search_cur_bkref_entry, check_dst_limits): (check_dst_limits_calc_pos_1, check_dst_limits_calc_pos): (group_nodes_into_DFAstates): Likewise. * posix/regexec.c (re_search_internal): Simplify update of rm_so and rm_eo by replacing "if (A == B) A += C - B;" with the equivalent of "if (A == B) A = C;". 2005-09-06 Ulrich Drepper <drepper@redhat.com> * posix/regcomp.c (re_compile_internal): Change third parameter type to size_t. (init_dfa): Likewise. Make sure that arithmetic on pat_len doesn't overflow. * posix/regex_internal.h (struct re_dfa_t): Change type of nodes_alloc and nodes_len to size_t. * posix/regex_internal.c (re_dfa_add_node): Use size_t as type for new_nodes_alloc. Check for overflow. 2005-08-31 Paul Eggert <eggert@cs.ucla.edu> * posix/regcomp.c (re_compile_fastmap_iter, init_dfa, init_word_char): (optimize_subexps, lower_subexp): Don't assume 1<<31 has defined behavior on hosts with 32-bit int, since the signed shift might overflow. Use 1u<<31 instead. * posix/regex_internal.h (bitset_set, bitset_clear, bitset_contain): Likewise. * posix/regexec.c (check_dst_limits_calc_pos_1): Likewise. (check_subexp_matching_top): Likewise. * posix/regcomp.c (optimize_subexps, lower_subexp): Use CHAR_BIT rather than 8, for clarity. * posix/regexec.c (check_dst_limits_calc_pos_1): (check_subexp_matching_top): Likewise. * posix/regcomp.c (init_dfa): Make table_size unsigned, so that we don't have to worry about portability issues when shifting it left. Remove no-longer-needed test for table_size > 0. * posix/regcomp.c (parse_sub_exp): Do not shift more bits than there are in a word, as the resulting behavior is undefined. * posix/regexec.c (check_dst_limits_calc_pos_1): Likewise; in one case, a <= should have been an <, and in another case the whole test was missing. * posix/regex_internal.h (BYTE_BITS): Remove. All uses changed to the standard name CHAR_BIT.
2005-09-06* posix/regex_internal.h (re_sub_match_top_t): Remove unused memberUlrich Drepper
next_last_offset. (struct re_dfa_t): Remove unused member states_alloc. * posix/regcomp.c (init_dfa): Don't initialize unused members. 2005-08-25 Paul Eggert <eggert@cs.ucla.edu> * posix/regexec.c (set_regs): Don't alloca with an unbounded size. alloca modernization/simplification for regex. * posix/regex.c: Remove portability cruft for alloca. This no longer needs to be at the start of the file, and can be moved into regex_internal.h and simplified. * posix/regex_internal.h: Include <alloca.h>. (__libc_use_alloca) [!defined _LIBC]: New macro. * posix/regexec.c (build_trtable): Remove "#ifdef _LIBC", since the code now works outside glibc. 2005-09-06 Ulrich Drepper <drepper@redhat.com> * include/regex.h: Remove use of _RE_ARGS. 2005-08-25 Paul Eggert <eggert@cs.ucla.edu> * posix/regexec.c (find_recover_state): Change "err" to "*err". 2005-08-24 Paul Eggert <eggert@cs.ucla.edu> * posix/regcomp.c (regerror): Pointer args are 'restrict', as per POSIX. * posix/regex.h (regerror): Likewise. * manual/pattern.texi (POSIX Regexp Compilation): Likewise. Similarly for regcomp and regexec. Also, first 2 args of regexec and 2nd arg of regerror are const. * posix/regex.c: Do not include <sys/types.h>, as POSIX no longer requires this. (The code never needed it.) 2005-08-20 Paul Eggert <eggert@cs.ucla.edu> * posix/regexec.c (sift_states_bkref): re_node_set_insert returns int, not reg_errcode_t. * posix/regex_internal.c (calc_state_hash): Put 'inline' before type, since some broken compilers warn about it otherwise. * posix/regcomp.c (create_initial_state): Remove duplicate decl. 2005-08-20 Paul Eggert <eggert@cs.ucla.edu> * posix/regex.h (_RE_ARGS): Remove. No longer needed, since we assume C89 or better. All uses removed. 2005-09-06 Ulrich Drepper <drepper@redhat.com> * posix/regex.c: Prevent using C++ compilers. 2005-08-19 Paul Eggert <eggert@cs.ucla.edu> * posix/regcomp.c (duplicate_node): Return new index, not an error code, and let the caller return REG_ESPACE if out of space. This removes an uninitialied-variable warning with GCC 4.0.1, and also avoids taking the address of a local variable. All callers changed. 2005-09-06 Ulrich Drepper <drepper@redhat.com> * include/time.h (__strptime_internal): Rename parameter to avoid bogus compiler warning. 2005-08-19 Jim Meyering <jim@meyering.net> * posix/regexec.c (proceed_next_node): Redo local variables to avoid GCC shadowing warnings. 2005-09-06 Ulrich Drepper <drepper@redhat.com> * posix/regex_internal.c (re_acquire_state): Minor code rearrangement. (re_acquire_state_context): Likewise. 2005-08-19 Paul Eggert <eggert@cs.ucla.edu> * posix/regex_internal.c (re_string_realloc_buffers): (re_node_set_insert, re_node_set_insert_last, re_dfa_add_node): Rename local variables to avoid GCC shadowing warnings. 2005-07-08 Eric Blake <ebb9@byu.net> Paul Eggert <eggert@cs.ucla.edu> * posix/regcomp.c (init_dfa): Store __btowc value in wint_t, not wchar_t. Remove now-unnecessary cast. (build_range_exp): Likewise.
2005-05-06Include bits/libc-lock.h or define dummy __libc_lock_* macros if not _LIBC. ↵Ulrich Drepper
(struct re_dfa_t): Add lock.
2005-02-17[BZ #284, BZ #721]Roland McGrath
* intl/dcigettext.c (_nl_find_msg): Add a cast. * nis/nis_clone_dir.c (nis_clone_directory): Use char * for ADDR. * nis/nis_clone_obj.c (nis_clone_object): Likewise. * nis/nis_clone_res.c (nis_clone_result): Likewise. * resolv/nss_dns/dns-network.c (getanswer_r): Use const unsigned char * for END_OF_MESSAGE and CP. * resolv/res_send.c (send_dg): Add else branch for case impossible unless `poll' is buggy. * crypt/crypt_util.c (__setkey_r): Add a cast. * locale/programs/linereader.c (get_toplvl_escape): Use size_t for NBYTES, and unsigned char * for BYTES. * locale/programs/charmap.c (charmap_new_char): Use size_t and unsighed char * for NBYTES, BYTES parameters. * sysdeps/generic/dl-hash.h (_dl_elf_hash): Take const char * argument and cast it. * sysdeps/i386/i686/dl-hash.h (_dl_elf_hash): Likewise. * sunrpc/create_xid.c (_create_xid): Don't use unsigned long for RES. * sunrpc/svcauth_des.c (_svcauth_des): Fix cast type. * sunrpc/auth_des.c (authdes_create): Don't use u_char for PKEY_DATA. (authdes_marshal): Don't use unsigned int for LEN. * sunrpc/xdr.c (xdr_hyper): Don't use unsigned long for T2. (xdr_u_hyper): Likewise. (xdr_u_short): Don't use u_long for L. * sunrpc/xdr_intXX_t.c (xdr_int64_t): Don't use uint32_t for T2. * inet/rexec.c (rexec_af): Use socklen_t. * sunrpc/key_call.c (getkeyserv_handle): Likewise. * sunrpc/rtime.c (rtime): Likewise. * resolv/res_send.c (send_vc, send_dg): Likewise. * nis/nis_callback.c (__nis_create_callback): Likewise. * sysdeps/generic/libc-start.c: Use unsigned int for nthreads ptr. * sysdeps/posix/getaddrinfo.c (gaih_inet): Fix type of ADDR local. * libio/libio.h (_IO_BE): Add parenthesis around EXPR. * intl/dcigettext.c (INTVARDEF, INTUSE): Macros removed. (_nl_default_dirname): Use libc_hidden_data_def instead of INTVARDEF. (libc_freeres_fn, DCIGETTEXT): Don't use INTUSE. * intl/bindtextdom.c (INTUSE): Macro removed. (_nl_default_dirname): Use libc_hidden_proto. (set_binding_values): Don't use INTUSE. * include/libintl.h (_libc_intl_domainname_internal): Decl removed. (_libc_intl_domainname): Use libc_hidden_proto. * posix/regex_internal.h (gettext): Remove INTUSE on it. * locale/SYS_libc.c (_libc_intl_domainname): Use libc_hidden_data_def rather than INTDEF. * include/libintl.h (_): Don't use *_internal name. * ctype/ctype-extn.c (__ctype_tolower, __ctype_toupper): Use int32_t, not uint32_t. * locale/lc-ctype.c (_nl_postload_ctype): Likewise for assignments. * iconv/gconv_open.c (__gconv_open): Remove useless cast. [BZ #721] * sysdeps/i386/dl-machine.h (ELF_MACHINE_NO_RELA): Define this outside of [RESOLVE_MAP]. * sysdeps/sh/dl-machine.h (ELF_MACHINE_NO_REL): Likewise. * sysdeps/powerpc/powerpc32/dl-machine.h (elf_machine_rel, elf_machine_rel_relative): Removed. * sysdeps/powerpc/powerpc64/dl-machine.h (elf_machine_rel, elf_machine_rel_relative): Removed. 2005-02-03 Alexandre Oliva <aoliva@redhat.com> [BZ #721] * elf/dynamic-link.h: Don't declare nested auto functions that are not going to be defined. 2004-07-23 Jakub Jelinek <jakub@redhat.com> [BZ #284] * include/features.h (_POSIX_SOURCE, _POSIX_C_SOURCE): Define if _XOPEN_SOURCE >= 500 even if __STRICT_ANSI__ is defined. 2005-02-16 Roland McGrath <roland@redhat.com>
2005-01-26[BZ #605, BZ #611]Ulrich Drepper
Update. 2004-12-13 Paolo Bonzini <bonzini@gnu.org> Separate parsing and creation of the NFA. Avoided recursion on the (very unbalanced) parse tree. [BZ #611] * posix/regcomp.c (struct subexp_optimize, analyze_tree, calc_epsdest, re_dfa_add_tree_node, mark_opt_subexp_iter): Removed. (optimize_subexps, duplicate_tree, calc_first, calc_next, mark_opt_subexp): Rewritten. (preorder, postorder, lower_subexps, lower_subexp, link_nfa_nodes, create_token_tree, free_tree, free_token): New. (analyze): Accept a regex_t *. Invoke the passes via the preorder and postorder generic visitors. Do not initialize the fields in the re_dfa_t that represent the transitions. (free_dfa_content): Use free_token. (re_compile_internal): Analyze before UTF-8 optimizations. Do not include optimization of subexpressions. (create_initial_state): Fetch the DFA node index from the first node's bin_tree_t *. (optimize_utf8): Abort on unexpected nodes, including OP_DUP_QUESTION. Return on COMPLEX_BRACKET. (duplicate_node_closure): Fix comment. (duplicate_node): Do not initialize the fields in the re_dfa_t that represent the transitions. (calc_eclosure, calc_inveclosure): Do not handle OP_DELETED_SUBEXP. (create_tree): Remove final argument. All callers adjusted. Rewritten to use create_token_tree. (parse_reg_exp, parse_branch, parse_expression, parse_bracket_exp, build_charclass_op): Use create_tree or create_token_tree instead of re_dfa_add_tree_node. (parse_dup_op): Likewise. Also free the tree using free_tree for "<re>{0}", and lower OP_DUP_QUESTION to OP_ALT: "a?" is equivalent to "a|". Adjust invocation of mark_opt_subexp. (parse_sub_exp): Create a single SUBEXP node. * posix/regex_internal.c (re_dfa_add_node): Remove last parameter, always perform as if it was 1. Do not initialize OPT_SUBEXP and DUPLICATED, and initialize the DFA fields representing the transitions. * posix/regex_internal.h (re_dfa_add_node): Adjust prototype. (re_token_type_t): Move OP_DUP_PLUS and OP_DUP_QUESTION to the tokens section. Add a tree-only code SUBEXP. Remove OP_DELETED_SUBEXP. (bin_tree_t): Include a full re_token_t for TOKEN. Turn FIRST and NEXT into pointers to trees. Remove ECLOSURE. 2004-12-28 Paolo Bonzini <bonzini@gnu.org > [BZ #605] * posix/regcomp.c (parse_bracket_exp): Do not modify DFA nodes that were already created. * posix/regex_internal.c (re_dfa_add_node): Set accept_mb field in the token if needed. (create_ci_newstate, create_cd_newstate): Set accept_mb field from the tokens' field. * posix/regex_internal.h (re_token_t): Add accept_mb field. (ACCEPT_MB_NODE): Removed. * posix/regexec.c (proceed_next_node, transit_states_mb, build_sifted_states, check_arrival_add_next_nodes): Use accept_mb instead of ACCEPT_MB_NODE.
2005-01-26(DUMMY_CONSTRAINT): Rename to... (WORD_DELIM_CONSTRAINT): ...this. ↵Ulrich Drepper
(NOT_WORD_DELIM_CONSTRAINT): Define. (re_context_type): Add INSIDE_NOTWORD and NOT_WORD_DELIM, change WORD_DELIM to use WORD_DELIM_CONSTRAINT.
2004-12-27Update.Ulrich Drepper
2004-04-27 Paolo Bonzini <bonzini@gnu.org> * posix/regex_internal.h (struct re_dfastate_t): Make word_trtable a pointer to the 512-item transition table. * posix/regexec.c (build_trtable): Fill in either state->trtable or state->word_trtable. Return a boolean indicating success. (transit_state): Expect state->trtable to be a 256-item transition table. Reorganize code to have less tests in the common case, and to save an indentation level.
2004-12-22(CFLAGS-tst-align.c): Add -mpreferred-stack-boundary=4.Ulrich Drepper
2007-07-122.5-18.1Jakub Jelinek
2004-12-10Update.Ulrich Drepper
2004-12-07 Paolo Bonzini <bonzini@gnu.org> * posix/regexec.c (proceed_next_node): Simplify treatment of epsilon nodes. Pass the pushed node to push_fail_stack. (push_fail_stack): Accept a single node rather than an array of two epsilon destinations. (build_sifted_states): Only walk non-epsilon nodes. (check_arrival): Don't pass epsilon nodes to check_arrival_add_next_nodes. (check_arrival_add_next_nodes) [DEBUG]: Abort if an epsilon node is found. (check_node_accept): Do expensive checks later. (add_epsilon_src_nodes): Cache result of merging the inveclosures. * posix/regex_internal.h (re_dfastate_t): Add non_eps_nodes and inveclosure. (re_string_elem_size_at, re_string_char_size_at, re_string_wchar_at, re_string_context_at, re_string_peek_byte_case, re_string_fetch_byte_case, re_node_set_compare, re_node_set_contains): Declare as pure. * posix/regex_internal.c (create_newstate_common): Remove. (register_state): Move part of it here. Initialize non_eps_nodes. (free_state): Free inveclosure and non_eps_nodes. (create_cd_newstate, create_ci_newstate): Allocate the new re_dfastate_t here.
2004-12-06Update.Ulrich Drepper
2004-12-01 Paolo Bonzini <bonzini@gnu.org> * posix/regcomp.c (free_dfa_content, init_dfa): Remove references to re_dfa_t's subexps field. (parse_sub_exp, parse_expression): Do not use it. Use completed_bkref_map instead. (create_initial_state, peek_token): Store a backreference \N with opr.idx = N-1. * posix/regexec.c (proceed_next_node, check_dst_limits, get_subexp): Likewise. (check_subexp_limits): Remove useless condition. * posix/regex_internal.h (re_subexp_t): Remove. (re_dfa_t): Remove subexps and subexps_alloc field, add completed_bkref_map.
2004-11-25Update.Ulrich Drepper
2004-11-23 Paolo Bonzini <bonzini@gnu.org> * posix/regcomp.c (analyze_tree): Always call calc_epsdest. (calc_inveclosure): Use re_node_set_insert_last. (parse_dup_op): Lower X{1,5} to (X(X(X(XX?)?)?)?)? rather than X?X?X?X?X?. * posix/regex_internal.h (re_node_set_insert_last): New declaration. * posix/regex_internal.c (re_node_set_insert_last): New function. * posix/PCRE.tests: Add testcases.
2004-11-18[BZ #544]Ulrich Drepper
Update. 2004-11-18 Jakub Jelinek <jakub@redhat.com> [BZ #544] * posix/regex.h (RE_NO_SUB): New define. * posix/regex_internal.h (OP_DELETED_SUBEXP): New. (re_dfa_t): Add subexp_map. * posix/regcomp.c (struct subexp_optimize): New type. (optimize_subexps): New routine. (re_compile_internal): Call it. (re_compile_pattern): Set preg->no_sub to 1 if RE_NO_SUB. (free_dfa_content): Free subexp_map. (calc_inveclosure, calc_eclosure): Skip OP_DELETED_SUBEXP nodes. * posix/regexec.c (re_search_internal): If subexp_map is not NULL, duplicate registers as needed. * posix/Makefile: Add rules to build and run tst-regex2. * posix/tst-regex2.c: New test. * posix/rxspencer/tests: Fix last two tests (\0 -> \1). Add some new tests for nested subexpressions.
2004-11-12Update.cvs/fedora-glibc-20041112T1640Ulrich Drepper
2004-11-12 Ulrich Drepper <drepper@redhat.com> * posix/Makefile (tests): Add bug-regex24. * posix/bug-regex24.c: New file. 2004-11-12 Paolo Bonzini <bonzini@gnu.org> * posix/regexec.c (check_dst_limits_calc_pos_1): Use the map to cut recursive paths. Make exit condition more precise. (match_ctx_add_entry): Initialize the map. * posix/regex_internal.h (struct re_backref_cache_entry): Add a map of reachable subexpression nodes from each backreference cache entry.
2004-11-08Update.Ulrich Drepper
2004-11-08 Ulrich Drepper <drepper@redhat.com> * posix/regcomp.c (utf8_sb_map): Define. (free_dfa_content): Don't free dfa->sb_char if it's a pointer to utf8_sb_map. (init_dfa): Use utf8_sb_map instead of initializing memory when the encoding is UTF-8. * posix/regcomp.c (init_dfa): Get the codeset name outside glibc as well. Check if it is spelled UTF8 as well as UTF-8, and check case-insensitively. Set dfa->map_notascii manually when outside glibc. * posix/regex_internal.c (build_wcs_upper_buffer) [!_LIBC]: Enable optimizations based on map_notascii. * posix/regex_internal.h [HAVE_LANGINFO_H || HAVE_LANGINFO_CODESET || _LIBC]: Include langinfo.h. * posix/regex_internal.h (struct re_backref_cache_entry): Add "more" field. * posix/regexec.c (check_dst_limits): Hoist computation of the source and destination bkref_idx out of the loop. Pass it to check_dst_limits_calc_pos. (check_dst_limits_calc_pos_1): New function, containing the recursive loop of check_dst_limits_calc_pos; uses the "more" field of struct re_backref_cache to control the loop. (check_dst_limits_calc_pos): Store into "boundaries" the position relative to lim's start and end positions. Do not accept eclosures, accept bkref_idx instead. Call check_dst_limits_calc_pos_1 to do the work. (sift_states_bkref): Use the "more" field of struct re_backref_cache to control the loop. A big "if" was turned into a continue and the function was reindented. (get_subexp): Use the "more" field of struct re_backref_cache to control the loop. (match_ctx_add_entry): Initialize the bkref_ents' "more" field. (search_cur_bkref_entry): Return -1 if out of bounds. * posix/regexec.c (empty_set): Remove. (sift_states_backward): Remove cur_src variable. Move inner loop to build_sifted_states. (build_sifted_states): Extract from sift_states_backward. Do not use empty_set. (update_cur_sifted_state): Do not use empty_set. Special case dest_nodes->nelem == 0.
2004-11-08(struct re_backref_cache_entry): Remove flag field. (struct ↵Ulrich Drepper
re_sift_context_t): Remove cur_bkref, cls_subexp_idx, check_subexp fields. Move limits last.
2004-11-03* posix/regex_internal.h (__regfree) [!_LIBC]: Define to regfree.Roland McGrath
2004-02-26Update.Ulrich Drepper
2004-02-24 Arnold D. Robbins <arnold@skeeve.com> * posix/regex_internal.c (build_wcs_upper_buffer): Enclose `offsets_needed' label in `#ifdef _LIBC' to silence `unused label' compiler warning. 2004-02-24 Nelson H.F. Beebe <beebe@math.utah.edu> * posix/regex_internal.c (build_wcs_buffer): Add cast to char* in call to `wcrtomb'. * posix/regex_internal.h (bitset_not, bitset_merge, bitset_not_merge, bitset_mask, re_string_char_size_a, re_string_wchar_at, re_string_elem_size_at): Change to use prototypes. (re_string_char_size_at, re_string_wchar_at, re_string_elem_size_at): Declare as `internal_function'.
2004-01-30Update.Ulrich Drepper
2004-01-28 Paolo Bonzini <bonzini@gnu.org> Merge regex changes in gawk. * posix/regcomp.c (build_range_exp) [!_LIBC]: Check validity of collation elements. * posix/regex.c: Include limits.h. * posix/regex.h: Document REG_ECOLLATE correctly. * posix/regex_internal.h [!_LIBC && !ENABLE_NLS]: Disable NLS.
2004-01-14Update.Ulrich Drepper
2004-01-13 Ulrich Drepper <drepper@redhat.com> * posix/regex.c: Support crappy compilers and platforms which have problems with alloca. * posix/regex_internal.h: Likewise. Patch by Paolo Bonzini.
2004-01-14Update.Ulrich Drepper
2004-01-14 Jakub Jelinek <jakub@redhat.com> * posix/regcomp.c (peek_token_bracket): Check remaining string length before re_string_peek_byte (x, 1). (parse_bracket_symbol): Likewise. * posix/regex_internal.h (re_string_is_single_byte_char): Return true at last byte in the string. * posix/bug-regex22.c (main): Add new test.
2004-01-06Update.Ulrich Drepper
2004-01-05 Jakub Jelinek <jakub@redhat.com> * posix/regcomp.c (regcomp): Fix comment typo. (regfree): Free preg->translate, clear buffer, allocated, fastmap and translate fields. * posix/regcomp.c (build_charclass, buld_charclass_op): Change first argument to unsigned RE_TRANSLATE_TYPE. * posix/regex_internal.h (re_string_t): Change trans type to unsigned RE_TRANSLATE_TYPE. * posix/regex_internal.c (re_string_construct_common): Cast trans to unsigned RE_TRANSLATE_TYPE. (re_string_peek_byte_case, re_string_fetch_byte_case): Avoid fast path if pstr->trans. Never translate the character through pstr->trans. * posix/Makefile (tests): Add bug-regex22. (bug-regex22-ENV): Set. * posix/bug-regex22.c: New test.
2004-01-03(re_match_context_t): Add dfa member.Andreas Jaeger
2004-01-02Update.Ulrich Drepper
2004-01-02 Jakub Jelinek <jakub@redhat.com> * posix/regex_internal.c (re_node_set_insert): Remove unused variables. * posix/regex_internal.h (re_dfa_t): Add syntax field. * posix/regcomp.c (parse): Initialize dfa->syntax. * posix/regexec.c (acquire_init_state_context, prune_impossible_nodes, check_matching, check_halt_state_context, proceed_next_node, sift_states_iter_mb, sift_states_backward, update_cur_sifted_state, sift_states_bkref, transit_state, transit_state_sb, transit_state_mb, transit_state_bkref, get_subexp, get_subexp_sub, check_arrival, expand_bkref_cache, build_trtable): Remove preg argument, add dfa argument instead and remove dfa = preg->buffer initialization in the body. Adjust all callers. (check_node_accept_bytes, group_nodes_into_DFAstates, check_node_accept): Likewise. Use dfa->syntax instead of preg->syntax. (check_arrival_add_next_nodes): Remove preg argument. * posix/regex_internal.h (re_match_context_t): Make input re_string_t instead of a pointer to it. * posix/regex_internal.c (re_string_construct_common): Don't clear pstr here... (re_string_construct): ... but only here. * posix/regexec.c (match_ctx_init): Remove input argument. Don't initialize fields to zero. (re_search_internal): Move input into mctx.input. (acquire_init_state_context, check_matching, check_halt_state_context, proceed_next_node, clean_state_log_if_needed, sift_states_bkref, sift_states_iter_mb, transit_state, transit_state_sb, transit_state_mb, transit_state_bkref, get_subexp, check_arrival, check_arrival_add_next_nodes, check_node_accept, extend_buffers): Change mctx->input into &mctx->input and mctx->input->field into mctx->input.field. 2004-01-02 Jakub Jelinek <jakub@redhat.com> Paolo Bonzini <bonzini@gnu.org> * posix/regex_internal.h (re_const_bitset_ptr_t): New type. (re_string_t): Add newline_anchor, word_char and word_ops_used fields. (re_dfa_t): Change word_char type to bitset. Add word_ops_used field. (re_string_context_at, re_string_reconstruct): Remove last argument. * posix/regex_internal.c (re_string_allocate): Initialize pstr->word_char and pstr->word_ops_used. (re_string_context_at): Remove newline_anchor argument. Use input->newline_anchor instead, swap && conditions. Only use IS_WIDE_WORD_CHAR if input->word_ops_used != 0. Use input->word_char bitmap instead of IS_WORD_CHAR. (re_string_reconstruct): Likewise. Adjust re_string_context_at caller. * posix/regexec.c (acquire_init_state_context, check_halt_state_context, transit_state, transit_state_sb, transit_state_mb, transit_state_bkref, check_arrival, check_node_accept): Adjust re_string_context_at and re_string_reconstruct callers. (re_search_internal): Likewise. Set input.newline_anchor. (build_trtable): Use dfa->word_char bitmap instead of IS_WORD_CHAR. * posix/regcomp.c (init_word_char): Change return type to void. Set dfa->word_ops_used. (free_dfa_content): Don't free dfa->word_char. (parse_expression): Remove error handling for init_word_char.
2004-01-02Update.Ulrich Drepper
2004-01-01 Paolo Bonzini <bonzini@gnu.org> * posix/regex_internal.h (re_dfastate_t): Fix size of the CONTEXT bitfield. * posix/regex_internal.c (re_node_set_insert): Rewrite.
2003-12-27Update.Ulrich Drepper
2003-12-23 Paolo Bonzini <bonzini@gnu.org> * posix/regex_internal.c (re_dfa_add_node): Initialize opt_subexp. * posix/regex_internal.h (re_token_type_t): Put OP_DUP_PLUS among the tokens, rather than among the epsilon-transiting nodes. (re_token_t): Add the opt_subexp flag. * posix/regcomp.c (optimize_utf8, calc_first, calc_next, calc_epsdest): Don't consider OP_DUP_PLUS. (mark_opt_subexp, mark_opt_subexp_iter): New functions. (parse_dup_op): Mostly rewritten, lowering OP_DUP_PLUS to OP_DUP_ASTERISK and marking optional subexpressions as such using mark_opt_subexp. * posix/regexec.c (set_regs): Initialize PREV_INDEX_MATCH and pass it to update_regs. (update_regs): Use the PREV_INDEX_MATCH parameter, together with the opt_subexp flag, in order to discard a final empty match of a repeated subexpression. * posix/BOOST.tests: Adjust test vectors. * posix/PCRE.tests: Likewise. * posix/rxspencer/tests: Likewise. 2003-12-17 Paolo Bonzini <bonzini@gnu.org> 2003-12-16 Paolo Bonzini <bonzini@gnu.org> 2003-12-17 Paolo Bonzini <bonzini@gnu.org> 2003-12-16 Jakub Jelinek <jakub@redhat.com> 2003-04-06 Kaz Kojima <kkojima@rr.iij4u.or.jp> 2003-02-20 Paolo Bonzini <bonzini@gnu.org> 2003-01-12 Franz Sirl <Franz.Sirl-kernel@lauterbach.com> 2003-01-09 Richard Henderson <rth@redhat.com> 2003-01-09 Richard Henderson <rth@redhat.com> 2003-01-03 Paul Eggert <eggert@twinsun.com>
2003-12-23Update.Ulrich Drepper
2003-12-22 Jakub Jelinek <jakub@redhat.com> * posix/regcomp.c: Remove C99-ism. * posix/tst-rxspencer.c: Likewise. Based on a patch by Alex Davis <alex14641@yahoo.com>. 2002-12-17 Paolo Bonzini <bonzini@gnu.org> * posix/regex_internal.h [!_LIBC] (internal_function): Define. (re_string_allocate, re_string_construct, re_string_reconstruct, re_string_realloc_buffers, build_wcs_buffer, build_wcs_upper_buffer, build_upper_buffer, re_string_translate_buffer, re_string_destruct, re_string_elem_size_at, re_string_char_size_at, re_string_wchar_at, re_string_context_at, re_node_set_alloc, re_node_set_init_1 re_node_set_init_2, re_node_set_init_copy, re_node_set_add_intersect, re_node_set_init_union, re_node_set_merge, re_node_set_insert re_node_set_compare, re_node_set_contains re_node_set_remove_at, re_dfa_add_node, re_acquire_state, re_acquire_state_context, free_state): Add internal_function to declaration. * posix/regexec.c (match_ctx_init, match_ctx_clean, match_ctx_free, match_ctx_free_subtops, match_ctx_add_entry, search_cur_bkref_entry, match_ctx_clear_flag, match_ctx_add_subtop, match_ctx_add_sublast, sift_ctx_init, re_search_internal, re_search_2_stub, re_search_stub, re_copy_regs, acquire_init_state_context, prune_impossible_nodes, check_matching, check_halt_node_context, check_halt_state_context update_regs, proceed_next_node, push_fail_stack, pop_fail_stack, set_regs, free_fail_stack_return, sift_states_iter_mb, sift_states_backward update_cur_sifted_state, add_epsilon_src_nodes, sub_epsilon_src_nodes, check_dst_limits, check_dst_limits_calc_pos, check_subexp_limits, sift_states_bkref, clean_state_log_if_need, merge_state_array, transit_state, check_subexp_matching_top, transit_state_sb, transit_state_mb, transit_state_bkref, get_subexp, get_subexp_sub, find_subexp_node, check_arrival, check_arrival_add_next_nodes, find_collation_sequence_value, check_arrival_expand_ecl, check_arrival_expand_ecl_sub, expand_bkref_cache, build_trtable, check_node_accept_bytes, extend_buffers, group_nodes_into_DFAstates, check_node_accept): Likewise. * posix/regex_internal.c (re_string_construct_common, re_string_skip_chars, create_newstate_common, register_state, create_ci_newstate, create_cd_newstate, calc_state_hash): Likewise. (re_string_peek_byte_case, re_fetch_byte_case): Change declaration from ANSI to K&R. 2002-12-16 Paolo Bonzini <bonzini@gnu.org> * posix/regexec.c (build_trtable): Don't allocate the trtable until state->word_trtable is known. Don't hardcode UINT_BITS iterations on each bitset item.
2003-12-16Update.Ulrich Drepper
2003-12-16 Petter Reinholdtsen <pere@hungry.com> * posix/regex_internal.h: Make sure the regex code compile with non-GCC compilers by hiding attributes.
2003-11-29Update.Ulrich Drepper
2003-11-28 Ulrich Drepper <drepper@redhat.com> * sysdeps/x86_64/fpu/libm-test-ulps: Add some more minor changes to compensate other setup. 2003-11-27 Andreas Jaeger <aj@suse.de> * sysdeps/x86_64/fpu/libm-test-ulps: Add ulps for new atan2 test. * math/libm-test.inc (atan2_test): Add test that run infinitly. Reported by "Willus" <etc231etc231@willus.com>. 2003-11-27 Michael Matz <matz@suse.de> * sysdeps/ieee754/dbl-64/mpsqrt.c (fastiroot): Fix 64-bit problem with wrong types. 2003-11-28 Jakub Jelinek <jakub@redhat.com> * posix/regexec.c (acquire_init_state_context): Make inline. Add always_inline attribute. (check_matching): Add BE macro. Move if (cur_state->has_backref) into if (dfa->nbackref). (sift_states_backward): Fix comment. (transit_state): Add BE macro. Move if (next_state->has_backref) into if (dfa->nbackref && next_state). Don't check for next_state != NULL twice. * posix/regcomp.c (peek_token): Use opr.ctx_type instead of opr.idx for ANCHOR. (parse_expression): Only call init_word_char if word context will be needed. * posix/bug-regex11.c (tests): Add new tests. * posix/tst-regex.c: Include getopt.h. (timing): New variable. (main): Set timing to 1 if --timing argument is present. Add 2 new tests. (run_test, run_test_backwards): Handle timing. 2003-11-27 Jakub Jelinek <jakub@redhat.com> * posix/regex_internal.h (re_string_t): Remove mbs_case field. Add offsets, valid_raw_len, raw_len, raw_stop, mbs_allocated and offsets_needed fields. Change icase, is_utf8 and map_notascii type from int bitfield to unsigned char. (MBS_ALLOCATED, MBS_CASE_ALLOCATED): Remove. (build_wcs_upper_buffer): Change prototype to return int. (re_string_peek_byte_case, re_string_fetch_byte_case): Remove defines, add prototypes. * posix/regex_internal.c (re_string_allocate): Don't initialize stop here. Don't initialize mbs_case. Set valid_raw_len. Use mbs_allocated instead of MBS_* macros. (re_string_construct): Don't initialize stop and valid_len here. Don't initialize mbs_case. Use mbs_allocated instead of MBS_* macros. Reallocate buffers if build_wcs_upper_buffer converted too few bytes. Set valid_len to bufs_len only for single byte no translation and set in that case valid_raw_len as well. (re_string_realloc_buffers): Reallocate offsets if not NULL. Use mbs_allocated instead of MBS_ALLOCATED. Don't reallocate mbs_case. (re_string_construct_common): Initialize raw_len, mbs_allocated, stop and raw_stop. (build_wcs_buffer): Apply pstr->trans before mbrtowc instead of after it. Set valid_raw_len. Don't set mbs_case. (build_wcs_upper_buffer): Return REG_NOERROR or REG_ESPACE. Only use the fast path if !pstr->offsets_needed. Apply pstr->trans before mbrtowc instead of after it. If upper case character uses different number of bytes than lower case, goto to the slow path. Don't call towupper unnecessarily twice. Set valid_raw_len as well. Handle in the slow path the case if lower and upper case use different number of characters. Don't set mbs_case. (re_string_skip_chars): Use valid_raw_len instead of valid_len. (build_upper_buffer): Don't set mbs_case. Add BE macro. Set valid_raw_len. (re_string_translate_buffer): Set mbs instead of mbs_case. Set valid_raw_len. (re_string_reconstruct): Use raw_len/raw_stop to initialize len/stop. Clear valid_raw_len and offsets_needed when clearing valid_len. Use mbs_allocated instead of MBS_* macros. Check original offset against valid_raw_len instead of valid_len. Remove mbs_case handling. Adjust valid_raw_len together with valid_len. If is_utf8 and looking for tip context, apply pstr->trans first. If buffers start with partial multi-byte character, initialize mbs array as well if mbs_allocated. Check return value of build_wcs_upper_buffer. (re_string_peek_byte_case): New function. (re_string_fetch_byte_case): New function. (re_string_destruct): Use mbs_allocated instead of MBS_ALLOCATED. Don't free mbs_case. Free offsets. * posix/regcomp.c (init_dfa): Only check if charset name is UTF-8 if mb_cur_max == 6. * posix/regexec.c (re_search_internal): Initialize input.raw_stop as well. Use valid_raw_len instead of valid_len when looking through fastmap. Adjust registers through input.offsets. (extend_buffers): Allow build_wcs_upper_buffer to fail. * posix/bug-regex18.c (tests): Enable #ifdefed out tests. Add new tests.
2003-11-24Update.Ulrich Drepper
2003-11-24 Jakub Jelinek <jakub@redhat.com> * posix/regex_internal.h (re_token_t): Add word_char bit. Add comment. (re_dfa_t): Add sb_char field. (bitset_mask): New function. * posix/regcomp.c (free_dfa_content): Free sb_char. (init_dfa): Don't initialize word_char unnecessarily. Initialize sb_char. (duplicate_node): Don't duplicate !word_char CHARACTERs with NEXT_WORD_CONSTRAINT constraint or word_char CHARACTERs with NEXT_NOTWORD_CONSTRAINT. Return -1 in *new_idx instead. (duplicate_node_closure): Handle clone_dest == -1 from duplicate_node. (peek_token): Initialize word_char bit. (parse_expression, parse_dup_op): Add comments. (parse_bracket_exp): Don't set bitmask bits for multi-byte char starting bytes here at the beginning. Mask off the bits right before creating SIMPLE_BRACKET. (build_charclass_op): Likewise. * posix/regexec.c (group_nodes_into_DFAstates) <case OP_PERIOD>: Only set accept bits for single-byte characters. (group_nodes_into_DFAstates): Don't rely on characters 0 .. 127 being single byte encoded and the rest multi-byte. * posix/bug-regex19.c (tests): Add new tests. (do_mb_tests): Initialize t to *test. (main): Fail even on do_mb_tests errors.
2003-11-23Update.Ulrich Drepper
2003-11-23 Ulrich Drepper <drepper@redhat.com> * posix/regexec.c: Add const in a number of places. * posix/regex_internal.h: Make EPSILON_BIT a macro to help debugging. Its value isn't important.
2003-11-20Update.Ulrich Drepper
2003-11-20 Ulrich Drepper <drepper@redhat.com> * posix/PTESTS: Fix first test in GA143. 2003-11-20 Jakub Jelinek <jakub@redhat.com> * posix/regex_internal.h (re_dfastate_t): Remove trtable_search. Add word_trtable. * posix/regex_internal.c (create_newstate_common, free_state): Don't free trtable_search. * posix/regexec.c (check_matching): Remove fl_search argument. (transit_state_sb): Likewise. #ifdef out as unused. (build_trtable): Remove fl_search argument. Set state->word_trtable and state->trtable. Build separate word and non-word tables if multi-byte and they differ for some character. (transit_state): Remove fl_search argument. Don't update state->trtable here. Handle state->word_trtable. #ifdef out unused call to transit_state_sb. (re_search_internal): Update check_matching caller. (group_nodes_into_DFAstates): Don't clear non-ascii chars in accepts bitmask for multi-byte locales. * posix/bug-regex19.c (tests): Enable some commented out tests, add 2 new tests. * posix/tst-rxspencer.c (mb_tests): Don't test [[=b=]] for now as multi-byte. Don't run identical multi-byte tests multiple times unnecessarily. (main): Check setlocale return value. * posix/Makefile (tst-rxspencer-ARGS): Add --utf8 argument. (tst-rxspencer-ENV): Remove MALLOC_TRACE, add LOCPATH. ($(objpfx)tst-rxspencer-mem): Run another tst-rxspencer test here, without --utf8 argument but with MALLOC_TRACE.
2003-11-19Update.Ulrich Drepper
2003-11-19 Jakub Jelinek <jakub@redhat.com> * posix/regexec.c (extend_buffers): Don't allocate twice as big state_log as needed. Don't modify pstr->valid_len for mb_cur_max == 1 !icase !trans. * posix/regcomp.c (free_bin_tree): Removed. (create_tree): Add dfa argument. Don't call re_malloc for each tree, instead allocate from str_tree_storage. (re_dfa_add_tree_node): New function. (free_dfa_content): Handle freeing if dfa->nodes == NULL or dfa->state_table == NULL. (re_compile_internal): Call free_dfa_content if init_dfa fails. Call free_workarea_compile, re_string_destruct and free_dfa_content for most of the other failure paths. (init_dfa): Initialize str_tree_storage_idx. Don't clear any fields on allocation failure. (free_workarea_compile): Free str_tree_storage chunks instead of free_bin_tree (dfa->str_tree). (parse): Call re_dfa_add_tree_node instead of re_dfa_add_node followed by create_tree. Add dfa argument to remaining create_tree calls. Remove new_idx variable. Remove calls to free_bin_tree. (parse_reg_exp, parse_branch, parse_expression, parse_sub_exp, parse_dup_op, parse_bracket_exp, build_charclass_op): Likewise. (duplicate_tree): Remove calls to free_bin_tree, add dfa argument to create_tree. * posix/regex_internal.h (BIN_TREE_STORAGE_SIZE): Define. (bin_tree_storage_t): New type. (re_dfa_t): Add str_tree_storage and str_tree_storage_idx fields. * posix/Makefile (tests): Add bug-regex21. (generated): Add bug-regex21-mem, bug-regex21.mtrace, tst-rxspencer-mem and tst-rxspencer.mtrace. (tests): Depend on $(objpfx)bug-regex21-mem and $(objpfx)tst-rxspencer-mem. (bug-regex21-ENV, tst-rxspencer-ENV): Set. ($(objpfx)bug-regex21-mem, $(objpfx)tst-rxspencer-mem): New. * posix/tst-rxspencer.c (main): Add call to mtrace. Free line at the end. * posix/bug-regex21.c: New test. * posix/regexec.c (get_subexp): After calling get_subexp_sub
2003-11-19Update.Ulrich Drepper
2003-11-19 Ulrich Drepper <drepper@redhat.com> * posix/regex_internal.h (re_string_first_byte): Use ->valid_len not ->len. (re_string_is_single_byte_char): Likewise. * posix/regexec.c (get_subexp): After calling get_subexp_seb
2003-11-18Update.Ulrich Drepper
* posix/regex_internal.h (re_token_type_t): Remove unused ALT, END_OF_RE_TOKEN_T and SUBEXP. Reorder values. Add OP_UTF8_PERIOD and EPSILON_BIT. (IS_EPSILON_NODE): Just test if EPSILON_BIT is set. (ACCEPT_MB_NODE): Return 1 for OP_UTF8_PERIOD as well. * posix/regex_internal.c (create_ci_newstate, create_cd_newstate): Handle OP_UTF8_PERIOD. (re_string_reconstruct): Set valid_len for single byte char searching with no translation and case sensitivity. * posix/regcomp.c (re_compile_fastmap_iter, calc_first): Handle OP_UTF8_PERIOD. (re_compile_internal): Don't call optimize_utf8 if preg->translate != NULL. (optimize_utf8): Remove BACK_SLASH case. Transform OP_PERIOD into OP_UTF8_PERIOD if the searching can be optimized. (parse_bracket_exp): Don't create SIMPLE_BRACKET if it doesn't have any bits set and COMPLEX_BRACKET is used. * posix/regexec.c (transit_state_mb): Fix comment typo. (group_nodes_into_DFAstates, check_node_accept): Handle OP_UTF8_PERIOD. (check_node_accept_bytes): Likewise. Reorder slightly so that re_string_char_size_at and re_string_elem_size_at are called only when needed. * posix/bug-regex20.c (BRE, ERE): Define. (tests): Use them to make lines shorter. Expect . to be optimized. Add lots of new tests. (main): Run (ATM just case sensitive) test with backwards searching as well. 2003-11-18 Jakub Jelinek <jakub@redhat.com>
2003-11-16Update.Ulrich Drepper
* posix/regex_internal.h: Add forward declaration of re_dfa_t. Replace last two parameters of re_string_allocate and re_string_construct with pointer to DFA. (re_dfa_t): Add map_notascii field. * posix/regcomp.c (re_compile_internal): Add call of re_string_construct. (init_dfa): Initialize mpa_notascii. * posix/regex_internal.c: Adjust definitions of re_string_allocate and re_string_construct. Pass DFA to re_string_construct. Adjust definition. Initialize map_notascii field. (build_wcs_upper_buffer): If map_notascii is zero use simplfied method to map ASCII values to upper case. * posix/regex.c: Include localeinfo.h. * posix/regexec.c: Adjust call of re_string_allocate. * locale/langinfo.h: Add _NL_CTYPE_MAP_TO_NONASCII. * locale/localeinfo.h (LIMAGIC): Change value. * locale/categories.def. Add entry for _NL_CTYPE_MAP_TO_NONASCII. * locale/C-ctype.h: Likewise. * locale/programs/ld-ctype.c: Compute whether any mapping maps from ASCII to non-ASCII value. Write out that value.
2003-11-12Update.Ulrich Drepper
* posix/regcomp.c (optimize_utf8): New function. (re_compile_fastmap_iter): Use dfa->mb_cur_max > 1 instead of !icase. (re_compile_internal): Call optimize_utf8 if not case insensitive and in UTF-8 locale. * posix/regex_internal.h: Ifdef out some prototypes if RE_NO_INTERNAL_PROTOTYPES is defined to shut up warnings. * posix/Makefile (tests): Add bug-regex20. (bug-regex20-ENV): Add LOCPATH. * posix/bug-regex20.c: New test. 2003-11-12 Jakub Jelinek <jakub@redhat.com>
2003-11-12Update.Ulrich Drepper
2003-11-12 Jakub Jelinek <jakub@redhat.com> * io/ftw.c (NFTW_NEW_NAME, NFTW_OLD_NAME): Add prototypes. 2003-11-12 Jakub Jelinek <jakub@redhat.com> * posix/tst-regex.c (umemlen): New variable. (test_expr): Add expectedicase argument. Test case insensitive searches as well as backwards searches (case sensitive and insensitive) too. (run_test): Add icase argument. Use it to compute regcomp flags. (run_test_backwards): New function. (main): Cast read to size_t to avoid warning. Set umemlen. Add expectedicase arguments to test_expr. * posix/regex_internal.c (re_string_reconstruct): If is_utf8, find previous character by walking back instead of converting all chars from beginning. 2003-11-12 Jakub Jelinek <jakub@redhat.com> * posix/regex_internal.h (struct re_string_t): Add is_utf8 and mb_cur_max fields. (struct re_dfa_t): Likewise. Reorder fields to make structure smaller on 64-bit arches. (re_string_allocate, re_string_construct): Add mb_cur_max and is_utf8 arguments. (re_string_char_size_at, re_string_wchar_at): Use pstr->mb_cur_max instead of MB_CUR_MAX. * posix/regcomp.c (re_compile_fastmap_iter): Use dfa->mb_cur_max instead of MB_CUR_MAX. (re_compile_internal): Pass new arguments to re_string_construct. (init_dfa): Initialize mb_cur_max and is_utf8 fields. (peek_token, peek_token_bracket): Use input->mb_cur_max instead of MB_CUR_MAX. (parse_expression, parse_bracket_exp, parse_charclass_op): Use dfa->mb_cur_max instead of MB_CUR_MAX. * posix/regex_internal.c (re_string_construct_common): Add mb_cur_max and is_utf8 arguments. Initialize fields with them. (re_string_allocate, re_string_construct): Add mb_cur_max and is_utf8 arguments, pass them to re_string_construct_common. Use mb_cur_max instead of MB_CUR_MAX. (re_string_realloc_buffers): Use pstr->mb_cur_max instead of MB_CUR_MAX. (re_string_reconstruct): Likewise. (re_string_context_at): Use input->mb_cur_max instead of MB_CUR_MAX. (create_ci_newstate, create_cd_newstate): Use dfa->mb_cur_max instead of MB_CUR_MAX. * posix/regexec.c (re_search_internal): Likewise. Pass new arguments to re_string_allocate. (check_matching, transit_state_sb): Use dfa->mb_cur_max instead of MB_CUR_MAX. (extend_buffers): Use pstr->mb_cur_max instead of MB_CUR_MAX. 2003-11-12 Jakub Jelinek <jakub@redhat.com> * posix/Makefile (tests): Add bug-regex19. (bug-regex19-ENV): Add LOCPATH. * posix/bug-regex19.c: New test.
2003-11-11(re_string_char_size_at): Don't look beyond valid_len wide chars.Ulrich Drepper
2003-09-23Update.Ulrich Drepper
* posix/regcomp.c (build_word_op): Rename like... (build_charclass_op): ...this. Accept two extra parameters, CLASS_NAME and EXTRA. Add EXTRA to the result, not only _. (peek_token): accept \s and \S as OP_SPACE and OP_NOTSPACE. (parse_expression): replace build_word_op with build_charclass_op, add new arguments, accept OP_SPACE and OP_NOTSPACE. * posix/regex_internal.h (re_token_type_t): Add OP_SPACE and OP_NOTSPACE.
2003-09-23Upate.Ulrich Drepper
2003-09-20 Paolo Bonzini <bonzini@gnu.org> * posix/regcomp.c (peek_token): Don't look back for ( or | to check whether to treat a caret as special. It fails for the (extended) regex \(^. (parse, parse_reg_exp): Pass RE_CARET_ANCHORS_HERE to fetch_token. * posix/regex.h: Define RE_CARET_ANCHORS_HERE. * posix/regexec.c: Check out of bounds value before shifting. * posix/regex_internal.h: Define __attribute for non-gcc.