path: root/libidn/stringprep.c
author    Samuel Thibault <samuel.thibault@ens-lyon.org>  2018-12-27 14:12:05 +0000
committer Samuel Thibault <samuel.thibault@ens-lyon.org>  2018-12-27 14:12:05 +0000
commit    963c37d5c0eb62b38f8764b23931c0dcdd497a13
tree      12a521ddf17b3e1bb26594656bbb05903c54afd0
parent    7bb5f8a836b916d6ebf7b6921b136e99cea2442d
parent    3c03baca37fdcb52c3881e653ca392bba7a99c2b
Merge tag 'glibc-2.28' into baseline-2.28

The GNU C Library
=================

The GNU C Library version 2.28 is now available.

The GNU C Library is used as *the* C library in the GNU system and
in GNU/Linux systems, as well as many other systems that use Linux
as the kernel.

The GNU C Library is primarily designed to be a portable
and high performance C library. It follows all relevant
standards including ISO C11 and POSIX.1-2008. It is also
internationalized and has one of the most complete
internationalization interfaces known.

The GNU C Library webpage is at http://www.gnu.org/software/libc/

Packages for the 2.28 release may be downloaded from:
        http://ftpmirror.gnu.org/libc/
        http://ftp.gnu.org/gnu/libc/

The mirror list is at http://www.gnu.org/order/ftp.html

NEWS for version 2.28
=====================

Major new features:

* The localization data for ISO 14651 is updated to match the 2016
  Edition 4 release of the standard, this matches data provided by
  Unicode 9.0.0. This update introduces significant improvements to the
  collation of Unicode characters. This release deviates slightly from
  the standard in that the collation element ordering for lowercase and
  uppercase LATIN script characters is adjusted to ensure that regular
  expressions with ranges like [a-z] and [A-Z] don't interleave e.g. A
  is not matched by [a-z]. With the update many locales have been
  updated to take advantage of the new collation information. The new
  collation information has increased the size of the compiled locale
  archive or binary locales.

* The GNU C Library can now be compiled with support for Intel CET, AKA
  Intel Control-flow Enforcement Technology. When the library is built
  with --enable-cet, the resulting glibc is protected with indirect
  branch tracking (IBT) and shadow stack (SHSTK). CET-enabled glibc is
  compatible with all existing executables and shared libraries. This
  feature is currently supported on i386, x86_64 and x32 with GCC 8 and
  binutils 2.29 or later.
  Note that CET-enabled glibc requires CPUs capable of
  multi-byte NOPs, like x86-64 processors as well as Intel Pentium Pro or
  newer.

  NOTE: --enable-cet has been tested for i686, x86_64 and x32 on non-CET
  processors. --enable-cet has been tested for x86_64 and x32 on CET
  SDVs, but Intel CET support hasn't been validated for i686.

* The GNU C Library now has correct support for ABSOLUTE symbols
  (SHN_ABS-relative symbols). Previously such ABSOLUTE symbols were
  relocated incorrectly or in some cases discarded. The GNU linker can
  make use of the newer semantics, but it must communicate it to the
  dynamic loader by setting the ELF file's identification (EI_ABIVERSION
  field) to indicate such support is required.

* Unicode 11.0.0 Support: Character encoding, character type info, and
  transliteration tables are all updated to Unicode 11.0.0, using
  generator scripts contributed by Mike FABIAN (Red Hat).

* <math.h> functions that round their results to a narrower type are
  added from TS 18661-1:2014 and TS 18661-3:2015:

  - fadd, faddl, daddl and corresponding fMaddfN, fMaddfNx, fMxaddfN and
    fMxaddfNx functions.

  - fsub, fsubl, dsubl and corresponding fMsubfN, fMsubfNx, fMxsubfN and
    fMxsubfNx functions.

  - fmul, fmull, dmull and corresponding fMmulfN, fMmulfNx, fMxmulfN and
    fMxmulfNx functions.

  - fdiv, fdivl, ddivl and corresponding fMdivfN, fMdivfNx, fMxdivfN and
    fMxdivfNx functions.

* Two grammatical forms of month names are now supported for the
  following languages: Armenian, Asturian, Catalan, Czech, Kashubian,
  Occitan, Ossetian, Scottish Gaelic, Upper Sorbian, and Walloon. The
  following languages now support two grammatical forms in abbreviated
  month names: Catalan, Greek, and Kashubian.

* Newly added locales: Lower Sorbian (dsb_DE) and Yakut (sah_RU) also
  include the support for two grammatical forms of month names.

* Building and running on GNU/Hurd systems now works without out-of-tree
  patches.

* The renameat2 function has been added, a variant of the renameat
  function which has a flags argument. If the flags are zero, the
  renameat2 function acts like renameat. If the flag is not zero and
  there is no kernel support for renameat2, the function will fail with
  an errno value of EINVAL. This is different from the existing gnulib
  function renameatu, which performs a plain rename operation in case of
  a RENAME_NOREPLACE flags and a non-existing destination (and therefore
  has a race condition that can clobber the destination inadvertently).

* The statx function has been added, a variant of the fstatat64
  function with an additional flags argument. If there is no direct
  kernel support for statx, glibc provides basic stat support based on
  the fstatat64 function.

* IDN domain names in getaddrinfo and getnameinfo now use the system
  libidn2 library if installed. libidn2 version 2.0.5 or later is
  recommended. If libidn2 is not available, internationalized domain
  names are not encoded or decoded even if the AI_IDN or NI_IDN flags
  are passed to getaddrinfo or getnameinfo. (getaddrinfo calls with
  non-ASCII names and AI_IDN will fail with an encoding error.) Flags
  which used to change the IDN encoding and decoding behavior
  (AI_IDN_ALLOW_UNASSIGNED, AI_IDN_USE_STD3_ASCII_RULES,
  NI_IDN_ALLOW_UNASSIGNED, NI_IDN_USE_STD3_ASCII_RULES) have been
  deprecated. They no longer have any effect.

* Parsing of dynamic string tokens in DT_RPATH, DT_RUNPATH, DT_NEEDED,
  DT_AUXILIARY, and DT_FILTER has been expanded to support the full
  range of ELF gABI expressions including such constructs as
  '$ORIGIN$ORIGIN' (if valid). For SUID/GUID applications the rules
  have been further restricted, and where in the past a dynamic string
  token sequence may have been interpreted as a literal string it will
  now cause a load failure. These load failures were always considered
  unspecified behaviour from the perspective of the dynamic loader, and
  for safety are now load errors e.g.
  /foo/${ORIGIN}.so in DT_NEEDED results in a load failure now.

* Support for ISO C threads (ISO/IEC 9899:2011) has been added. The
  implementation includes all the standard functions provided by
  <threads.h>:

  - thrd_current, thrd_equal, thrd_sleep, thrd_yield, thrd_create,
    thrd_detach, thrd_exit, and thrd_join for thread management.

  - mtx_init, mtx_lock, mtx_timedlock, mtx_trylock, mtx_unlock, and
    mtx_destroy for mutual exclusion.

  - call_once for function call synchronization.

  - cnd_broadcast, cnd_destroy, cnd_init, cnd_signal, cnd_timedwait, and
    cnd_wait for conditional variables.

  - tss_create, tss_delete, tss_get, and tss_set for thread-local
    storage.

  Application developers must link against libpthread to use ISO C
  threads.

Deprecated and removed features, and other changes affecting compatibility:

* The nonstandard header files <libio.h> and <_G_config.h> are no longer
  installed. Software that was using either header should be updated to
  use standard <stdio.h> interfaces instead.

* The stdio functions 'getc' and 'putc' are no longer defined as macros.
  This was never required by the C standard, and the macros just
  expanded to call alternative names for the same functions. If you
  hoped getc and putc would provide performance improvements over fgetc
  and fputc, instead investigate using (f)getc_unlocked and
  (f)putc_unlocked, and, if necessary, flockfile and funlockfile.

* All stdio functions now treat end-of-file as a sticky condition. If
  you read from a file until EOF, and then the file is enlarged by
  another process, you must call clearerr or another function with the
  same effect (e.g. fseek, rewind) before you can read the additional
  data. This corrects a longstanding C99 conformance bug. It is most
  likely to affect programs that use stdio to read interactive input
  from a terminal. (Bug #1190.)

* The macros 'major', 'minor', and 'makedev' are now only available from
  the header <sys/sysmacros.h>; not from <sys/types.h> or various other
  headers that happen to include <sys/types.h>. These macros are rarely
  used, not part of POSIX nor XSI, and their names frequently collide
  with user code; see
  https://sourceware.org/bugzilla/show_bug.cgi?id=19239 for further
  explanation.

  <sys/sysmacros.h> is a GNU extension. Portable programs that require
  these macros should first include <sys/types.h>, and then include
  <sys/sysmacros.h> if __GNU_LIBRARY__ is defined.

* The tilegx*-*-linux-gnu configurations are no longer supported.

* The obsolete function ustat is no longer available to newly linked
  binaries; the headers <ustat.h> and <sys/ustat.h> have been removed.
  This function has been deprecated in favor of fstatfs and statfs.

* The obsolete function nfsservctl is no longer available to newly
  linked binaries. This function was specific to systems using the
  Linux kernel and could not usefully be used with the GNU C Library on
  systems with version 3.1 or later of the Linux kernel.

* The obsolete function name llseek is no longer available to newly
  linked binaries. This function was specific to systems using the
  Linux kernel and was not declared in a header. Programs should use
  the lseek64 name for this function instead.

* The AI_IDN_ALLOW_UNASSIGNED and NI_IDN_ALLOW_UNASSIGNED flags for the
  getaddrinfo and getnameinfo functions have been deprecated. The
  behavior previously selected by them is now always enabled.

* The AI_IDN_USE_STD3_ASCII_RULES and NI_IDN_USE_STD3_ASCII_RULES flags
  for the getaddrinfo and getnameinfo functions have been deprecated.
  The STD3 restriction (rejecting '_' in host names, among other things)
  has been removed, for increased compatibility with non-IDN name
  resolution.

* The fcntl function now have a Long File Support variant named fcntl64.
  It is added to fix some Linux Open File Description (OFD) locks usage
  on non LFS mode.
  As for others *64 functions, fcntl64 semantics are
  analogous with fcntl and LFS support is handled transparently. Also
  for Linux, the OFD locks act as a cancellation entrypoint.

* The obsolete functions encrypt, encrypt_r, setkey, setkey_r,
  cbc_crypt, ecb_crypt, and des_setparity are no longer available to
  newly linked binaries, and the headers <rpc/des_crypt.h> and
  <rpc/rpc_des.h> are no longer installed. These functions encrypted
  and decrypted data with the DES block cipher, which is no longer
  considered secure. Software that still uses these functions should
  switch to a modern cryptography library, such as libgcrypt.

* Reflecting the removal of the encrypt and setkey functions above, the
  macro _XOPEN_CRYPT is no longer defined. As a consequence, the crypt
  function is no longer declared unless _DEFAULT_SOURCE or _GNU_SOURCE
  is enabled.

* The obsolete function fcrypt is no longer available to newly linked
  binaries. It was just another name for the standard function crypt,
  and it has not appeared in any header file in many years.

* We have tentative plans to hand off maintenance of the
  passphrase-hashing library, libcrypt, to a separate development
  project that will, we hope, keep up better with new passphrase-hashing
  algorithms. We will continue to declare 'crypt' in <unistd.h>, and
  programs that use 'crypt' or 'crypt_r' should not need to change at
  all; however, distributions will need to install <crypt.h> and
  libcrypt from a separate project.

  In this release, if the configure option --disable-crypt is used,
  glibc will not install <crypt.h> or libcrypt, making room for the
  separate project's versions of these files. The plan is to make this
  the default behavior in a future release.

Changes to build and runtime requirements:

  GNU make 4.0 or later is now required to build glibc.
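
Editorial illustration (not part of the release announcement): the "sticky end-of-file" entry above can be sketched in C as follows. The helper names read_appended and sticky_eof_demo and the scratch file path are hypothetical, chosen only for this example. The sketch assumes a POSIX-like system where a second stream can read a file while another stream is still appending to it.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not a glibc function): read up to cap - 1 bytes
   that were appended after the stream last reported end-of-file.  */
size_t read_appended(FILE *fp, char *buf, size_t cap)
{
    assert(fp != NULL && buf != NULL && cap > 0);
    clearerr(fp);                      /* drop the sticky EOF indicator */
    size_t n = fread(buf, 1, cap - 1, fp);
    buf[n] = '\0';
    return n;
}

/* Demo: a reader reaches EOF, a writer appends, the reader continues.
   Returns 1 if the appended text was read back after clearerr.  */
int sticky_eof_demo(const char *path)
{
    char buf[16];
    FILE *w = fopen(path, "w");
    if (w == NULL)
        return 0;
    FILE *r = fopen(path, "r");
    if (r == NULL) {
        fclose(w);
        return 0;
    }

    fputs("one", w);
    fflush(w);
    /* Asking for more than is available makes the stream hit EOF.  */
    size_t first = fread(buf, 1, sizeof buf - 1, r);
    int at_eof = feof(r) != 0;

    fputs("two", w);                   /* the file grows after EOF was seen */
    fflush(w);
    size_t second = read_appended(r, buf, sizeof buf);

    int ok = first == 3 && at_eof && second == 3
             && buf[0] == 't' && buf[1] == 'w' && buf[2] == 'o';
    fclose(r);
    fclose(w);
    remove(path);
    return ok;
}
```

Calling clearerr before every retry is harmless on older C libraries as well, so the same code should behave identically before and after this change.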

Security related changes:

  CVE-2016-6261, CVE-2016-6263, CVE-2017-14062: Various vulnerabilities
  have been fixed by removing the glibc-internal IDNA implementation and
  using the system-provided libidn2 library instead. Originally reported
  by Hanno Böck and Christian Weisgerber.

  CVE-2017-18269: An SSE2-based memmove implementation for the i386
  architecture could corrupt memory. Reported by Max Horn.

  CVE-2018-11236: Very long pathname arguments to realpath function could
  result in an integer overflow and buffer overflow. Reported by Alexey
  Izbyshev.

  CVE-2018-11237: The mempcpy implementation for the Intel Xeon Phi
  architecture could write beyond the target buffer, resulting in a
  buffer overflow. Reported by Andreas Schwab.

The following bugs are resolved with this release:

  [1190] stdio: fgetc()/fread() behaviour is not POSIX compliant
  [6889] manual: 'PWD' mentioned but not specified
  [13575] libc: SSIZE_MAX defined as LONG_MAX is inconsistent with
    ssize_t, when __WORDSIZE != 64
  [13762] regex: re_search etc. should return -2 on memory exhaustion
  [13888] build: /tmp usage during testing
  [13932] math: dbl-64 pow unexpectedly slow for some inputs
  [14092] nptl: Support C11 threads
  [14095] localedata: Review / update collation data from Unicode / ISO
    14651
  [14508] libc: -Wformat warnings
  [14553] libc: Namespace pollution loff_t in sys/types.h
  [14890] libc: Make NT_PRFPREG canonical.
  [15105] libc: Extra PLT references with -Os
  [15512] libc: __bswap_constant_16 not compiled when -Werror
    -Wsign-conversion is given
  [16335] manual: Feature test macro documentation incomplete and out of
    date
  [16552] libc: Unify umount implementations in terms of umount2
  [17082] libc: htons et al.: statement-expressions prevent use on global
    scope with -O1 and higher
  [17343] libc: Signed integer overflow in /stdlib/random_r.c
  [17438] localedata: pt_BR: wrong d_fmt delimiter
  [17662] libc: please implement binding for the new renameat2 syscall
  [17721] libc: __restrict defined as /* Ignore */ even in c11
  [17979] libc: inconsistency between uchar.h and stdint.h
  [18018] dynamic-link: Additional $ORIGIN handling issues (CVE-2011-0536)
  [18023] libc: extend_alloca is broken (questionable pointer comparison,
    horrible machine code)
  [18124] libc: hppa: setcontext erroneously returns -1 as exit code for
    last constant.
  [18471] libc: llseek should be a compat symbol
  [18473] soft-fp: [powerpc-nofpu] __sqrtsf2, __sqrtdf2 should be compat
    symbols
  [18991] nss: nss_files skips large entry in database
  [19239] libc: Including stdlib.h ends up with macros major and minor
    being defined
  [19463] libc: linknamespace failures when compiled with -Os
  [19485] localedata: csb_PL: Update month translations + add yesstr/nostr
  [19527] locale: Normalized charset name not recognized by setlocale
  [19667] string: Missing Sanity Check for malloc calls in file
    'testcopy.c'
  [19668] libc: Missing Sanity Check for malloc() in file
    'tst-setcontext-fpscr.c'
  [19728] network: out of bounds stack read in libidn function
    idna_to_ascii_4i (CVE-2016-6261)
  [19729] network: out of bounds heap read on invalid utf-8 inputs in
    stringprep_utf8_nfkc_normalize (CVE-2016-6263)
  [19818] dynamic-link: Absolute (SHN_ABS) symbols incorrectly relocated
    by the base address
  [20079] libc: Add SHT_X86_64_UNWIND to elf.h
  [20251] libc: 32bit programs pass garbage in struct flock for OFD locks
  [20419] dynamic-link: files with large
    allocated notes crash in open_verify
  [20530] libc: bswap_16 should use __builtin_bswap16() when available
  [20890] dynamic-link: ldconfig: fsync the files before atomic rename
  [20980] manual: CFLAGS environment variable replaces vital options
  [21163] regex: Assertion failure in pop_fail_stack when executing a
    malformed regexp (CVE-2015-8985)
  [21234] manual: use of CFLAGS makes glibc detect no optimization
  [21269] dynamic-link: i386 sigaction sa_restorer handling is wrong
  [21313] build: Compile Error GCC 5.4.0 MIPS with -0S
  [21314] build: Compile Error GCC 5.2.0 MIPS with -0s
  [21508] locale: intl/tst-gettext failure with latest msgfmt
  [21547] localedata: Tibetan script collation broken (Dzongkha and
    Tibetan)
  [21812] network: getifaddrs() returns entries with ifa_name == NULL
  [21895] libc: ppc64 setjmp/longjmp not fully interoperable with static
    dlopen
  [21942] dynamic-link: _dl_dst_substitute incorrectly handles $ORIGIN:
    with AT_SECURE=1
  [22241] localedata: New locale: Yakut (Sakha) locale for Russia (sah_RU)
  [22247] network: Integer overflow in the decode_digit function in
    puny_decode.c in libidn (CVE-2017-14062)
  [22342] nscd: NSCD not properly caching netgroup
  [22391] nptl: Signal function clear NPTL internal symbols inconsistently
  [22550] localedata: es_ES locale (and other es_* locales): collation
    should treat ñ as a primary different character, sync the collation
    for Spanish with CLDR
  [22638] dynamic-link: sparc: static binaries are broken if glibc is
    built by gcc configured with --enable-default-pie
  [22639] time: year 2039 bug for localtime etc.
    on 64-bit platforms
  [22644] string: memmove-sse2-unaligned on 32bit x86 produces garbage
    when crossing 2GB threshold (CVE-2017-18269)
  [22646] localedata: redundant data (LC_TIME) for es_CL, es_CU, es_EC and
    es_BO
  [22735] time: Misleading typo in time.h source comment regarding
    CLOCKS_PER_SECOND
  [22753] libc: preadv2/pwritev2 fallback code should handle offset=-1
  [22761] libc: No trailing `%n' conversion specifier in FMT passed from
    `__assert_perror_fail ()' to `__assert_fail_base ()'
  [22766] libc: all glibc internal dlopen should use RTLD_NOW for robust
    dlopen failures
  [22786] libc: Stack buffer overflow in realpath() if input size is close
    to SSIZE_MAX (CVE-2018-11236)
  [22787] dynamic-link: _dl_check_caller returns false when libc is linked
    through an absolute DT_NEEDED path
  [22792] build: tcb-offsets.h dependency dropped
  [22797] libc: pkey_get() uses non-reserved name of argument
  [22807] libc: PTRACE_* constants missing for powerpc
  [22818] glob: posix/tst-glob_lstat_compat failure on alpha
  [22827] dynamic-link: RISC-V ELF64 parser mis-reads flag in ldconfig
  [22830] malloc: malloc_stats doesn't restore cancellation state on
    stderr
  [22848] localedata: ca_ES: update date definitions from CLDR
  [22862] build: _DEFAULT_SOURCE is defined even when _ISOC11_SOURCE is
  [22884] math: RISCV fmax/fmin handle signalling NANs incorrectly
  [22896] localedata: Update locale data for an_ES
  [22902] math: float128 test failures with GCC 8
  [22918] libc: multiple common of `__nss_shadow_database'
  [22919] libc: sparc32: backtrace yields infinite backtrace with
    makecontext
  [22926] libc: FTBFS on powerpcspe
  [22932] localedata: lt_LT: Update of abbreviated month names from CLDR
    required
  [22937] localedata: Greek (el_GR, el_CY) locales actually need
    ab_alt_mon
  [22947] libc: FAIL: misc/tst-preadvwritev2
  [22963] localedata: cs_CZ: Add alternative month names
  [22987] math: [powerpc/sparc] fdim inlines errno, exceptions handling
  [22996] localedata: change LC_PAPER to en_US in es_BO locale
  [22998] dynamic-link: execstack tests are disabled when SELinux is
    disabled
  [23005] network: Crash in __res_context_send after memory allocation
    failure
  [23007] math: strtod cannot handle -nan
  [23024] nss: getlogin_r is performing NSS lookups when loginid isn't set
  [23036] regex: regex equivalence class regression
  [23037] libc: initialize msg_flags to zero for sendmmsg() calls
  [23069] libc: sigaction broken on riscv64-linux-gnu
  [23094] localedata: hr_HR: wrong thousands_sep and mon_thousands_sep
  [23102] dynamic-link: Incorrect parsing of multiple consecutive
    $variable patterns in runpath entries (e.g. $ORIGIN$ORIGIN)
  [23137] nptl: s390: pthread_join sometimes block indefinitely (on 31bit
    and libc build with -Os)
  [23140] localedata: More languages need two forms of month names
  [23145] libc: _init/_fini aren't marked as hidden
  [23152] localedata: gd_GB: Fix typo in "May" (abbreviated)
  [23171] math: C++ iseqsig for long double converts arguments to double
  [23178] nscd: sudo will fail when it is run in concurrent with commands
    that changes /etc/passwd
  [23196] string: __mempcpy_avx512_no_vzeroupper mishandles large copies
    (CVE-2018-11237)
  [23206] dynamic-link: static-pie + dlopen breaks debugger interaction
  [23208] localedata: New locale - Lower Sorbian (dsb)
  [23233] regex: Memory leak in build_charclass_op function in file
    posix/regcomp.c
  [23236] stdio: Harden function pointers in _IO_str_fields
  [23250] nptl: Offset of __private_ss differs from GCC
  [23253] math: tgamma test suite failures on i686 with -march=x86-64
    -mtune=generic -mfpmath=sse
  [23259] dynamic-link: Unsubstituted ${ORIGIN} remains in DT_NEEDED for
    AT_SECURE
  [23264] libc: posix_spawnp wrongly executes ENOEXEC in non compat mode
  [23266] nis: stringop-truncation warning with new gcc8.1 in
    nisplus-parser.c
  [23272] math: fma(INFINITY,INFIITY,0.0) should be INFINITY
  [23277] math: nan function should not have const attribute
  [23279] math: scanf and strtod wrong for some hex floating-point
  [23280] math: wscanf
    rounds wrong; wcstod is ok for negative numbers and directed rounding
  [23290] localedata: IBM273 is not equivalent to ISO-8859-1
  [23303] build: undefined reference to symbol
    '__parse_hwcap_and_convert_at_platform@@GLIBC_2.23'
  [23307] dynamic-link: Absolute symbols whose value is zero ignored in
    lookup
  [23313] stdio: libio vtables validation and standard file object
    interposition
  [23329] libc: The __libc_freeres infrastructure is not properly run
    across DSO boundaries.
  [23349] libc: Various glibc headers no longer compatible with
    <linux/time.h>
  [23351] malloc: Remove unused code related to heap dumps and malloc
    checking
  [23363] stdio: stdio-common/tst-printf.c has non-free license
  [23396] regex: Regex equivalence regression in single-byte locales
  [23422] localedata: oc_FR: More updates of locale data
  [23442] build: New warning with GCC 8
  [23448] libc: Out of bounds access in IBM-1390 converter
  [23456] libc: Wrong index_cpu_LZCNT
  [23458] build: tst-get-cpu-features-static isn't added to tests
  [23459] libc: COMMON_CPUID_INDEX_80000001 isn't populated for Intel
    processors
  [23467] dynamic-link: x86/CET: A property note parser bug

Release Notes
=============

https://sourceware.org/glibc/wiki/Release/2.28

Contributors
============

This release was made possible by the contributions of many people.
The maintainers are grateful to everyone who has contributed
changes or bug reports. These include:

Adhemerval Zanella
Agustina Arzille
Alan Modra
Alexandre Oliva
Amit Pawar
Andreas Schwab
Andrew Senkevich
Andrew Waterman
Aurelien Jarno
Carlos O'Donell
Chung-Lin Tang
DJ Delorie
Daniel Alvarez
David Michael
Dmitry V. Levin
Dragan Stanojevic - Nevidljivi
Florian Weimer
Flávio Cruz
Francois Goichon
Gabriel F. T. Gomes
H.J. Lu
Herman ten Brugge
Hongbo Zhang
Igor Gnatenko
Jesse Hathaway
John David Anglin
Joseph Myers
Leonardo Sandoval
Maciej W. Rozycki
Mark Wielaard
Martin Sebor
Michael Wolf
Mike FABIAN
Patrick McGehearty
Patsy Franklin
Paul Pluzhnikov
Quentin PAGÈS
Rafal Luzynski
Rajalakshmi Srinivasaraghavan
Raymond Nicholson
Rical Jasan
Richard Braun
Robert Buj
Rogerio Alves
Samuel Thibault
Sean McKean
Siddhesh Poyarekar
Stefan Liebler
Steve Ellcey
Sylvain Lesage
Szabolcs Nagy
Thomas Schwinge
Tulio Magno Quites Machado Filho
Valery Timiriliyev
Vincent Chen
Wilco Dijkstra
Zack Weinberg
Zong Li
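
Editorial illustration (not part of the commit message): the NEWS text above recommends that portable programs include <sys/types.h> first and then <sys/sysmacros.h> only when __GNU_LIBRARY__ is defined. A minimal sketch of that include pattern follows. The function name devnum_roundtrips is hypothetical, and the sketch assumes it is built against glibc or another C library that provides major, minor, and makedev through one of these two headers.

```c
#include <sys/types.h>
#ifdef __GNU_LIBRARY__
/* GNU extension: the only header providing these macros as of 2.28.  */
# include <sys/sysmacros.h>
#endif

/* Hypothetical helper: pack a (major, minor) pair into a dev_t and
   check that unpacking returns the same pair.  */
int devnum_roundtrips(unsigned int maj, unsigned int min)
{
    dev_t dev = makedev(maj, min);
    return major(dev) == maj && minor(dev) == min;
}
```

Code that relied on <sys/types.h> alone to supply these macros will likely fail to build against this release until the second include is added.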
Diffstat (limited to 'libidn/stringprep.c')
-rw-r--r-- libidn/stringprep.c 668
1 files changed, 0 insertions, 668 deletions
diff --git a/libidn/stringprep.c b/libidn/stringprep.c
deleted file mode 100644
index 72a502e5a3..0000000000
--- a/libidn/stringprep.c
+++ /dev/null
@@ -1,668 +0,0 @@
-/* stringprep.c --- Core stringprep implementation.
- * Copyright (C) 2002, 2003, 2004 Simon Josefsson
- *
- * This file is part of GNU Libidn.
- *
- * GNU Libidn is free software; you can redistribute it and/or
- * modify it under the terms of the GNU Lesser General Public
- * License as published by the Free Software Foundation; either
- * version 2.1 of the License, or (at your option) any later version.
- *
- * GNU Libidn is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- * Lesser General Public License for more details.
- *
- * You should have received a copy of the GNU Lesser General Public
- * License along with GNU Libidn; if not, see <http://www.gnu.org/licenses/>.
- */
-
-#if HAVE_CONFIG_H
-# include "config.h"
-#endif
-
-#include <stdlib.h>
-#include <string.h>
-#include <stdint.h>
-
-#include "stringprep.h"
-
-static ssize_t
-stringprep_find_character_in_table (uint32_t ucs4,
- const Stringprep_table_element * table)
-{
- ssize_t i;
-
- /* This is where typical uses of Libidn spends very close to all CPU
- time and causes most cache misses. One could easily do a binary
- search instead. Before rewriting this, I want hard evidence this
- slowness is at all relevant in typical applications. (I don't
- dispute optimization may improve matters significantly, I'm
- mostly interested in having someone give real-world benchmark on
- the impact of libidn.) */
-
- for (i = 0; table[i].start || table[i].end; i++)
- if (ucs4 >= table[i].start &&
- ucs4 <= (table[i].end ? table[i].end : table[i].start))
- return i;
-
- return -1;
-}
-
-static ssize_t
-stringprep_find_string_in_table (uint32_t * ucs4,
- size_t ucs4len,
- size_t * tablepos,
- const Stringprep_table_element * table)
-{
- size_t j;
- ssize_t pos;
-
- for (j = 0; j < ucs4len; j++)
- if ((pos = stringprep_find_character_in_table (ucs4[j], table)) != -1)
- {
- if (tablepos)
- *tablepos = pos;
- return j;
- }
-
- return -1;
-}
-
-static int
-stringprep_apply_table_to_string (uint32_t * ucs4,
- size_t * ucs4len,
- size_t maxucs4len,
- const Stringprep_table_element * table)
-{
- ssize_t pos;
- size_t i, maplen;
-
- while ((pos = stringprep_find_string_in_table (ucs4, *ucs4len,
- &i, table)) != -1)
- {
- for (maplen = STRINGPREP_MAX_MAP_CHARS;
- maplen > 0 && table[i].map[maplen - 1] == 0; maplen--)
- ;
-
- if (*ucs4len - 1 + maplen >= maxucs4len)
- return STRINGPREP_TOO_SMALL_BUFFER;
-
- memmove (&ucs4[pos + maplen], &ucs4[pos + 1],
- sizeof (uint32_t) * (*ucs4len - pos - 1));
- memcpy (&ucs4[pos], table[i].map, sizeof (uint32_t) * maplen);
- *ucs4len = *ucs4len - 1 + maplen;
- }
-
- return STRINGPREP_OK;
-}
-
-#define INVERTED(x) ((x) & ((~0UL) >> 1))
-#define UNAPPLICAPLEFLAGS(flags, profileflags) \
- ((!INVERTED(profileflags) && !(profileflags & flags) && profileflags) || \
- ( INVERTED(profileflags) && (profileflags & flags)))
-
-/**
- * stringprep_4i:
- * @ucs4: input/output array with string to prepare.
- * @len: on input, length of input array with Unicode code points,
- * on exit, length of output array with Unicode code points.
- * @maxucs4len: maximum length of input/output array.
- * @flags: stringprep profile flags, or 0.
- * @profile: pointer to stringprep profile to use.
- *
- * Prepare the input UCS-4 string according to the stringprep profile,
- * and write back the result to the input string.
- *
- * The input is not required to be zero terminated (@ucs4[@len] = 0).
- * The output will not be zero terminated unless @ucs4[@len] = 0.
- * Instead, see stringprep_4zi() if your input is zero terminated or
- * if you want the output to be.
- *
- * Since the stringprep operation can expand the string, @maxucs4len
- * indicate how large the buffer holding the string is. This function
- * will not read or write to code points outside that size.
- *
- * The @flags are one of Stringprep_profile_flags, or 0.
- *
- * The @profile contain the instructions to perform. Your application
- * can define new profiles, possibly re-using the generic stringprep
- * tables that always will be part of the library, or use one of the
- * currently supported profiles.
- *
- * Return value: Returns %STRINGPREP_OK iff successful, or an error code.
- **/
-int
-stringprep_4i (uint32_t * ucs4, size_t * len, size_t maxucs4len,
- Stringprep_profile_flags flags,
- const Stringprep_profile * profile)
-{
- size_t i, j;
- ssize_t k;
- size_t ucs4len = *len;
- int rc;
-
- for (i = 0; profile[i].operation; i++)
- {
- switch (profile[i].operation)
- {
- case STRINGPREP_NFKC:
- {
- uint32_t *q = 0;
-
- if (UNAPPLICAPLEFLAGS (flags, profile[i].flags))
- break;
-
- if (flags & STRINGPREP_NO_NFKC && !profile[i].flags)
- /* Profile requires NFKC, but callee asked for no NFKC. */
- return STRINGPREP_FLAG_ERROR;
-
- q = stringprep_ucs4_nfkc_normalize (ucs4, ucs4len);
- if (!q)
- return STRINGPREP_NFKC_FAILED;
-
- for (ucs4len = 0; q[ucs4len]; ucs4len++)
- ;
-
- if (ucs4len >= maxucs4len)
- {
- free (q);
- return STRINGPREP_TOO_SMALL_BUFFER;
- }
-
- memcpy (ucs4, q, ucs4len * sizeof (ucs4[0]));
-
- free (q);
- }
- break;
-
- case STRINGPREP_PROHIBIT_TABLE:
- k = stringprep_find_string_in_table (ucs4, ucs4len,
- NULL, profile[i].table);
- if (k != -1)
- return STRINGPREP_CONTAINS_PROHIBITED;
- break;
-
- case STRINGPREP_UNASSIGNED_TABLE:
- if (UNAPPLICAPLEFLAGS (flags, profile[i].flags))
- break;
- if (flags & STRINGPREP_NO_UNASSIGNED)
- {
- k = stringprep_find_string_in_table
- (ucs4, ucs4len, NULL, profile[i].table);
- if (k != -1)
- return STRINGPREP_CONTAINS_UNASSIGNED;
- }
- break;
-
- case STRINGPREP_MAP_TABLE:
- if (UNAPPLICAPLEFLAGS (flags, profile[i].flags))
- break;
- rc = stringprep_apply_table_to_string
- (ucs4, &ucs4len, maxucs4len, profile[i].table);
- if (rc != STRINGPREP_OK)
- return rc;
- break;
-
- case STRINGPREP_BIDI_PROHIBIT_TABLE:
- case STRINGPREP_BIDI_RAL_TABLE:
- case STRINGPREP_BIDI_L_TABLE:
- break;
-
- case STRINGPREP_BIDI:
- {
- int done_prohibited = 0;
- int done_ral = 0;
- int done_l = 0;
- int contains_ral = -1;
- int contains_l = -1;
-
- for (j = 0; profile[j].operation; j++)
- if (profile[j].operation == STRINGPREP_BIDI_PROHIBIT_TABLE)
- {
- done_prohibited = 1;
- k = stringprep_find_string_in_table (ucs4, ucs4len,
- NULL,
- profile[j].table);
- if (k != -1)
- return STRINGPREP_BIDI_CONTAINS_PROHIBITED;
- }
- else if (profile[j].operation == STRINGPREP_BIDI_RAL_TABLE)
- {
- done_ral = 1;
- if (stringprep_find_string_in_table
- (ucs4, ucs4len, NULL, profile[j].table) != -1)
- contains_ral = j;
- }
- else if (profile[j].operation == STRINGPREP_BIDI_L_TABLE)
- {
- done_l = 1;
- if (stringprep_find_string_in_table
- (ucs4, ucs4len, NULL, profile[j].table) != -1)
- contains_l = j;
- }
-
- if (!done_prohibited || !done_ral || !done_l)
- return STRINGPREP_PROFILE_ERROR;
-
- if (contains_ral != -1 && contains_l != -1)
- return STRINGPREP_BIDI_BOTH_L_AND_RAL;
-
- if (contains_ral != -1)
- {
- if (!(stringprep_find_character_in_table
- (ucs4[0], profile[contains_ral].table) != -1 &&
- stringprep_find_character_in_table
- (ucs4[ucs4len - 1], profile[contains_ral].table) != -1))
- return STRINGPREP_BIDI_LEADTRAIL_NOT_RAL;
- }
- }
- break;
-
- default:
- return STRINGPREP_PROFILE_ERROR;
- break;
- }
- }
-
- *len = ucs4len;
-
- return STRINGPREP_OK;
-}
-
-static int
-stringprep_4zi_1 (uint32_t * ucs4, size_t ucs4len, size_t maxucs4len,
- Stringprep_profile_flags flags,
- const Stringprep_profile * profile)
-{
- int rc;
-
- rc = stringprep_4i (ucs4, &ucs4len, maxucs4len, flags, profile);
- if (rc != STRINGPREP_OK)
- return rc;
-
- if (ucs4len >= maxucs4len)
- return STRINGPREP_TOO_SMALL_BUFFER;
-
- ucs4[ucs4len] = 0;
-
- return STRINGPREP_OK;
-}
-
-/**
- * stringprep_4zi:
- * @ucs4: input/output array with zero terminated string to prepare.
- * @maxucs4len: maximum length of input/output array.
- * @flags: stringprep profile flags, or 0.
- * @profile: pointer to stringprep profile to use.
- *
- * Prepare the input zero terminated UCS-4 string according to the
- * stringprep profile, and write back the result to the input string.
- *
- * Since the stringprep operation can expand the string, @maxucs4len
- * indicate how large the buffer holding the string is. This function
- * will not read or write to code points outside that size.
- *
- * The @flags are one of Stringprep_profile_flags, or 0.
- *
- * The @profile contain the instructions to perform. Your application
- * can define new profiles, possibly re-using the generic stringprep
- * tables that always will be part of the library, or use one of the
- * currently supported profiles.
- *
- * Return value: Returns %STRINGPREP_OK iff successful, or an error code.
- **/
-int
-stringprep_4zi (uint32_t * ucs4, size_t maxucs4len,
- Stringprep_profile_flags flags,
- const Stringprep_profile * profile)
-{
- size_t ucs4len;
-
- for (ucs4len = 0; ucs4len < maxucs4len && ucs4[ucs4len] != 0; ucs4len++)
- ;
-
- return stringprep_4zi_1 (ucs4, ucs4len, maxucs4len, flags, profile);
-}
-
-/**
- * stringprep:
- * @in: input/output array with string to prepare.
- * @maxlen: maximum length of input/output array.
- * @flags: stringprep profile flags, or 0.
- * @profile: pointer to stringprep profile to use.
- *
- * Prepare the input zero terminated UTF-8 string according to the
- * stringprep profile, and write back the result to the input string.
- *
- * Note that you must convert strings entered in the system's locale
- * into UTF-8 before using this function, see
- * stringprep_locale_to_utf8().
- *
- * Since the stringprep operation can expand the string, @maxlen
- * indicates how large the buffer holding the string is. This function
- * will not read or write to characters outside that size.
- *
- * The @flags are one of Stringprep_profile_flags, or 0.
- *
- * The @profile contains the instructions to perform. Your application
- * can define new profiles, possibly re-using the generic stringprep
- * tables that always will be part of the library, or use one of the
- * currently supported profiles.
- *
- * Return value: Returns %STRINGPREP_OK iff successful, or an error code.
- **/
-int
-stringprep (char *in,
- size_t maxlen,
- Stringprep_profile_flags flags,
- const Stringprep_profile * profile)
-{
- int rc;
- char *utf8 = NULL;
- uint32_t *ucs4 = NULL;
- size_t ucs4len, maxucs4len, adducs4len = 50;
-
- do
- {
- free (ucs4);
- ucs4 = stringprep_utf8_to_ucs4 (in, -1, &ucs4len);
- maxucs4len = ucs4len + adducs4len;
- uint32_t *newp = realloc (ucs4, maxucs4len * sizeof (uint32_t));
- if (!newp)
- {
- free (ucs4);
- return STRINGPREP_MALLOC_ERROR;
- }
- ucs4 = newp;
-
- rc = stringprep_4i (ucs4, &ucs4len, maxucs4len, flags, profile);
- adducs4len += 50;
- }
- while (rc == STRINGPREP_TOO_SMALL_BUFFER);
- if (rc != STRINGPREP_OK)
- {
- free (ucs4);
- return rc;
- }
-
- utf8 = stringprep_ucs4_to_utf8 (ucs4, ucs4len, 0, 0);
- free (ucs4);
- if (!utf8)
- return STRINGPREP_MALLOC_ERROR;
-
- if (strlen (utf8) >= maxlen)
- {
- free (utf8);
- return STRINGPREP_TOO_SMALL_BUFFER;
- }
-
- strcpy (in, utf8); /* flawfinder: ignore */
-
- free (utf8);
-
- return STRINGPREP_OK;
-}
-
-/**
- * stringprep_profile:
- * @in: input array with UTF-8 string to prepare.
- * @out: output variable with pointer to newly allocated string.
- * @profile: name of stringprep profile to use.
- * @flags: stringprep profile flags, or 0.
- *
- * Prepare the input zero terminated UTF-8 string according to the
- * stringprep profile, and return the result in a newly allocated
- * variable.
- *
- * Note that you must convert strings entered in the system's locale
- * into UTF-8 before using this function, see
- * stringprep_locale_to_utf8().
- *
- * The output @out variable must be deallocated by the caller.
- *
- * The @flags are one of Stringprep_profile_flags, or 0.
- *
- * The @profile specifies the name of the stringprep profile to use.
- * It must be one of the internally supported stringprep profiles.
- *
- * Return value: Returns %STRINGPREP_OK iff successful, or an error code.
- **/
-int
-stringprep_profile (const char *in,
- char **out,
- const char *profile, Stringprep_profile_flags flags)
-{
- const Stringprep_profiles *p;
- char *str = NULL;
- size_t len = strlen (in) + 1;
- int rc;
-
- for (p = &stringprep_profiles[0]; p->name; p++)
- if (strcmp (p->name, profile) == 0)
- break;
-
- if (!p || !p->name || !p->tables)
- return STRINGPREP_UNKNOWN_PROFILE;
-
- do
- {
- free (str);
- str = (char *) malloc (len);
- if (str == NULL)
- return STRINGPREP_MALLOC_ERROR;
-
- strcpy (str, in);
-
- rc = stringprep (str, len, flags, p->tables);
- len += 50;
- }
- while (rc == STRINGPREP_TOO_SMALL_BUFFER);
-
- if (rc == STRINGPREP_OK)
- *out = str;
- else
- free (str);
-
- return rc;
-}
-
-/*! \mainpage GNU Internationalized Domain Name Library
- *
- * \section intro Introduction
- *
- * GNU Libidn is an implementation of the Stringprep, Punycode and IDNA
- * specifications defined by the IETF Internationalized Domain Names
- * (IDN) working group, used for internationalized domain names. The
- * package is available under the GNU Lesser General Public License.
- *
- * The library contains a generic Stringprep implementation that does
- * Unicode 3.2 NFKC normalization, mapping and prohibition of
- * characters, and bidirectional character handling. Profiles for
- * Nameprep, iSCSI, SASL and XMPP are included. Punycode and ASCII
- * Compatible Encoding (ACE) via IDNA are supported. A mechanism to
- * define Top-Level Domain (TLD) specific validation tables, and to
- * compare strings against those tables, is included. Default tables
- * for some TLDs are also included.
- *
- * The Stringprep API consists of two main functions, one for
- * converting data from the system's native representation into UTF-8,
- * and one function to perform the Stringprep processing. Adding a
- * new Stringprep profile for your application within the API is
- * straightforward. The Punycode API consists of one encoding
- * function and one decoding function. The IDNA API consists of the
- * ToASCII and ToUnicode functions, as well as a high-level interface
- * for converting entire domain names to and from the ACE encoded
- * form. The TLD API consists of one set of functions to extract the
- * TLD name from a domain string, one set of functions to locate the
- * proper TLD table to use based on the TLD name, and core functions
- * to validate a string against a TLD table, and some utility wrappers
- * to perform all the steps in one call.
- *
- * The library is used by, e.g., GNU SASL and Shishi to process user
- * names and passwords. Libidn can be built into GNU Libc to enable a
- * new system-wide getaddrinfo() flag for IDN processing.
- *
- * Libidn is developed for the GNU/Linux system, but runs on over 20 Unix
- * platforms (including Solaris, IRIX, AIX, and Tru64) and Windows.
- * Libidn is written in C and (parts of) the API is accessible from C,
- * C++, Emacs Lisp, Python and Java.
- *
- * The project web page:\n
- * http://www.gnu.org/software/libidn/
- *
- * The software archive:\n
- * ftp://alpha.gnu.org/pub/gnu/libidn/
- *
- * For more information see:\n
- * http://www.ietf.org/html.charters/idn-charter.html\n
- * http://www.ietf.org/rfc/rfc3454.txt (stringprep specification)\n
- * http://www.ietf.org/rfc/rfc3490.txt (idna specification)\n
- * http://www.ietf.org/rfc/rfc3491.txt (nameprep specification)\n
- * http://www.ietf.org/rfc/rfc3492.txt (punycode specification)\n
- * http://www.ietf.org/internet-drafts/draft-ietf-ips-iscsi-string-prep-04.txt\n
- * http://www.ietf.org/internet-drafts/draft-ietf-krb-wg-utf8-profile-01.txt\n
- * http://www.ietf.org/internet-drafts/draft-ietf-sasl-anon-00.txt\n
- * http://www.ietf.org/internet-drafts/draft-ietf-sasl-saslprep-00.txt\n
- * http://www.ietf.org/internet-drafts/draft-ietf-xmpp-nodeprep-01.txt\n
- * http://www.ietf.org/internet-drafts/draft-ietf-xmpp-resourceprep-01.txt\n
- *
- * Further information and paid contract development:\n
- * Simon Josefsson <simon@josefsson.org>
- *
- * \section examples Examples
- *
- * \include example.c
- * \include example3.c
- * \include example4.c
- * \include example5.c
- */
-
-/**
- * STRINGPREP_VERSION
- *
- * String defined via CPP denoting the header file version number.
- * Used together with stringprep_check_version() to verify header file
- * and run-time library consistency.
- */
-
-/**
- * STRINGPREP_MAX_MAP_CHARS
- *
- * Maximum number of code points that can replace a single code point,
- * during stringprep mapping.
- */
-
-/**
- * Stringprep_rc:
- * @STRINGPREP_OK: Successful operation. This value is guaranteed to
- * always be zero, the remaining ones are only guaranteed to hold
- * non-zero values, for logical comparison purposes.
- * @STRINGPREP_CONTAINS_UNASSIGNED: String contains unassigned Unicode
- * code points, which is forbidden by the profile.
- * @STRINGPREP_CONTAINS_PROHIBITED: String contains code points
- * prohibited by the profile.
- * @STRINGPREP_BIDI_BOTH_L_AND_RAL: String contains code points with
- * conflicting bidirectional category.
- * @STRINGPREP_BIDI_LEADTRAIL_NOT_RAL: Leading and trailing character
- * in string not of proper bidirectional category.
- * @STRINGPREP_BIDI_CONTAINS_PROHIBITED: Contains prohibited code
- * points detected by bidirectional code.
- * @STRINGPREP_TOO_SMALL_BUFFER: Buffer handed to function was too
- * small. This usually indicates a problem in the calling
- * application.
- * @STRINGPREP_PROFILE_ERROR: The stringprep profile was inconsistent.
- * This usually indicates an internal error in the library.
- * @STRINGPREP_FLAG_ERROR: The supplied flag conflicted with profile.
- * This usually indicates a problem in the calling application.
- * @STRINGPREP_UNKNOWN_PROFILE: The supplied profile name was not
- * known to the library.
- * @STRINGPREP_NFKC_FAILED: The Unicode NFKC operation failed. This
- * usually indicates an internal error in the library.
- * @STRINGPREP_MALLOC_ERROR: The malloc() was out of memory. This is
- * usually a fatal error.
- *
- * Enumerated return codes of stringprep(), stringprep_profile()
- * functions (and macros using those functions). The value 0 is
- * guaranteed to always correspond to success.
- */
-
-/**
- * Stringprep_profile_flags:
- * @STRINGPREP_NO_NFKC: Disable the NFKC normalization, as well as
- * selecting the non-NFKC case folding tables. Usually the profile
- * specifies BIDI and NFKC settings, and applications should not
- * override it unless in special situations.
- * @STRINGPREP_NO_BIDI: Disable the BIDI step. Usually the profile
- * specifies BIDI and NFKC settings, and applications should not
- * override it unless in special situations.
- * @STRINGPREP_NO_UNASSIGNED: Make the library return with an error if
- * string contains unassigned characters according to profile.
- *
- * Stringprep profile flags.
- */
-
-/**
- * Stringprep_profile_steps:
- *
- * Various steps in the stringprep algorithm. You really want to
- * study the source code to understand this one. Only useful if you
- * want to add another profile.
- */
-
-/**
- * stringprep_nameprep:
- * @in: input/output array with string to prepare.
- * @maxlen: maximum length of input/output array.
- *
- * Prepare the input UTF-8 string according to the nameprep profile.
- * The AllowUnassigned flag is true, use
- * stringprep_nameprep_no_unassigned() if you want a false
- * AllowUnassigned. Returns 0 iff successful, or an error code.
- **/
-
-/**
- * stringprep_nameprep_no_unassigned:
- * @in: input/output array with string to prepare.
- * @maxlen: maximum length of input/output array.
- *
- * Prepare the input UTF-8 string according to the nameprep profile.
- * The AllowUnassigned flag is false, use stringprep_nameprep() for
- * true AllowUnassigned. Returns 0 iff successful, or an error code.
- **/
-
-/**
- * stringprep_iscsi:
- * @in: input/output array with string to prepare.
- * @maxlen: maximum length of input/output array.
- *
- * Prepare the input UTF-8 string according to the draft iSCSI
- * stringprep profile. Returns 0 iff successful, or an error code.
- **/
-
-/**
- * stringprep_plain:
- * @in: input/output array with string to prepare.
- * @maxlen: maximum length of input/output array.
- *
- * Prepare the input UTF-8 string according to the draft SASL
- * ANONYMOUS profile. Returns 0 iff successful, or an error code.
- **/
-
-/**
- * stringprep_xmpp_nodeprep:
- * @in: input/output array with string to prepare.
- * @maxlen: maximum length of input/output array.
- *
- * Prepare the input UTF-8 string according to the draft XMPP node
- * identifier profile. Returns 0 iff successful, or an error code.
- **/
-
-/**
- * stringprep_xmpp_resourceprep:
- * @in: input/output array with string to prepare.
- * @maxlen: maximum length of input/output array.
- *
- * Prepare the input UTF-8 string according to the draft XMPP resource
- * identifier profile. Returns 0 iff successful, or an error code.
- **/
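
The retry loop in the removed stringprep_profile() above (allocate a buffer, call the in-place routine, grow by 50 bytes while it reports STRINGPREP_TOO_SMALL_BUFFER) can be sketched without libidn roughly as follows. This is only an illustration under stated assumptions, not the library's code: expand_in_place() is a hypothetical stand-in for stringprep() that simply doubles every byte, and the PREP_* codes and prep_alloc() name are invented for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical return codes for this sketch (not libidn's Stringprep_rc). */
enum { PREP_OK = 0, PREP_TOO_SMALL = 1, PREP_NOMEM = 2 };

/* Hypothetical stand-in for stringprep(): rewrites BUF in place so that
   every byte appears twice.  Reports PREP_TOO_SMALL when the result plus
   its terminator would not fit in MAXLEN bytes.  */
static int
expand_in_place (char *buf, size_t maxlen)
{
  size_t n = strlen (buf);

  if (2 * n + 1 > maxlen)
    return PREP_TOO_SMALL;

  /* Work backwards so that no unread input byte is overwritten.  */
  buf[2 * n] = '\0';
  for (size_t i = n; i-- > 0;)
    {
      buf[2 * i + 1] = buf[i];
      buf[2 * i] = buf[i];
    }
  return PREP_OK;
}

/* Grow-and-retry wrapper, mirroring the shape of the loop in
   stringprep_profile(): start with a buffer just large enough for the
   input, and add 50 bytes each time the in-place routine says the
   buffer was too small.  On success *OUT owns the result.  */
static int
prep_alloc (const char *in, char **out)
{
  char *str = NULL;
  size_t len;
  int rc;

  assert (in != NULL && out != NULL);
  len = strlen (in) + 1;

  do
    {
      free (str);                /* free (NULL) is a no-op on the first pass */
      str = malloc (len);
      if (str == NULL)
        return PREP_NOMEM;

      strcpy (str, in);          /* always fits: len >= strlen (in) + 1 */
      rc = expand_in_place (str, len);
      len += 50;
    }
  while (rc == PREP_TOO_SMALL);

  if (rc == PREP_OK)
    *out = str;
  else
    free (str);

  return rc;
}
```

With the input "abc", the first pass fails with a 4-byte buffer, the second pass succeeds with 54 bytes, and *out holds "aabbcc"; the caller is responsible for freeing it, just as with the @out argument of stringprep_profile().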