NetBSD/lib
christos 0e1288d7c8 From Ingo Schwarze:
If CHARSET_IS_UTF8 is not set, read_char() is broken in a large
number of ways:

 1. The isascii(3) check can yield false positives.  If a string in
    an arbitrary encoding contains a byte in the range 0..127,
    that does not at all imply that it forms a character all by
    itself, and even less that it represents the same character
    as in ASCII.  Consequently, read_char() may return characters
    the user never typed.
    Even if the encoding is not state-dependent, the assumption that
    bytes in the range 0..127 represent ASCII characters is broken.
    Consider UTF-16, for example.

 2. The reverse problem can also occur.  In an arbitrary encoding,
    there is no guarantee that a character that can be represented
    by ASCII is represented by a seven-bit byte, and even less by
    the same byte as in ASCII.
    Even for single-byte encodings, these assumptions are broken.
    Consider the ISO 646 national variants, for example.
    Consequently, the current code is insufficient to keep ASCII
    characters working even for single-byte encodings.

 3. The condition "++cbp != 1" can never trigger (because initially,
    cbp is 0, and the code can only go back up via the final goto,
    which has another cbp = 0 right before it) and it has no effect
    (because cbp isn't used afterwards).

 4. bytes = ct_mbtowc(cp, cbuf, cbp) is broken.  If this returns -1,
    the code assumes that it can just call mbtowc(3) again for later
    input bytes.  In some implementations, that may even be broken
    for state-independent encodings, but trying again after mbtowc(3)
    failure certainly produces completely erratic and meaningless
    results in state-dependent encodings.

 5. The assignment "*cp = (Char)(unsigned char)cbuf[0]" is
    completely bogus.  Even if the byte cbuf[0] represents a
    character all by itself, which it usually will not, whether
    or not the cast produces the desired result depends on the
    internal representation of wchar_t in the C library, which
    the application program can know nothing about.  Even for ASCII
    in the C/POSIX locale, an ASCII character other than '\0' ==
    L'\0' == 0 need not have the same numeric value as a char and
    as a wchar_t.
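
As an illustration of point 1, here is a minimal sketch, not taken from
libedit.  The helper name count_seven_bit_bytes() and the sample buffer are
made up for this example.  It applies the same seven-bit test that
isascii(3) performs to the UTF-16LE encoding of U+20AC EURO SIGN.  One of
the two bytes passes the test and would be mistaken for an ASCII space,
although the user typed no space.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: count the bytes in buf[0..len-1] that fall in
 * the range 0..127, i.e. the bytes an isascii(3) check would accept.
 */
static size_t
count_seven_bit_bytes(const unsigned char *buf, size_t len)
{
	size_t i, hits = 0;

	assert(buf != NULL);
	for (i = 0; i < len; i++)
		if ((buf[i] & ~0x7f) == 0)
			hits++;
	return hits;
}

/* U+20AC EURO SIGN in UTF-16LE: low byte 0xAC first, then high byte 0x20. */
static const unsigned char euro_utf16le[2] = { 0xAC, 0x20 };
```

Calling count_seven_bit_bytes(euro_utf16le, 2) reports one accepted byte,
namely 0x20, which is the ASCII code for a space but here is only half of
a euro sign.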

To summarize, this code only works if all of the following
conditions hold:

 - The encoding is a single-byte encoding.
 - ASCII is a subset of the encoding.
 - The implementation of mbtowc(3) in the C library does not
   require re-initialization after encoding errors.
 - The implementation of wchar_t in the C library uses the
   same numerical values as ASCII.

Otherwise, it silently produces wrong results.

The simplest way to fix this is to just use the same code as for
UTF-8 (right above).  Of course, that causes functional changes,
but that shouldn't matter since the current behaviour is undefined.

The patch below provides the following improvements:

 - It works for all stateless single-byte encodings, no matter
   whether they are somehow related to ASCII, no matter how
   mb[r]towc(3) are internally implemented, and no matter how
   wchar_t is internally represented.
 - Instead of producing unpredictable and definitely wrong
   results for non-UTF-8 multibyte characters, it behaves in
   a well-defined way: It aborts input processing, sets errno,
   and returns failure.
   Note that short of providing full support for arbitrary locales,
   it is impossible to do better.  We cannot know whether a given
   unsupported locale is state-dependent, and for a state-dependent
   locale, it makes no sense to retry parsing after an encoding
   error, so the best we can do is abort processing for *any*
   unsupported multi-byte character.
 - Note that single-byte characters in arbitrary state-independent
   locales still work, even in locales that may potentially also
   contain multibyte characters, as long as those don't occur in
   input.  I'm not sure whether any such locales exist in practice...
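
The "abort, set errno, return failure" behaviour described above can be
sketched roughly as follows.  This is not the actual patch.  The function
name decode_one() and its return convention are assumptions made for this
example.  The only library call it relies on is the standard mbrtowc(3),
always started from the initial conversion state.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper: try to decode exactly one character from
 * buf[0..len-1] in the current locale.
 * Returns the number of bytes consumed (> 0) on success, 0 if more
 * input bytes are needed, and -1 with errno set to EILSEQ on an
 * encoding error.  After -1 the caller must stop processing input
 * instead of retrying with later bytes.
 */
static int
decode_one(const char *buf, size_t len, wchar_t *out)
{
	mbstate_t st;
	size_t n;

	assert(buf != NULL && out != NULL);
	if (len == 0)
		return 0;		/* nothing to decode yet */
	memset(&st, 0, sizeof(st));	/* start from the initial state */
	n = mbrtowc(out, buf, len, &st);
	if (n == (size_t)-1) {
		errno = EILSEQ;		/* fail; do not try to resync */
		return -1;
	}
	if (n == (size_t)-2)
		return 0;		/* incomplete: read another byte */
	if (n == 0)
		return 1;		/* a NUL byte decodes to L'\0' */
	return (int)n;
}
```

A caller would append one input byte at a time to its buffer and call
decode_one() until it returns a positive count or -1.  Because nothing in
this sketch inspects the byte values themselves, it makes no assumption
about how the encoding relates to ASCII or how wchar_t is represented.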

Tested with UTF-8 and C/POSIX on OpenBSD.  Also tested that in the
C/POSIX locale, non-ASCII bytes get through unmangled.  You may
wish to test with ISO-LATIN on NetBSD if NetBSD supports that.

----
Also use a constant for meta to avoid warnings.
2016-02-12 15:11:09 +00:00
csu Undo previous; the lossage is more basic. 2016-01-24 16:47:32 +00:00
i18n_module
libarch Define _KERNTYPES for things that need it. 2016-01-23 21:22:45 +00:00
libbluetooth correct comment in literal section 2016-01-22 08:51:40 +00:00
libbpfjit
libbsdmalloc
libbz2 Reorg docs, part 1: 2014-07-05 19:22:41 +00:00
libc Avoid shadowing global. 2016-02-06 19:33:07 +00:00
libc_vfp Possibly build libc_vfp if MACHINE_CPU is aarch64 too. 2015-07-08 01:08:24 +00:00
libcompat PR/50711: David Binderman: Fix memory leak on error 2016-01-26 16:05:18 +00:00
libcrypt fix error messages 2015-06-17 00:15:26 +00:00
libcurses Clear the "forced" flag after updating a line, otherwise we'll always do 2016-01-10 08:11:06 +00:00
libdm The actual header file for these functions is dm.h, not libdm.h. 2016-01-22 22:12:40 +00:00
libedit From Ingo Schwarze: 2016-02-12 15:11:09 +00:00
libexecinfo Fix typo, from FreeBSD. 2015-12-26 10:34:36 +00:00
libform Counting from 0 to n-1 can go wrong badly, if n is unsigned and zero and 2015-12-11 21:22:57 +00:00
libintl back to the defines (fixing a typo -- extra 'g') 2015-06-08 15:04:20 +00:00
libipsec
libisns
libkern Define _KERNTYPES for things that need it. 2016-01-23 21:22:45 +00:00
libkvm mips needs _KMEMUSER for label_t in pcb.h 2016-01-24 16:07:48 +00:00
liblwres
libm Fix incorrect magic numbers in scaling. From FreeBSD commit 23397, by 2016-01-24 20:34:30 +00:00
libmenu
libnpf - Change LDADD/DPADD in library dependencies to LIBDPLIBS 2016-01-05 13:07:46 +00:00
libossaudio Add missing defines for 16, 24 and 32 bit NE and OE formats. 2014-09-09 10:45:18 +00:00
libp2k Don't include <rump/rumpvnode_if.h> from rump.h. It's not needed 2016-01-25 11:45:57 +00:00
libpam Adapt to the new API. 2015-04-04 02:51:10 +00:00
libpanel Specify path of a local internal header of libpanel 2015-11-22 04:30:33 +00:00
libpci unsigned -> unsigned int 2016-01-23 07:21:18 +00:00
libperfuse Define _KERNTYPES for things that need it. 2016-01-23 21:22:45 +00:00
libpmc pmc_evid_, pmc_ctr_t etc are defined in <machine/types.h> but are not exposed 2016-01-23 21:44:55 +00:00
libposix MKCOMPAT fixes for when compat MACHINE_CPU != normal MACHINE_CPU 2014-08-10 23:25:49 +00:00
libppath
libprop
libpthread Fix PTHREAD_FOO_INITIALIZER for C++ by not using volatile in the relevant 2015-08-27 12:30:50 +00:00
libpthread_dbg don't use kernel types. 2016-01-23 14:02:21 +00:00
libpuffs Define _KERNTYPES for things that need it. 2016-01-23 21:22:45 +00:00
libquota Some NFS servers return RPC_PROGNOTREGISTERED instead of RPC_PROGVERSMISMATCH 2016-01-30 16:31:28 +00:00
libradius
librefuse Define _KERNTYPES for things that need it. 2016-01-23 21:22:45 +00:00
libresolv src is too big these days to tolerate superfluous apostrophes. It's 2014-10-18 08:33:23 +00:00
librmt
librpcsvc remove __P 2013-12-20 21:04:09 +00:00
librt Bump date for previous. 2015-11-19 07:03:13 +00:00
librump This is not needed anymore. 2015-08-21 06:56:35 +00:00
librumpclient Define _KERNTYPES for things that need it. 2016-01-23 21:22:45 +00:00
librumpdev
librumphijack Define _KERNTYPES for things that need it. 2016-01-23 21:22:45 +00:00
librumpnet
librumpuser Move librumpuser compile-time options into the librumpuser source 2016-01-25 00:24:23 +00:00
librumpvfs Move rump kernel man pages from various sources to sys/rump 2014-11-09 17:39:37 +00:00
libskey Uses FILE *, needs stdio.h. 2016-01-22 23:25:51 +00:00
libss
libtelnet Avoid enum type mismatch. 2014-04-26 22:10:40 +00:00
libterminfo Always copy the area buffer, even when the length was the same 2015-11-26 01:03:22 +00:00
libukfs Don't include <rump/rumpvnode_if.h> from rump.h. It's not needed 2016-01-25 11:45:57 +00:00
libusbhid Uses __BEGIN_DECLS so needs sys/cdefs.h; also needs stdint.h. 2016-01-22 23:51:23 +00:00
libutil prefer <sys/cpu.h> instead of <machine/cpu.h> 2016-01-25 18:14:04 +00:00
libwrap these are syslog-like 2015-10-14 15:54:21 +00:00
liby
libz Merge riastradh-drm2 to HEAD. 2014-03-18 18:20:35 +00:00
lua lua: updated from 5.3 work3 to 5.3.0 2015-02-02 14:03:05 +00:00
npf
bumpversion
checkoldver
checkver
checkvers
Makefile use EXTERNAL_BINUTILS_SUBDIR 2016-01-26 17:47:35 +00:00
Makefile.inc