NetBSD

christos 6b42622b31 UTF-8 fixes from Ingo Schwarze:

1. Assume that errno is non-zero when entering read_char() and that read(2) returns 0 (indicating end of file). Then, the code will clear errno before returning. (Obviously, the statement "errno = 0" is almost always a bug unless there is save_errno = errno right before it and the previous value is properly restored later, in all reachable code paths.)

2. When encountering an invalid byte sequence, the code discards all following bytes until MB_LEN_MAX overflows; consider, for example, 0xc2 immediately followed by a few valid ASCII bytes. Three of those ASCII bytes will be discarded.

3. On a POSIX system, EILSEQ will always be set after reading a valid (yes, valid, not invalid!) UTF-8 character. The reason is that mbtowc(3) will first be called with a length limit (third argument) of 1, which will fail, return -1, and - on a POSIX system - set errno to EILSEQ.

This third bug is mitigated a bit because i couldn't find any system that actually conforms to POSIX in this respect: None of OpenBSD, NetBSD, FreeBSD, Solaris 11, and glibc set errno when an incomplete character is passed to mbtowc(3), even though that is required by POSIX. Anyway, that mbtowc(3) bug will be fixed at least in OpenBSD after release unlock, so it would be good to fix this bug in libedit before fixing the bug in mbtowc(3).

How can these three bugs be fixed?

1. As far as i understand it, the intention of the bogus errno = 0 is to undo the effects of failing system calls in el_wset(), sig_set(), and read__fixio() if the subsequent read(2) indicates end of file. So, restoring errno has to be moved right after read__fixio(). Of course, neither 0 nor e is the right value to restore: 0 is wrong if errno happened to be set on entry, e would be wrong because if one read(2) fails but a second attempt succeeds after read__fixio(), errno should not be touched. So, the errno to be restored in this case has to be saved before calling read(2) for the first time.

2. Solving the second issue requires distinguishing invalid and incomplete characters, but that is impossible with the function mbtowc(3) because it returns -1 in both cases and sets errno to EILSEQ in both cases (once properly implemented). It is vital that each input character is processed right away. It is not acceptable to wait for the next input character before processing the previous one because this is an interactive library, not a batch system. Consequently, the only situation where it is acceptable to wait for the next byte without first processing the previous one(s) is when the previous one(s) form an incomplete sequence that can be continued to form a valid character. Consequently, short of reimplementing a full UTF-8 state machine by hand, the only correct way forward is to use mbrtowc(3). Even then, care is needed to always have the state object properly initialized before using it, and to not discard a valid ASCII or UTF-8 lead byte if it happens to follow an invalid sequence.

3. Fortunately, solution 2. also solves issue 3. as a side effect, by no longer using mbtowc(3) in the first place.

2016-02-08 17:18:43 +00:00
readline	remove duplicate declaration	2015-06-02 15:36:45 +00:00
TEST	cast gotsig because it is long on some systems.	2014-06-18 20:12:15 +00:00
chared.c	Don't depend on weak aliases to define the vi "alias" expansion function,	2014-06-18 18:12:28 +00:00
chared.h	Don't depend on weak aliases to define the vi "alias" expansion function,	2014-06-18 18:12:28 +00:00
chartype.c	split the allocation functions, their mixed usage was too confusing.	2015-02-22 02:16:19 +00:00
chartype.h	UTF-8 fixes from Ingo Schwarze:	2016-02-08 17:18:43 +00:00
common.c	From Jilles Tjoelker:	2012-03-24 20:08:43 +00:00
config.h	better autoconf results	2011-07-29 20:57:34 +00:00
editline.3	Fix descriptions of el_set functions.	2015-11-03 21:36:59 +00:00
editrc.5	Bump date for previous.	2014-12-25 13:39:41 +00:00
el.c	Only reset the terminal if we have a tty (Boris Ranto)	2015-12-08 12:56:55 +00:00
el.h	pass -Wconversion	2011-07-29 23:44:44 +00:00
eln.c	make el_gets() return the number of characters read in wide mode (not the	2015-05-18 15:07:04 +00:00
emacs.c	KNF return (\1); -> return \1;	2011-07-29 15:16:33 +00:00
filecomplete.c	callers's -> caller's	2014-10-18 15:07:02 +00:00
filecomplete.h
hist.c	KNF return (\1); -> return \1;	2011-07-29 15:16:33 +00:00
hist.h	Whitespace fix (Ingo Schwarze)	2016-01-30 15:05:27 +00:00
histedit.h	Don't depend on weak aliases to define the vi "alias" expansion function,	2014-06-18 18:12:28 +00:00
history.c	Add a history function that takes a FILE pointer; needed for Capsicum.	2014-05-11 01:05:17 +00:00
keymacro.c	re-enable -Wconversion	2011-08-16 16:25:15 +00:00
keymacro.h	One macro is enough (Ingo Schwarze)	2016-01-29 19:59:11 +00:00
Makefile	Disable -Wcast-qual for clang for now.	2015-01-29 20:30:02 +00:00
makelist	Use C89 functions definitions.	2012-03-21 05:34:54 +00:00
map.c	fix warnings on ubuntu 32 bit (Miki Rozloznik)	2015-05-14 10:44:15 +00:00
map.h	Bounds search for reallocated index, from OpenBSD via Andreas Fett	2014-07-06 18:15:34 +00:00
parse.c	Bounds search for reallocated index, from OpenBSD via Andreas Fett	2014-07-06 18:15:34 +00:00
parse.h
prompt.c	KNF return (\1); -> return \1;	2011-07-29 15:16:33 +00:00
prompt.h
read.c	UTF-8 fixes from Ingo Schwarze:	2016-02-08 17:18:43 +00:00
read.h
readline.c	Adjust API to a more modern readline (Ryo Onodera)	2015-06-02 15:35:31 +00:00
refresh.c	pass -Wconversion	2011-07-29 23:44:44 +00:00
refresh.h
search.c	Fix misplaced parentheses (Ingo Schwarze)	2016-01-30 04:02:51 +00:00
search.h
shlib_version	provide an el_init_fd function.	2013-01-22 20:23:21 +00:00
sig.c	kill ptr_t and ioctl_t, add * sizeof(*foo) to all allocations.	2011-07-28 20:50:55 +00:00
sig.h
sys.h	include <wchar.h> if we don't have wcsdup()	2011-09-28 14:08:04 +00:00
terminal.c	don't include both term.h and termcap.h	2012-05-30 18:21:14 +00:00
terminal.h	From: Jilles Tjoelker: Add a mapping for the cursor delete key	2012-03-24 20:09:30 +00:00
tokenizer.c	Fix misplaced parentheses (Ingo Schwarze)	2016-01-30 04:02:51 +00:00
tty.c	unbreak the build	2015-12-08 16:53:27 +00:00
tty.h	more tty modes refactoring, no functional change intended.	2014-05-19 19:54:12 +00:00
vi.c	Use the full buffer for the conversion; ideally we should be dynamically	2015-10-21 21:45:30 +00:00