NetBSD/usr.bin/mklocale
martin 31fbf26310 Pull up following revision(s) (requested by rin in ticket #538):
	usr.bin/mklocale/yacc.y: revision 1.35
	usr.bin/mklocale/yacc.y: revision 1.36
	usr.bin/mklocale/mklocale.1: revision 1.18
	usr.bin/mklocale/mklocale.1: revision 1.19

mklocale: XXX: Neglect TODIGIT at the moment
PR lib/57798

It was implemented with the assumption that all digit characters
can be mapped to numerical values <= 255.
This is no longer true for Unicode, and results in, e.g., wrong
return values of wcwidth(3) for U+5146 or U+16B60.

As a workaround, neglect TODIGIT for now, as done for OpenBSD:
https://github.com/OpenBSD/src/commit/4efe9bdeb34
XXX

At least netbsd-10 should be fixed, but it requires some tests.


mklocale(1): Add range check for TODIGIT, rather than disabling it
PR lib/57798

The digit value specified by TODIGIT is stored in the lowest 8 bits of
_RuneType, see lib/libc/locale/runetype_file.h:
https://nxr.netbsd.org/xref/src/lib/libc/locale/runetype_file.h#56

The symptom reported in the PR is due to a missing range check for
this value; values of 256 and above were mistakenly treated as
other flag bits in _RuneType.

For example, U+5146 has numerical value 1,000,000,000,000 ==
0xe8d4a51000 where __BITS(30, 31) == _RUNETYPE_SW3 are turned on.

This is why wcwidth(3) returned 3 for this character.
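
The arithmetic above can be checked with a small standalone sketch. It is
not the mklocale code: the macro and function names are local to this
example, and the layout (digit value in bits 0-7, width in bits 30-31) is
assumed from the description above. It models an unchecked OR of the digit
value into a 32-bit type word.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumed layout, following the commit message: display width lives in
 * bits 30-31 of the 32-bit type word. Names are local to this sketch.
 */
#define SKETCH_WIDTH_MASK	UINT32_C(0xc0000000)
#define SKETCH_WIDTH_SHIFT	30

/* Models the unchecked behaviour: OR the digit value into the type word. */
static uint32_t
sketch_or_digit_unchecked(uint32_t type, uint64_t digit)
{
	/* Truncates to 32 bits; nothing keeps the value within bits 0-7. */
	return type | (uint32_t)digit;
}

/* Extracts the width field from a type word. */
static unsigned
sketch_width(uint32_t type)
{
	return (unsigned)((type & SKETCH_WIDTH_MASK) >> SKETCH_WIDTH_SHIFT);
}

/* U+5146: digit value 10^12 == 0xe8d4a51000. */
static void
sketch_self_check(void)
{
	uint32_t t = sketch_or_digit_unchecked(0, UINT64_C(0xe8d4a51000));

	assert(t == UINT32_C(0xd4a51000));	/* low 32 bits of 10^12 */
	assert(sketch_width(t) == 3);		/* bits 30 and 31 are both set */
}
```

Calling sketch_self_check() passes: the low 32 bits of 10^12 are 0xd4a51000,
whose top two bits are both set, so the width field reads 3.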

This apparently affected not only character width, but also other
attributes stored in _RuneType.

IIUC, digit value attributes in _RuneType have never been utilized
until now, but preserve these if digit fits within (0, 256). This
should be safer for pulling this up into netbsd-10. Also, these
attributes may be useful to implement some I18N features as
suggested by uwe@ in the PR.
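
A minimal sketch of such a range check follows. It is an illustration
under assumptions, not the change made to yacc.y: the names are
hypothetical, and out-of-range values are assumed to be dropped silently
so that the remaining flag bits stay intact.

```c
#include <assert.h>
#include <stdint.h>

/* Digit values must fit in the lowest 8 bits (assumed limit: 256). */
#define CHECKED_DIGIT_LIMIT	UINT64_C(256)

/* OR the digit value into the type word only when it fits in 8 bits. */
static uint32_t
checked_or_digit(uint32_t type, uint64_t digit)
{
	if (digit >= CHECKED_DIGIT_LIMIT)
		return type;	/* out of range: leave other flag bits alone */
	return type | (uint32_t)digit;
}

static void
checked_self_check(void)
{
	/* A small digit value is preserved in the low bits. */
	assert(checked_or_digit(0, 7) == 7);
	/* 10^12 (U+5146) no longer leaks into the upper flag bits. */
	assert(checked_or_digit(0, UINT64_C(0xe8d4a51000)) == 0);
}
```

A digit value of 0 sets no bits either way, so whether the lower bound is
treated as inclusive does not change the resulting type word.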

netbsd-[98] are not affected, as they use old UTF-8 ctype definitions.
2024-01-14 15:15:00 +00:00
Makefile
ldef.h
lex.l
mklocale.1 Pull up following revision(s) (requested by rin in ticket #538): 2024-01-14 15:15:00 +00:00
mklocaledb.c
yacc.y Pull up following revision(s) (requested by rin in ticket #538): 2024-01-14 15:15:00 +00:00