This version of the file, and the generator script, come from
OpenBSD. The script was written by Andrew Fresh.
The file covers the encodings from Unicode 13.0.0, based on the files
distributed with perl 5.32.1.
(previously 10.0.0):
http://www.unicode.org/charts/
They are classified as PUNCT, which has been used for characters other
than blank, alphabetic, or digit ones.
Glyph widths are taken from "East Asian Width":
https://www.unicode.org/Public/15.0.0/ucd/EastAsianWidth.txt
Characters of width "F" or "W" are classified as SWIDTH2, and all others
as SWIDTH1, as usual. See also:
https://www.unicode.org/reports/tr11/
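The width rule above can be sketched as follows. This is only an
illustration, not the generator script itself: the function name swidth is
hypothetical, and Python's unicodedata module carries whatever Unicode
version the interpreter was built with, which may differ from the
EastAsianWidth.txt revision cited above.

```python
import unicodedata


def swidth(ch: str) -> int:
    """Return the column width class for a single character.

    Characters whose East Asian Width property is "F" (Fullwidth) or
    "W" (Wide) map to 2 (SWIDTH2); everything else maps to 1 (SWIDTH1).
    """
    return 2 if unicodedata.east_asian_width(ch) in ("F", "W") else 1


if __name__ == "__main__":
    # U+0061 LATIN SMALL LETTER A, U+3042 HIRAGANA LETTER A,
    # U+FF21 FULLWIDTH LATIN CAPITAL LETTER A
    for ch in ("a", "\u3042", "\uff21"):
        print("U+%04X" % ord(ch), unicodedata.east_asian_width(ch), swidth(ch))
```

Ambiguous ("A") and neutral ("N") characters fall into SWIDTH1 under this
rule, which matches the "all others" wording above.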
XXX
It would be really nice if someone could check and update characters in
categories other than Emoji...
In the Finnish language, the recommended symbol for euro is the euro sign
where it is available, and the lowercase letter e otherwise.
The ISO currency code EUR is not an abbreviation of the word euro in
the Finnish language, just as FIM is not an abbreviation of the word
markka.
Reference:
https://www.kielikello.fi/-/euro-
Euro
Kielikello 3/1998
Kotimaisten kielten keskus
Institute for the Languages of Finland
[Last retrieved 2020-03-23]
http://www.unicode.org/charts/
They are classified as PUNCT, which is historically used for characters other
than blank, alphabetic, or digit ones.
Glyph widths are taken from "East Asian Width":
https://www.unicode.org/Public/10.0.0/ucd/EastAsianWidth.txt
Characters of width "F" or "W" are classified as SWIDTH2, and all others are
classified as SWIDTH1, as was implicitly done in previous revisions.
Should address problems like PR bin/53323.
Discussed with soda@. We thank Takuya SHIOZAKI (tshiozak@) for useful comments.
from the Unicode Common Locale Data Repository.
Convert non-UTF-8 versions from the UTF-8 version using iconv and some
ad-hoc transliterations using sed.
Use EUR as currency_symbol in ISO8859-1.
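As a rough illustration of the conversion described above (the real work is
done with iconv and sed), the following sketch applies a transliteration
table before encoding to ISO8859-1. The table contents and the function name
to_latin1 are assumptions made for this example; only the euro sign to EUR
mapping is taken from the text.

```python
# Hypothetical transliteration table: characters that have no ISO8859-1
# code point are replaced before encoding. Only the euro entry comes from
# the commit message; a real table would contain more entries.
TRANSLIT = {
    "\u20ac": "EUR",  # EURO SIGN is not representable in ISO8859-1
}


def to_latin1(text: str) -> bytes:
    """Transliterate, then encode to ISO8859-1.

    Raises UnicodeEncodeError if a character is still unrepresentable,
    so missing table entries are noticed instead of silently dropped.
    """
    for src, dst in TRANSLIT.items():
        text = text.replace(src, dst)
    return text.encode("iso8859-1")


if __name__ == "__main__":
    print(to_latin1("currency_symbol \u20ac"))
```

Failing loudly on unmapped characters is a design choice of this sketch; the
ad-hoc sed rules may behave differently.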
Invert the Norwegian handling. no_NO is an alias for nb_NO as the latter
is used e.g. in CLDR.
Provide the Serbian locales in both Cyrillic and Latin script versions.
The alias is chosen based on the character set for the non-UTF-8 case
and Cyrillic is the default for UTF-8.
ok'ed by core and releng.
(thanks to agc@ and snj@, and I'm sorry for the long wait).
[libc]
- localeio.[ch] and lc*.[ch] in src/lib/libc/locale were replaced by
a new locale-db implementation using the citrus_db backend,
see src/lib/libc/citrus/citrus_lc_*.[ch].
- add citrus_bcs_strtou?l.c. don't use the strtou?l implementation
internally, because those functions are locale-aware.
- add some stubs for the multi-locale issue, see {current,global}_locale.c.
- remove some obsolete files: setrunelocale.c, ___runetype_mb.c.
- remove __savectype() from ctypeio.[ch].
[tools]
- mklocale(1): add a new option ``-t'' that generates the new-style
LC_{MONETARY,NUMERIC,TIME,MESSAGES} locale-db format.
- chrtbl(1): added ctypeio.[ch] for __savectype().
[locale-db]
- added en_US.US-ASCII locale.
- removed some shareable locale definition files:
en_US.US-ASCII -> en_US.ISO8859-1, en_US.UTF-8
zh_CN.eucCN -> zh_CN.GB18030
and more...see src/share/locale/*/Makefile.
- remove obsolete locale sr_YU, add new locales sr_ME, sr_RS.
- change locale name ja_JP.ISO2022-JP* -> ja_JP.ISO-2022-JP*
to align with X11's locale.alias file.
- fix regression test, wrong wcs?width(3), NAN/INF usage.
I tested release builds on the following archs:
i386, amd64, hpc{mips,arm,sh}, sparc64, vax.
citrus_lc_*.[ch] can also read the old plain-text style locale-db,
so backward compatibility is kept, but lc*.[ch] can't read the
new citrus_db'ed locale-db and localeio.c never checks sanity,
so forward compatibility is broken ;-<
The old mklocale(1) doesn't know the -t option, so you have to rebuild
the toolchain.
with modifications to the comments and sorted properly for NetBSD.
Additional Makefile work is still needed to generate the links for locales
with shared category files.