Commit Graph

78 Commits

Author SHA1 Message Date
wiz 42f5b68c81 Add missing dot in abbreviation. 2023-02-01 20:24:22 +00:00
wiz 4a1e84fe2f Update Unicode tables to 15.0.0.
This is based on the tables provided by perl 5.37.7.
2022-12-21 06:25:35 +00:00
wiz d7bbc7f115 Update Unicode tables.
These tables are for Unicode 14.0.0 using the data provided with
perl 5.36.0.
2022-12-20 23:08:51 +00:00
wiz 9d24b22cba Add NetBSD RCS Id header instead of OpenBSD one. 2022-12-20 23:07:57 +00:00
wiz 17c3d9d43e Update Unicode tables.
This version of the file, and the generator script, come from
OpenBSD. The script was written by Andrew Fresh.

The file covers the encodings from Unicode 13.0.0, based on the files
distributed with perl 5.32.1.
2022-12-20 23:06:08 +00:00
rin 7342bdeab4 Add "Emoji & Pictographs" character definitions from Unicode 15.0.0
(previously 10.0.0):

http://www.unicode.org/charts/

They are classified as PUNCT, which has been used for characters other
than blank, alphabetic, or digit ones.

Glyph widths are taken from "East Asian Width":

https://www.unicode.org/Public/15.0.0/ucd/EastAsianWidth.txt

Characters with width "F" or "W" are classified as SWIDTH2, and all others
as SWIDTH1, as usual. See also:

https://www.unicode.org/reports/tr11/

XXX
It would be really nice if someone could check and update characters in
categories other than Emoji...
2022-10-17 11:20:29 +00:00
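The width rule described in the 7342bdeab4 entry above can be sketched in Python as follows. This is only an illustration and not the procedure used to build the NetBSD tables: the function names `parse_eaw_line` and `swidth_class` are hypothetical, and the sample lines are hand-written in the `code;property # comment` layout of EastAsianWidth.txt.

```python
def parse_eaw_line(line):
    """Parse one EastAsianWidth.txt-style line.

    Returns (first, last, prop) or None for blank/comment-only lines.
    """
    data = line.split("#", 1)[0].strip()   # drop the trailing comment
    if not data:
        return None
    code_range, prop = (field.strip() for field in data.split(";", 1))
    if ".." in code_range:
        first, last = code_range.split("..", 1)
    else:
        first = last = code_range
    return int(first, 16), int(last, 16), prop


def swidth_class(prop):
    """Map an East Asian Width property to the width class named in the log."""
    # "F" (Fullwidth) and "W" (Wide) take two columns; everything else one.
    return "SWIDTH2" if prop in ("F", "W") else "SWIDTH1"


if __name__ == "__main__":
    samples = [
        "0041;Na  # illustrative: LATIN CAPITAL LETTER A",
        "1F600..1F64F;W  # illustrative: emoticons block",
    ]
    for sample in samples:
        first, last, prop = parse_eaw_line(sample)
        print(f"U+{first:04X}..U+{last:04X} {swidth_class(prop)}")
```

The ambiguous ("A") and neutral ("N") properties fall through to SWIDTH1 here, matching the "others" wording of the commit message.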
kim ac8e31be1d Fix the currency symbol for fi_FI.ISO8859-1
In the Finnish language, the recommended symbol for euro is the euro sign
where it is available, and the lowercase letter e otherwise.

The use of the ISO currency code EUR is not an abbreviation of the word
euro in the Finnish language, just like FIM is not an abbreviation of
the word markka.

Reference:
    https://www.kielikello.fi/-/euro-
    Euro
    Kielikello 3/1998
    Kotimaisten kielten keskus
    Institute for the Languages of Finland
    [Last retrieved 2020-03-23]
2020-03-23 13:56:12 +00:00
kim 601620aab2 Add C.UTF-8 2020-03-23 08:44:10 +00:00
rin b232fd18de Add characters in "Emoji & Pictographs" from Unicode 10.0.0:
http://www.unicode.org/charts/

They are classified as PUNCT, which is historically used for characters other
than blank, alphabetic, or digit ones.

Glyph widths are taken from "East Asian Width":
  https://www.unicode.org/Public/10.0.0/ucd/EastAsianWidth.txt
Characters with width "F" or "W" are classified as SWIDTH2, and all others
as SWIDTH1, as was implicitly done in previous revisions.

Should address problems like PR bin/53323.

Discussed with soda@. We thank Takuya SHIOZAKI (tshiozak@) for useful comments.
2018-06-03 07:54:51 +00:00
joerg 6f9c8629a8 Remove duplicate zh entry. 2013-08-19 22:34:41 +00:00
joerg 491bae4a02 Add forgotten conversions of ja_JP for the COMPOUND_TEXT encoding. 2013-08-11 22:13:56 +00:00
joerg d1c1419eb8 Provide UTF-8 variants for all existing locales. The data is derived
from the Unicode Common Locale Data Repository.

Convert non-UTF-8 versions from the UTF-8 version using iconv and some
ad-hoc transliterations using sed.

Use EUR as currency_symbol in ISO8859-1.

Invert the Norwegian handling. no_NO is an alias for nb_NO as the latter
is used e.g. in CLDR.

Provide the Serbian locales in both Cyrillic and Latin script versions.
The alias is chosen based on the character set for the non-UTF-8 case
and Cyrillic is the default for UTF-8.
2013-08-11 22:09:40 +00:00
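The d1c1419eb8 entry above derives the non-UTF-8 locale sources from the UTF-8 ones with iconv plus ad-hoc sed transliterations. A rough Python analogue of that idea is sketched below. The `TRANSLIT` table and the `to_latin1` helper are hypothetical; the only mapping taken from the log is the use of EUR for the euro sign in ISO8859-1, and the actual sed rules are not shown in the log.

```python
# Hypothetical transliteration table; only the EUR mapping comes from the log.
TRANSLIT = {
    "\u20ac": "EUR",   # EURO SIGN has no code point in ISO 8859-1
}


def to_latin1(text):
    """Transliterate characters missing from ISO 8859-1, then encode."""
    for src, dst in TRANSLIT.items():
        text = text.replace(src, dst)
    # Raises UnicodeEncodeError if an unmapped character remains,
    # which signals that another transliteration rule is needed.
    return text.encode("iso-8859-1")


if __name__ == "__main__":
    print(to_latin1("\u20ac"))        # euro sign is replaced before encoding
    print(to_latin1("\u00e9t\u00e9")) # characters present in Latin-1 pass through
```

Failing loudly on unmapped characters, rather than substituting "?", makes it obvious when a locale source needs an extra rule.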
mbalmer 60da905091 Fix a typo: Affrimative -> Affirmative. 2013-06-17 11:05:42 +00:00
tnozaki c264671cd8 fix PR lib/46772 wcwidth of combining characters.
patch provided by yamt@, thanks.
2012-08-08 18:40:37 +00:00
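The c264671cd8 entry above concerns the width that wcwidth reports for combining characters. The log does not show the patch, so the following is only a generic Python illustration of the usual convention, assuming a combining mark occupies no column of its own. The helper names `display_width` and `string_width` are hypothetical, and the check uses Python's `unicodedata` rather than the NetBSD locale tables.

```python
import unicodedata


def display_width(ch):
    """Approximate column width of a single character."""
    # A nonzero canonical combining class marks a combining character.
    # (This is an approximation: some zero-width marks have class 0.)
    if unicodedata.combining(ch):
        return 0
    if unicodedata.east_asian_width(ch) in ("F", "W"):
        return 2
    return 1


def string_width(text):
    """Sum of per-character widths, in the spirit of wcswidth."""
    return sum(display_width(ch) for ch in text)


if __name__ == "__main__":
    # "e" followed by COMBINING ACUTE ACCENT renders in a single column.
    print(string_width("e\u0301"))
```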
martin 64484e6359 German uses dot as thousands separator 2011-03-15 15:30:52 +00:00
bouyer d478d33fa1 Add support for fr_*.UTF-8 locale. Setting LANG to fr_*.UTF-8 won't get
the message catalog right (they're encoded in iso-8859-1), but other locale
functions should work right.
Proposed on tech-userlevel on 20 May 2009.
2009-06-03 18:47:05 +00:00
tnozaki 52ed7b035f Fixes PR lib/39662, shortcomings in LC_{MONETARY,NUMERIC,TIME,MESSAGES} db format.
ok'ed by core and releng.
(thanks to agc@ and snj@, and I'm sorry this took so long).

[libc]
- localeio.[ch] and lc*.[ch] in src/lib/libc/locale were replaced by
  a new locale-db implementation using the citrus_db backend,
  see src/lib/libc/citrus/citrus_lc_*.[ch].
- add citrus_bcs_strtou?l.c. the libc strtou?l implementation is not used
  internally, because those functions are locale-aware.
- add some stubs for multi-locale issue, see {current,global}_locale.c.
- remove some obsolete files, setrunelocale.c, ___runetype_mb.c.
- remove __savectype() from ctypeio.[ch].

[tools]
- mklocale(1): add new option ``-t'' that generates new style
  LC_{MONETARY,NUMERIC,TIME,MESSAGES} locale-db format.
- chrtbl(1): added ctypeio.[ch] for __savectype().

[locale-db]
- added en_US.US-ASCII locale.
- removed some shareable locale definition files:
    en_US.US-ASCII -> en_US.ISO8859-1, en_US.UTF-8
    zh_CN.eucCN -> zh_CN.GB18030
    and more...see src/share/locale/*/Makefile.
- remove obsolete locale sr_YU, added new locales sr_ME, sr_RS.
- change locale name ja_JP.ISO2022-JP* -> ja_JP.ISO-2022-JP*
  for X11's locale.alias file alignments.
- fix regression test, wrong wcs?width(3), NAN/INF usage.

i tested the release build on the following archs:
  i386, amd64, hpc{mips,arm,sh}, sparc64, vax.

citrus_lc_*.[ch] can also read the old plain-text style locale-db,
so backward compatibility is kept, but lc*.[ch] can't read the
new citrus_db'ed locale-db and localeio.c never checks sanity,
so forward compatibility is broken ;-<

old mklocale(1) doesn't know the -t option, so you have to rebuild the toolchain.
2009-01-02 00:20:18 +00:00
apb f46c1de7cb Use ${TOOL_SED} instead of plain sed in Makefiles. 2008-10-25 22:27:34 +00:00
tnozaki 76b2ef13b2 add alias for XFree86 compatibility. 2008-06-21 07:06:01 +00:00
ginsbach 2624ccf202 The hi_IN.ISCII-dev locales shouldn't be installed as there is no support
for this code set in LC_CTYPE nor iconv(3).
2008-06-04 13:19:31 +00:00
ginsbach b750bd80b0 Add some more LC_TYPE aliases. OK'ed by tnozaki. 2008-05-30 03:24:02 +00:00
ginsbach 008c3f646e These are really aliases for zh_CN.eucCN. This was a redundancy that
was incorrectly copied from FreeBSD.  OK'ed by tnozaki.
2008-05-30 03:12:59 +00:00
ginsbach 9a7780f955 Use ${TOOL_SED} instead of sed. 2008-05-24 02:56:55 +00:00
ginsbach 14f5b96735 Add locale category files for LC_TIME. These are sourced from FreeBSD
with modifications to the comments and sorted properly for NetBSD.
Additional Makefile work is still needed to generate the links for locales
with shared category files.
2008-05-17 04:11:29 +00:00
ginsbach 74c7f35ad9 Add locale category files for LC_NUMERIC. These are sourced from FreeBSD
with only modifications to the comments.  Additional Makefile work is
still needed to generate the links for locales with shared category files.
2008-05-17 04:07:29 +00:00
ginsbach ae1920a444 Add locale category files for LC_MONETARY. These are sourced from FreeBSD
with only modifications to the comments.  Additional Makefile work is
still needed to generate the links for locales with shared category files.
2008-05-17 04:05:51 +00:00
ginsbach ad33e5af01 Add locale category files for LC_MESSAGES. These are sourced from FreeBSD
with only modifications to the comments.  Additional Makefile work is
still needed to generate the links for locales with shared category files.
2008-05-17 03:57:50 +00:00
tnozaki aeadbd280f add tr_TR.ISO8859-9 locale. 2007-03-14 15:49:25 +00:00
tnozaki ab9a36c548 add nn_NO (Nynorsk) and nb_NO (Bokmal) locales. 2007-03-08 16:26:26 +00:00
tnozaki 806c2e8ee3 add zh_HK.Big5-HKSCS locale, derived from FreeBSD. 2007-03-06 15:50:45 +00:00
tnozaki 57f0023ef8 catch up KS X 1001:2002: added U+327E - CIRCLED HANGUL IEUNG U. 2006-12-04 15:01:42 +00:00
tnozaki 309c4c3cc7 correct invalid charset mask. 2006-07-16 10:42:26 +00:00
tnozaki 41efa2e2cd 1. added CNS11643 plane 3 <-> UCS iconv data.
2. zh_TW.eucTW locale now supports CNS11643 plane 3 ~ 7
2006-07-16 06:13:29 +00:00
tnozaki 68099f2838 1. make fullwidth space printable (also blank).
suggested by yamt-san.
2. JISX0201 2/1 - 2/5 are not phonograms, change them to punct.
2006-04-11 18:45:03 +00:00
tnozaki 307ce80709 add kk_KZ.PT154 locale and iconv support for PTCP154. 2006-03-28 14:44:00 +00:00
tnozaki 55e54105e0 1. remove duplicated entry(et_EE.UTF-8).
2. add missing LC_MESSAGES alias.
   during cvs diff -r1.2, following locale aliases are introduced:
	+ af_ZA.UTF-8
	+ be_BY.UTF-8
	+ en_NZ.UTF-8
	+ et_EE.UTF-8
	+ eu_ES.UTF-8
	+ pt_BR.UTF-8
	+ ro_RO.UTF-8
	+ sr_YU.UTF-8	<- en_US.UTF-8
	+ zh_CN.GB2312	<- zh_CN.eucCN (for Linux/FreeBSD compatibility)
2006-03-24 11:54:52 +00:00
tnozaki ade0b1e1b5 add following locales:
	af_ZA.ISO8859-1/15
	be_BY.ISO8859-5
	en_NZ.ISO8859-1/15
	et_EE.ISO8859-15
	eu_ES.ISO8859-1/15
	pt_BR.ISO8859-1
	ro_RO.ISO8859-2
	sr_YU.ISO8859-2/5
	uk_UA.CP1251/ISO8859-5
2006-03-23 23:23:38 +00:00
tnozaki 2a441a26bc fix invalid charset-bit. 2006-03-17 16:25:06 +00:00
tnozaki 10269f58df added be_BY.CP1251, ru_BY.CP1251, ru_RU.CP1251 locale.
requested by cheusov AT tut DOT by, thanks.
2006-03-14 16:16:44 +00:00
tnozaki a3b248100e add csmapper:CNS11643-1,2 and esdb:ISO-2022-CN,
integrate esdb:EUC-TW, locale:zh_TW.eucTW.
2005-03-27 22:30:05 +00:00
tshiozak aaa316a46c add tab and several characters to BLANK and SPACE classes on the Latin
charsets to make up for shortages.
pointed out by Joerg Sonnenberger.
2005-03-09 11:54:13 +00:00
tshiozak b115b9ec4f fix XDIGIT problem on several locales,
pointed out by Joerg Sonnenberger on tech-userlevel@.
2005-03-08 06:35:13 +00:00
tnozaki 9cc920cfc1 merge recent FreeBSD UTF-8 locale.
now supports characters outside the BMP.
2005-02-10 18:12:42 +00:00
tnozaki fcff889a4d remove ko_KR.UTF-8.
this locale is now an alias for en_US.UTF-8 via locale.alias.
2005-02-10 18:03:01 +00:00
tnozaki d65a4ead10 fix invalid range. 2005-02-04 18:35:45 +00:00
tnozaki 4b07a87f1d fix invalid range. 2005-01-16 20:15:35 +00:00
ian 70f35b6f77 Remove support for ALIASES in share/locale/ctype/Makefile, which
created symlinks in the filesystem.  Put the one existing alias
(zh_TW.BIG5) into the newer locale.alias file.
2004-09-10 15:12:51 +00:00
tshiozak 3407ed183f add the default locale.alias. 2004-07-21 19:02:17 +00:00
itojun a60209cdf0 for platforms that have problems with C++ comments, switch to good old /* */ 2004-06-24 03:28:50 +00:00
tshiozak 45f22bd0e2 To prevent ctype predicate functions (e.g. isalpha()) from misjudging,
change the charset mask for G2 (JIS X 0201 kana), from 0x80 to 0x800080.
2004-05-13 09:57:03 +00:00