Commit Graph

80 Commits

Author SHA1 Message Date
joerg
939ab48f97 Actually return something deterministic 2011-05-23 14:45:44 +00:00
joerg
e325eedcc1 Remove tautology. 2010-12-07 22:01:45 +00:00
joerg
aaf356760f Mark function as static and give it an explicit return type. 2010-12-07 22:01:22 +00:00
tnozaki
56bf19aaea fix byte order mark related bug introduced by previous commit,
reported by Sverre Froyen via current-user, thanks!
2010-03-20 18:15:32 +00:00
tnozaki
36a8b8869c 1. fix wrong byte order mark of utf-16, reported by NARUSE Yui -san.
patch provided by tshiozak@ -san.

2. don't eat 0xfeff/0xfffe if they don't appear at the start of the byte stream.
noticed by tshiozak@ -san, patch provided by me.

thanks a lot.
2010-03-15 15:00:58 +00:00
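
Point 2 of the commit above (treat 0xfeff/0xfffe as a byte order mark only at the very start of the stream) can be illustrated with a minimal sketch. This is not the citrus code: `utf16_check_bom`, `enum byte_order`, and the `at_start` flag are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical byte-order tag, not taken from the citrus sources. */
enum byte_order { BO_UNKNOWN, BO_BIG, BO_LITTLE };

/*
 * Return the number of bytes to skip as a BOM (0 or 2).
 * A BOM is only recognized when at_start is non-zero; anywhere else the
 * same bytes are ordinary data and must be passed through untouched.
 * *order is updated only when a BOM is actually consumed.
 */
static size_t
utf16_check_bom(const unsigned char *buf, size_t len, int at_start,
    enum byte_order *order)
{
	assert(buf != NULL && order != NULL);

	if (!at_start || len < 2)
		return 0;
	if (buf[0] == 0xFE && buf[1] == 0xFF) {
		*order = BO_BIG;
		return 2;
	}
	if (buf[0] == 0xFF && buf[1] == 0xFE) {
		*order = BO_LITTLE;
		return 2;
	}
	return 0;
}
```

A caller would pass `at_start = 1` for the first read of a conversion and `0` afterwards, so a later U+FEFF survives as a regular character.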
christos
461a86f9bd merge christos-time_t 2009-01-11 02:45:45 +00:00
tnozaki
52ed7b035f Fixes PR lib/39662, shortcomings in LC_{MONETARY,NUMERIC,TIME,MESSAGES} db format.
ok'ed by core and releng.
(thanks to agc@ and snj@, and i'm sorry for testing your patience for so long).

[libc]
- localeio.[ch] and lc*.[ch] in src/lib/libc/locale were replaced by a
  new locale-db implementation using the citrus_db backend,
  see src/lib/libc/citrus/citrus_lc_*.[ch].
- add citrus_bcs_strtou?l.c. don't use the regular strtou?l implementations
  internally, because they're locale-aware functions.
- add some stubs for multi-locale issue, see {current,global}_locale.c.
- remove some obsolete files: setrunelocale.c, ___runetype_mb.c.
- remove __savectype() from ctypeio.[ch].

[tools]
- mklocale(1): add new option ``-t'' that generates new style
  LC_{MONETARY,NUMERIC,TIME,MESSAGES} locale-db format.
- chrtbl(1): added ctypeio.[ch] for __savectype().

[locale-db]
- added en_US.US-ASCII locale.
- removed some shareable locale definition files:
    en_US.US-ASCII -> en_US.ISO8859-1, en_US.UTF-8
    zh_CN.eucCN -> zh_CN.GB18030
    and more...see src/share/locale/*/Makefile.
- remove obsoleted locale sr_YU, added new locale sr_ME, sr_RS.
- change locale name ja_JP.ISO2022-JP* -> ja_JP.ISO-2022-JP*
  for X11's locale.alias file alignments.
- fix regression test, wrong wcs?width(3), NAN/INF usage.

i tested release builds on the following archs:
  i386, amd64, hpc{mips,arm,sh}, sparc64, vax.

citrus_lc_*.[ch] can also read the old plain-text style locale-db,
so backward compatibility is kept, but lc*.[ch] can't read the
new citrus_db'ed locale-db and localeio.c never checks sanity,
so forward compatibility is broken ;-<

old mklocale(1) doesn't know -t option, so you have to rebuild toolchain.
2009-01-02 00:20:18 +00:00
tnozaki
e1ee662664 remove unused include, locale.h. 2008-06-14 16:01:07 +00:00
tnozaki
7ed5b48246 add BOM to utf-16/32 stream in case that endian is not specified,
like other iconv implementations: GNU libiconv, glibc2 iconv, perl iconv.
2008-03-20 11:47:45 +00:00
tnozaki
fca38949e4 fix lib/37290
- don't call abort(3) when there's no suitable charset found.
- use iso-8859-1(or INIT1 if specified) for C1 control char.
2007-11-21 14:19:32 +00:00
tnozaki
561e0bd51b remove invalid range check. 2007-10-23 15:28:25 +00:00
tnozaki
0941b12b16 lib/36938 mbtowc misbehaving after invalid char sequence
- make sure to initialize mbtowc's internal state.
- add regression test.
2007-09-18 15:12:07 +00:00
tnozaki
6a1c27dd91 fix typo. 2007-04-24 15:42:08 +00:00
tnozaki
fd2dd8ec0d add new encoding support to iconv(3):
- RISCOS-LATIN1
- DEC-MCS
- DEC-HANYU(libDECHanyu)
2007-04-01 18:52:28 +00:00
tnozaki
9f260693ac disallow illegal utf-8 byte sequences and surrogate chars (RFC3629).
5-6 byte sequences (0x110000 - 0x7FFFFFFF) are still available
for backward compatibility.
2007-03-06 16:13:58 +00:00
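
The kind of check the commit above describes can be sketched as follows. This is an illustration under stated assumptions, not the citrus UTF-8 module: `utf8_min_len`, `utf8_value_ok`, and the `allow_legacy` flag are hypothetical names, and "illegal byte sequence" is narrowed here to overlong (non-shortest) encodings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: length in bytes of the shortest encoding of cp
 * under the original 1..6 byte UTF-8 scheme.
 */
static size_t
utf8_min_len(uint32_t cp)
{
	assert(cp <= 0x7FFFFFFFu);

	if (cp < 0x80u)
		return 1;
	if (cp < 0x800u)
		return 2;
	if (cp < 0x10000u)
		return 3;
	if (cp < 0x200000u)
		return 4;
	if (cp < 0x4000000u)
		return 5;
	return 6;
}

/*
 * Return 1 if a value decoded from a seqlen-byte sequence is acceptable.
 * Surrogates (0xD800 - 0xDFFF) and overlong encodings are always rejected.
 * Values above 0x10FFFF are accepted only when allow_legacy is non-zero,
 * mirroring the backward-compatibility note in the commit message.
 */
static int
utf8_value_ok(uint32_t cp, size_t seqlen, int allow_legacy)
{
	if (cp > 0x7FFFFFFFu)
		return 0;
	if (cp >= 0xD800u && cp <= 0xDFFFu)
		return 0;
	if (cp > 0x10FFFFu && !allow_legacy)
		return 0;
	return seqlen == utf8_min_len(cp);
}
```

With `allow_legacy` set to 0 the check matches RFC 3629's range; setting it to 1 keeps the older, wider range reachable.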
tnozaki
1bf1d71e3c iconv: add following CCS/CES support.
- CNS11643-[3-7] <-> UCS:BMP/SIP (EUC-TW, ISO-2022-CN-EXT)
- HKSCS <-> UCS:BMP/SIP (Big5-HKSCS)
- JISX0213-[1-2] <-> UCS:BMP/SIP (EUC-JIS-2004,Shift_JIS-2004,ISO-2022-JP-2004)
2007-03-05 16:57:06 +00:00
tnozaki
c61eef3da4 make del(\x7f) pass through. 2006-12-13 16:16:56 +00:00
tnozaki
3fb79e8260 don't read input string more than MB_LEN_MAX(maybe redundant escape sequence). 2006-11-24 17:27:52 +00:00
tnozaki
663e0dad61 don't throw EILSEQ when byte sequence is "zW ". 2006-11-24 16:52:20 +00:00
tnozaki
8033a5b008 1. add iconv support for following encodings:
Chinese Simplified
        HZ, HZ8 - 7/8bit stateful encoding, see RFC1842,1843. (libHZ)
        zW      - 7bit stateful encoding, see RFC1842. (libZW)
2. add citrus_prop.[ch] - parser for encoding module's init parameter strings.
2006-11-22 23:38:25 +00:00
tnozaki
5bda830543 fix memory leak. 2006-11-22 20:11:03 +00:00
dogcow
cfe7a78c9c change uint32_t to size_t; fixes build issue on 64-bit platforms. 2006-11-14 02:55:34 +00:00
tnozaki
74fca02cf1 avoid infinite loop: iso2022 module's stdenc_get_state_desc_generic()
never returned _STDENC_SDGEN_INITIAL.
2006-11-13 19:08:19 +00:00
tnozaki
79a70a823d 1. add iconv support for following encodings:
Vietnamese
	TCVN	8bit Viet Nam National Standard
	VISCII	8bit RFC1456
	VIQR	7bit RFC1456(libVIQR)
    Unicode Escape (GNU libiconv compatibility)
	C99, JAVA (libUES)
2. fix iconv_std module:
	add special treatment for POSIX Defect Report #288 case.
2006-11-13 15:16:28 +00:00
tnozaki
1a00f7afa4 don't pass through surrogate character(0xD800 - 0xDFFF). 2006-10-27 14:13:55 +00:00
tnozaki
2e2fc44e22 add new iconv module libJOHAB,
this supports S.Korean character encoding scheme ``JOHAB''.
2006-10-18 17:54:54 +00:00
tnozaki
8316f5b826 correct typo in _DIAGASSERT() and some KNF.
pointed out by uebayashi-san, thanks!
2006-09-11 13:06:33 +00:00
tnozaki
48d386f61a mapper_std iconv module and mkcsmapper(1) now can treat
plain-row-col charset and 4byte code(like GB18030) as SRC_ZONE.
2006-09-09 14:35:17 +00:00
tnozaki
f264ea3a01 cleanup code 2006-08-23 12:57:24 +00:00
tnozaki
2bcfe3b4c8 added Chinese Traditional Big5 family,
Big5-2003, Big5-ETen, Big5-IBM, Big-5E, Big-5+.

``Big5'' is now the alias of Big5-ETen,
if you want Unicode.org's obsolete mappings, use Big5-IBM instead.
2006-06-19 17:28:24 +00:00
tnozaki
b29e60b31d if INIT0 specified, use it instead of ASCII. 2006-06-07 16:28:34 +00:00
christos
c8780d3168 Coverity CID 1440: off by one in array count. 2006-03-22 00:08:09 +00:00
christos
15cc8e46f6 Coverity CID 1439: Prevent array index out of bounds access. 2006-03-19 01:55:48 +00:00
christos
95f6be8b1a Coverity CID 2461: Bail out quickly instead of accessing uninitialized variables 2006-03-19 01:25:44 +00:00
christos
68259ab10a Coverity CID 2462: Bail out quickly instead of accessing uninitialized variables. 2006-03-19 01:24:09 +00:00
christos
f2194f03cc Coverity CID 2463: Bail out instead of accessing uninitialized variables. 2006-03-19 01:21:28 +00:00
christos
5bd7f658fe Coverity CID 2464: Don't use uninitialized variables; exit with error quickly. 2006-03-19 01:19:32 +00:00
christos
adcc2139d9 Coverity CID 2472: If the number of bits is invalid, return immediately
instead of accessing uninitialized variables.
2006-03-19 01:17:30 +00:00
christos
f174420e75 Coverity CID 2473: Fix uninitialized variable reference. 2006-03-19 01:15:06 +00:00
tnozaki
1b24b76f6b MB_CUR_MAX should be 2 when MODE_2BYTE flag set. 2006-02-15 19:50:27 +00:00
dogcow
86811edb37 change #include <sys/endian.h> => #include <machine/endian.h> so that
it's (more) consistent in the tree; this, along with changing tools/compat's
autoconf detection from AC_CHECK_FUNCS to AC_CHECK_DECLS makes the vast
majority of htobe16 and friends' redefinition errors bite the dust.
Tested with -current and FreeBSD.
2006-02-09 22:03:15 +00:00
tshiozak
bb345c8a27 add missing _citrus_MSKanji_stdenc_get_state_desc_generic() function.
pointed out by Patrick Welche <prlw1 _at_ newn _dot_ cam _dot_ ac _dot_ uk>
2005-12-07 06:20:20 +00:00
tshiozak
1beef8fe93 fix lib/31874.
- add _citrus_stdenc_get_state_desc() interface to get
  encoding-scheme-independent encoder/decoder state descriptions.
- make sure that iconv_std module uses it to judge whether the last
  sequences form a complete shift sequence.
- bump minor of i18n_module because of get_state_desc().
2005-10-29 18:02:04 +00:00
tshiozak
c8a7d58fe9 make sure that this module can handle all private/vendor-defined
character areas.
This was reported by MORIYAMA Masayuki <msyk _at_ mtg.biglobe.ne.jp> and
"NARUSE, Yui" <naruse _at_ airemix.com>, and fixed by MORIYAMA-san.
2005-10-18 06:44:28 +00:00
tshiozak
eda4f3c630 fix a problem on wc->mb conversion for G2 plane.
This was reported by MORIYAMA Masayuki <msyk _at_ mtg.biglobe.ne.jp> and
"NARUSE, Yui" <naruse _at_ airemix.com>, and fixed by MORIYAMA-san.
2005-10-18 06:42:12 +00:00
tnozaki
a3b248100e add csmapper:CNS11643-1,2 and esdb:ISO-2022-CN,
integrate esdb:EUC-TW, locale:zh_TW.eucTW.
2005-03-27 22:30:05 +00:00
tnozaki
6e2609d649 anonymous union between chlen and _UTF7StatePrive has
compilation problem with gcc295.
this union attempts to make mbsinit(3) handle multibyte state correctly,
but it's useless as long as we use the utf-7 module through the iconv interface only.
so i eliminated the ctype feature.

patch contributed by Joerg Sonnenberger (who is porting Citrus to DragonFlyBSD),
and yamt-san gave me advice, thanks a lot.
2005-03-14 03:43:10 +00:00
christos
61e7a23268 UTF8EncodingInfo is an empty struct; remove noop code and DIAGASSERT.
From Joerg Sonnenberger
2005-03-11 23:32:03 +00:00
tnozaki
fe05f588fb add new citrus iconv module UTF-7.
thanks for the advice, yamt-san.
2005-03-05 18:05:14 +00:00
simonb
3cebd9325e White space nit- don't put a space before/after increment/decrement
operators.
2005-02-11 06:21:21 +00:00