Commit Graph

91 Commits

Author SHA1 Message Date
spz
dd745abb62 Fix our iconv version for the issues that apply to us from CVE-2014-3951
(which are the:
- Consistently pass around context information using a simple pointer.
  This fixes some dereferencing bugs in Chinese character set conversions.
- Fix Simplified Chinese character set conversions by switching around the
  fields of an internal struct so it corresponds with the way variables of
  this type are initialised.
part)
Patch taken from FreeBSD and mutilated to fit.
FreeBSD credits: Manuel Mausz (reporter), Tijl Coosemans (report handler)
2014-06-24 22:24:18 +00:00
christos
3a5ace8027 revert previous, it causes other problems and I cannot easily debug them. 2014-01-18 15:21:41 +00:00
christos
9465819ab5 PR/47602: Christos Zoulas: getwc() modifies input instead of returning EILSEQ.
Waited for almost a year for feedback and there was none.
2014-01-16 20:28:51 +00:00
joerg
85a67e61f7 Add mbsnrtowcs and wcsnrtombs. Approved by core. 2013-05-28 16:57:56 +00:00
wiz
6234e98d3f Remove unused variables. From cppcheck via Henning Petersen in PR 45997. 2012-02-12 13:51:29 +00:00
tnozaki
7119b42a87 fix memory leak, pointed out by nonaka-san(again^3). 2011-11-19 18:48:39 +00:00
tnozaki
a750734d28 remove unused variable, pointed out by nonaka-san, thanks. 2011-11-19 18:20:13 +00:00
wiz
60025aa86f Use boolean AND instead of bitwise one in _DIAGASSERT.
From Henning Petersen in PR 45518.
2011-10-30 21:48:27 +00:00
tnozaki
6b58a1b843 revert r1.21; a problem still exists for posix2008 mbsnrtowcs (not yet committed),
but I have no time to investigate the t_mbrtowc failure.
2011-10-10 22:45:45 +00:00
tnozaki
9f0b22ed7d update string pointer when input is partial escape sequence or multibyte. 2011-10-07 18:59:13 +00:00
joerg
998c5d780f Make intermediate size variable size_t like the rest to avoid
unnecessary casting.
2011-05-23 14:53:46 +00:00
joerg
939ab48f97 Actually return something deterministic 2011-05-23 14:45:44 +00:00
joerg
e325eedcc1 Remove tautology. 2010-12-07 22:01:45 +00:00
joerg
aaf356760f Mark function as static and give it an explicit return type. 2010-12-07 22:01:22 +00:00
tnozaki
56bf19aaea fix byte order mark related bug introduced by previous commit,
reported by Sverre Froyen via current-user, thanks!
2010-03-20 18:15:32 +00:00
tnozaki
36a8b8869c 1. fix wrong byte order mark of utf-16, reported by NARUSE Yui -san.
patch provided by tshiozak@ -san.

2. don't eat 0xfeff/0xfffe if they don't appear at the start of the byte stream.
noticed by tshiozak@ -san, patch provided by me.

thanks a lot.
2010-03-15 15:00:58 +00:00
christos
461a86f9bd merge christos-time_t 2009-01-11 02:45:45 +00:00
tnozaki
52ed7b035f Fixes PR lib/39662, shortcomings in LC_{MONETARY,NUMERIC,TIME,MESSAGES} db format.
ok'ed by core and releng.
(thanks to agc@ and snj@, and I'm sorry for trying your patience for so long).

[libc]
- localeio.[ch] and lc*.[ch] in src/lib/libc/locale was replaced by
  new locale-db implementation using citrus_db backend,
  see src/lib/libc/citrus/citrus_lc_*.[ch].
- add citrus_bcs_strtou?l.c. don't use strtou?l locale implementation
  internally, because they're locale-aware function.
- add some stubs for multi-locale issue, see {current,global}_locale.c.
- remove some obsolete file, setrunelocale.c, ___runetype_mb.c.
- remove __savectype() from ctypeio.[ch].

[tools]
- mklocale(1): add new option ``-t'' that generates new style
  LC_{MONETARY,NUMERIC,TIME,MESSAGES} locale-db format.
- chrtbl(1): added ctypeio.[ch] for __savectype().

[locale-db]
- added en_US.US-ASCII locale.
- removed some shareable locale definition file:
    en_US.US-ASCII -> en_US.ISO8859-1, en_US.UTF-8
    zh_CN.eucCN -> zh_CN.GB18030
    and more...see src/share/locale/*/Makefile.
- remove obsoleted locale sr_YU, added new locale sr_ME, sr_RS.
- change locale name ja_JP.ISO2022-JP* -> ja_JP.ISO-2022-JP*
  for X11's locale.alias file alignments.
- fix regression test, wrong wcs?width(3), NAN/INF usage.

I tested release builds on the following archs:
  i386, amd64, hpc{mips,arm,sh}, sparc64, vax.

citrus_lc_*.[ch] can also read the old plain-text style locale-db,
so backward compatibility is kept, but lc*.[ch] can't read
the new citrus_db'ed locale-db and localeio.c never checks sanity,
so forward compatibility is broken ;-<

old mklocale(1) doesn't know -t option, so you have to rebuild toolchain.
2009-01-02 00:20:18 +00:00
tnozaki
e1ee662664 remove unused include, locale.h. 2008-06-14 16:01:07 +00:00
tnozaki
7ed5b48246 add BOM to utf-16/32 stream when endianness is not specified,
like other iconv implementations: GNU libiconv, glibc2 iconv, perl iconv.
2008-03-20 11:47:45 +00:00
tnozaki
fca38949e4 fix lib/37290
- don't call abort(3) when there's no suitable charset found.
- use iso-8859-1(or INIT1 if specified) for C1 control char.
2007-11-21 14:19:32 +00:00
tnozaki
561e0bd51b remove invalid range check. 2007-10-23 15:28:25 +00:00
tnozaki
0941b12b16 lib/36938 mbtowc misbehaving after invalid char sequence
- make sure to initialize mbtowc's internal state.
- add regression test.
2007-09-18 15:12:07 +00:00
tnozaki
6a1c27dd91 fix typo. 2007-04-24 15:42:08 +00:00
tnozaki
fd2dd8ec0d add new encoding support to iconv(3):
- RISCOS-LATIN1
- DEC-MCS
- DEC-HANYU(libDECHanyu)
2007-04-01 18:52:28 +00:00
tnozaki
9f260693ac disallow illegal utf-8 byte sequence and surrogate chars (RFC3629).
5-6 byte sequence(0x110000 - 0x7FFFFFFF) are still available
for backward compatibility.
2007-03-06 16:13:58 +00:00
tnozaki
1bf1d71e3c iconv: add following CCS/CES support.
- CNS11643-[3-7] <-> UCS:BMP/SIP (EUC-TW, ISO-2022-CN-EXT)
- HKSCS <-> UCS:BMP/SIP (Big5-HKSCS)
- JISX0213-[1-2] <-> UCS:BMP/SIP (EUC-JIS-2004,Shift_JIS-2004,ISO-2022-JP-2004)
2007-03-05 16:57:06 +00:00
tnozaki
c61eef3da4 make del(\x7f) pass through. 2006-12-13 16:16:56 +00:00
tnozaki
3fb79e8260 don't read input string more than MB_LEN_MAX(maybe redundant escape sequence). 2006-11-24 17:27:52 +00:00
tnozaki
663e0dad61 don't throw EILSEQ when byte sequence is "zW ". 2006-11-24 16:52:20 +00:00
tnozaki
8033a5b008 1. add iconv support for the following encodings:
Simplified Chinese
        HZ, HZ8 - 7/8bit stateful encoding, see RFC1842,1843. (libHZ)
        zW      - 7bit stateful encoding, see RFC1842. (libZW)
2. add citrus_prop.[ch] - parser for encoding module's init parameter strings.
2006-11-22 23:38:25 +00:00
tnozaki
5bda830543 fix memory leak. 2006-11-22 20:11:03 +00:00
dogcow
cfe7a78c9c change uint32_t to size_t; fixes build issue on 64-bit platforms. 2006-11-14 02:55:34 +00:00
tnozaki
74fca02cf1 avoid infinite loop, iso2022 module's stdenc_get_state_desc_generic()
never returns _STDENC_SDGEN_INITIAL.
2006-11-13 19:08:19 +00:00
tnozaki
79a70a823d 1. add iconv support for the following encodings:
Vietnamese
	TCVN	8bit Viet Nam National Standard
	VISCII	8bit RFC1456
	VIQR	7bit RFC1456(libVIQR)
Unicode Escape (GNU libiconv compatibility)
	C99, JAVA (libUES)
2. fix iconv_std module:
	add special treatment for POSIX Defect Report #288 case.
2006-11-13 15:16:28 +00:00
tnozaki
1a00f7afa4 don't pass through surrogate character(0xD800 - 0xDFFF). 2006-10-27 14:13:55 +00:00
tnozaki
2e2fc44e22 add new iconv module libJOHAB,
this supports S.Korean character encoding scheme ``JOHAB''.
2006-10-18 17:54:54 +00:00
tnozaki
8316f5b826 correct typo in _DIAGASSERT() and some KNF.
pointed by uebayashi-san, thanks!
2006-09-11 13:06:33 +00:00
tnozaki
48d386f61a mapper_std iconv module and mkcsmapper(1) now can treat
plain-row-col charset and 4byte code(like GB18030) as SRC_ZONE.
2006-09-09 14:35:17 +00:00
tnozaki
f264ea3a01 cleanup code 2006-08-23 12:57:24 +00:00
tnozaki
2bcfe3b4c8 added Traditional Chinese Big5 family:
Big5-2003, Big5-ETen, Big5-IBM, Big-5E, Big-5+.

``Big5'' is now the alias of Big5-ETen;
if you want Unicode.org's obsolete mappings, use Big5-IBM instead.
2006-06-19 17:28:24 +00:00
tnozaki
b29e60b31d if INIT0 specified, use it instead of ASCII. 2006-06-07 16:28:34 +00:00
christos
c8780d3168 Coverity CID 1440: off by one in array count. 2006-03-22 00:08:09 +00:00
christos
15cc8e46f6 Coverity CID 1439: Prevent array index out of bounds access. 2006-03-19 01:55:48 +00:00
christos
95f6be8b1a Coverity CID 2461: Bail out quickly instead of accessing uninitialized variables 2006-03-19 01:25:44 +00:00
christos
68259ab10a Coverity CID 2462: Bail out quickly instead of accessing uninitialized variables. 2006-03-19 01:24:09 +00:00
christos
f2194f03cc Coverity CID 2463: Bail out instead of accessing uninitialized variables. 2006-03-19 01:21:28 +00:00
christos
5bd7f658fe Coverity CID 2464: Don't use uninitialized variables; exit with error quickly. 2006-03-19 01:19:32 +00:00
christos
adcc2139d9 Coverity CID 2472: If the number of bits is invalid, return immediately
instead of accessing uninitialized variables.
2006-03-19 01:17:30 +00:00
christos
f174420e75 Coverity CID 2473: Fix uninitialized variable reference. 2006-03-19 01:15:06 +00:00