1333 Commits

Author SHA1 Message Date
K. Lange
4dc3f08fa0 Fix upvalue capture in generators 2022-07-24 15:54:36 +09:00
K. Lange
8213430712 Close upvalues even if exception exits runtime 2022-07-24 14:14:47 +09:00
K. Lange
ee86a241e0 b5 2022-07-23 08:35:00 +09:00
HarJIT
a580a835b8 Codecs revisited (#28)
* xraydict functionality and usage improvements

Add a filter_function to xraydict, allowing fewer big data structures. Make
uses of xraydict prefer exclusion sets to exclusion lists, to avoid
repeated linear search of a list.

* Make `big5_coded_forms_from_hkscs` a set, remove set trailing commas.

* Remove `big5_coded_forms_from_hkscs` in favour of a filter function.

* Similarly, use sets for 7-bit exclusion lists except when really short.

* Revise mappings for seven 78JIS codepoints.

Mappings for 25-23 and 90-22 were previously the same as those used for
97JIS; they have been swapped to correspond with how the IBM extension
versus the standard code are mapped in the "old sequence" (78JIS-based)
as opposed to the "new sequence".

Mappings for 32-70, 34-45, 35-29, 39-77 and 54-02 in 78JIS have been
changed to reflect disunifications made in 2000-JIS and 2004-JIS, assigning
the 1978-edition unsimplified variants of those characters separate coded
forms (where previously, only swaps and disunifications in 83JIS and
disunifications in 90JIS (including JIS X 0212) had been considered).

This only affects the `jis_encoding` codec (including the decoding
direction for `iso-2022-jp-2`, `iso-2022-jp-3` and `iso-2022-jp-2004`),
and the decoding is only affected when `ESC $ @` (not `ESC $ B`) is used.
The `iso-2022-jp` codec is unaffected, and remains similar to (but more
consistently pedantic than) the WHATWG specification, thus using the same
table for both 78JIS and 97JIS.

* Make `johab-ebcdic` decoder use many-to-one, not corporate PUA.

Many-to-one decodes are not uncommon in CJK encodings (e.g. Windows-31J),
and mapping to the IBM Corporate PUA (code page 1449) would probably make
it render as completely the wrong character if at all in practice.

* Switch `cp950_no_eudc_encoding_map` away from a hardcoded exclusion list.

* Codec support for `x-mac-korean`.

* Add a test bit for the UTF-8 wrapper.

* Document the unique error-condition definition of the ISO-2022-JP codec.

* Update docs now there is an actual implementation for `x-mac-korean`.

* Further explanations of the hazards of `jis_encoding`.

* Sanitised → Sanitised or escaped.

* Further clarify the status with not verifying Shift In.

* Corrected description of End State 2.

* Changes to MacKorean to avoid mapping non-ASCII using ASCII punctuation.

* Extraneous word "still".

* Fix omitting MacKorean single-byte codes.
2022-07-23 08:32:54 +09:00
K. Lange
79a4a02b58 Resolve long-standing left/right binding issue with '**' 2022-07-20 09:21:21 +09:00
K. Lange
b966db7b91 Bump serial to b4 2022-07-20 08:15:54 +09:00
K. Lange
825dbe1c94 Support type hints on locals 2022-07-20 08:15:26 +09:00
K. Lange
c7391806fa Cleanup exported symbols in rline 2022-07-16 14:34:14 +09:00
K. Lange
eae29dd6bf Avoid unnecessary type checking in tableGet_fast 2022-07-16 13:54:20 +09:00
K. Lange
6eda98161d Be more clear on OPERANDS being unsigned 2022-07-16 13:27:38 +09:00
K. Lange
d027b86861 Bump serial, 1.3.0b3 2022-07-15 09:52:42 +09:00
K. Lange
d73d7bdef9 -m dis should recurse 2022-07-15 08:06:34 +09:00
K. Lange
e22410d81b Still need to advance into that string token... 2022-07-13 22:13:39 +09:00
K. Lange
77c38ea7cd Only print actual expressions in repl; fix dumb hacks 2022-07-13 21:31:48 +09:00
K. Lange
a20c89fe2f Support -m dis with a dis.krk pseudomodule 2022-07-13 09:22:19 +09:00
K. Lange
f8acd8a75b Fix segfault in syntax error when previous token is synthetic 2022-07-13 09:21:57 +09:00
K. Lange
b7e1454d83 int in bytes 2022-07-12 21:42:22 +09:00
K. Lange
cf4683e4e7 tuple.__mul__ 2022-07-12 21:35:26 +09:00
K. Lange
56c1bbe231 Tab-complete builtin module names after 'import', 'from' 2022-07-12 20:42:18 +09:00
K. Lange
d3326885de Bump serial number 2022-07-12 09:42:57 +09:00
K. Lange
6e2ba5f060 Allow functions to be built from codeobjects, upvalues, dict/instance 2022-07-12 09:42:33 +09:00
K. Lange
316d1219a2 Bind globals to functions, not codeobjects 2022-07-12 09:41:48 +09:00
K. Lange
7dc754a519 Update outdated comment about enum values for opcodes 2022-07-12 08:13:26 +09:00
K. Lange
3472fcfb6b Accept trailing comma in set literal 2022-07-12 06:23:16 +09:00
K. Lange
ed81bc9c03 Randomize the opcode table. 2022-07-12 05:41:35 +09:00
K. Lange
212efab01b Consolidate opcode definitions and do not expose the macros in public headers 2022-07-11 21:02:35 +09:00
K. Lange
ff43e94054 Cleanup common method invocation instructions 2022-07-11 13:41:10 +09:00
K. Lange
431d347568 OP_TEST_ARG 2022-07-11 11:44:13 +09:00
K. Lange
eeca53e4f1 Fix bad argument collection with optional positionals 2022-07-11 10:03:00 +09:00
K. Lange
258527ef7b Fill out missing tokens in parse table for debugging; remove TOKEN_ from string names 2022-07-11 09:33:50 +09:00
K. Lange
be3c8a9ba4 Set release serial, shorten -beta to b 2022-07-11 07:39:40 +09:00
K. Lange
15014df397 Add kuroko.hexversion 2022-07-11 07:17:58 +09:00
K. Lange
71931151e4 Let MAKE_STRING handle the = prefix, after FORMAT_VALUE swaps it around 2022-07-10 19:36:14 +09:00
K. Lange
feebd6e6a8 Emit emit string not MAKE_STRING 0 2022-07-10 19:16:01 +09:00
K. Lange
a3b2722707 Unicode character for fill in __format__ 2022-07-10 19:12:16 +09:00
K. Lange
0c101079d4 Support Unicode strings in argument to str.*strip 2022-07-10 18:49:42 +09:00
K. Lange
aa97b3762d f'{foo=}' should default to !r if no = or !s 2022-07-10 17:58:00 +09:00
K. Lange
ec6a896a04 Display textual representation of FORMAT_VALUE type in dis like CPython does 2022-07-10 17:55:25 +09:00
K. Lange
391a4d79db Fixup concatenating unalike string tokens in compiler 2022-07-10 17:44:06 +09:00
K. Lange
e5f4208f6a str.__format__ 2022-07-10 17:04:11 +09:00
K. Lange
cb1bfa4b93 Cleanup, fix, break out common format string parsing 2022-07-10 17:04:02 +09:00
K. Lange
7d409ebcbb Format spec support in f-strings 2022-07-10 16:11:12 +09:00
K. Lange
9230d4fee1 int.__format__, long.__format__ with as close to Python semantics as I can be bothered 2022-07-10 16:10:40 +09:00
K. Lange
d8a1861c23 format() and object.__format__() 2022-07-10 16:10:07 +09:00
K. Lange
f1c0af711e Cache __format__ method 2022-07-10 16:09:32 +09:00
K. Lange
f24cb336e7 Fixup SET_GLOBAL to use IfExists, no need for delete 2022-07-10 13:19:15 +09:00
K. Lange
9b5ce15bf7 Implement Python's identifier mangling 2022-07-10 13:13:27 +09:00
K. Lange
d00bdda104 Strings, too... 2022-07-09 21:55:08 +09:00
K. Lange
17a6aaf8d6 Fix parse error when 'if' ends on a class 2022-07-09 21:10:36 +09:00
K. Lange
95c6f17a21 Remove accidentally committed test file 2022-07-09 20:56:07 +09:00