Commit Graph

  • 6d0e342e1b Attach module name to classes like Python does K Lange 2021-02-24 22:48:04 +09:00
  • e3f774065d Module name for repl should be __main__, <module> is what we label the module-level function... K Lange 2021-02-24 22:47:33 +09:00
  • 0f5786ed62 Make -m work with dotted imports K. Lange 2021-02-24 18:16:33 +09:00
  • 73feea3cc0 Makefile should also install tools K. Lange 2021-02-24 12:54:54 +09:00
  • db4185d035 I think that should fix up installs to /usr/local ? K. Lange 2021-02-24 11:22:18 +09:00
  • 4fc510034d Add some functions for unloading modules K. Lange 2021-02-24 10:13:10 +09:00
  • 3945e3865f Changes to how uncaught exceptions are handled. K. Lange 2021-02-24 09:08:21 +09:00
  • 5e16d8d31d Merge pull request #16 from kuroko-lang/master HarJIT 2021-02-23 07:53:31 +00:00
  • 361aeb3927 Fixup LDLIBS linking for libkuroko, interpreter, tools, modules? K. Lange 2021-02-23 16:30:10 +09:00
  • 7df8c0803f Don't use 'git' to do comparison for tests K. Lange 2021-02-23 14:01:55 +09:00
  • 617c1e8eec Add docs/syntax.md just to have the page exist K. Lange 2021-02-23 13:00:47 +09:00
  • fec2e69317 Fix int(bool) to return int K. Lange 2021-02-23 12:40:36 +09:00
  • 21afbb3fb1 Actually try to determine the required min libc for deb package K. Lange 2021-02-23 10:50:37 +09:00
  • e626fefbb8 bools should be derived from ints and valid for int operations implicitly K. Lange 2021-02-23 10:02:34 +09:00
  • 5c0c3d938b Inhibit decoding ASCII from JIS_Encoding double byte states, coercing it to fullwidth. HarJIT 2021-02-22 22:47:00 +00:00
  • 5c8126d65d Add a pseudo-grammar in EBNF? It's not necessarily accurate, but it's close enough to be a useful reference K. Lange 2021-02-23 07:38:37 +09:00
  • fb17f7092a Don't penalise an ASCII→ASCII switch at the very start of an ISO-2022-JP stream. HarJIT 2021-02-22 22:21:16 +00:00
  • 89bfeb82b2 Merge pull request #15 from kuroko-lang/master HarJIT 2021-02-22 21:51:37 +00:00
  • de9270ccca Move the delta-ing tool to the tools directory. HarJIT 2021-02-22 21:34:53 +00:00
  • b990d82e19 Make the Johab codecs use xraydict. HarJIT 2021-02-22 21:32:10 +00:00
  • 6ee2a92502 Fixes to the xraydict. HarJIT 2021-02-22 21:24:26 +00:00
  • 3b1624a4a5 Introduce xraydict type as hopefully a more efficient way to deal with common subsets of mappings. HarJIT 2021-02-22 21:22:10 +00:00
  • 584dc60a7f Ability to encode (not just decode) JIS X 0212 in EUC-JP, both via an alternative non-WHATWG "euc-jp-full" encoder, and as a fall-back in the euc-jis-2004 encoder (and making the euc-jis-2004 decoder accept it). HarJIT 2021-02-22 20:40:06 +00:00
  • 0b93de745d repr for incremental encoders and decoders, and include addresses in repr. HarJIT 2021-02-22 19:23:22 +00:00
  • 092c6fb4d2 OP_INVOKE_CONTAINS, related changes K. Lange 2021-02-22 18:32:50 +09:00
  • a7bae46791 Merge pull request #14 from kuroko-lang/master HarJIT 2021-02-22 09:27:05 +00:00
  • f3a6b6d713 Try to show the escapes in the error messages for ISO-2022-JP scrutiny. HarJIT 2021-02-22 09:25:16 +00:00
  • 0216021b6d OP_INVOKE_ITER K. Lange 2021-02-22 15:15:02 +09:00
  • ed1e8aebba Catch bad argument ordering in function declaration K. Lange 2021-02-22 14:43:35 +09:00
  • f74c6c77b1 for ...;... should not accept multiple expressions for condition K. Lange 2021-02-22 14:19:28 +09:00
  • 6fee62f341 Fix parsing of nested ternary K. Lange 2021-02-22 13:51:45 +09:00
  • 6fa465a23e Correct a very out of date source comment. HarJIT 2021-02-21 21:12:19 +00:00
  • 026a1ac9d6 Fix TypeError when passing "¥" to "shift_jis" encoder. HarJIT 2021-02-21 21:10:55 +00:00
  • 190dbdd635 Use \[ notation rather than \x1b notation. HarJIT 2021-02-21 20:42:47 +00:00
  • daca646094 Add the ISO-2022-JP versus JIS_Encoding escape switch scrutiny to the test. HarJIT 2021-02-21 20:40:07 +00:00
  • a2643bf7e5 Make the JIS_encoding decoder not suffer from the ISO-2022-JP concatenation problem. HarJIT 2021-02-21 15:33:40 +00:00
  • e021cddafb Add escape code scrutiny to ISO-2022-JP. This includes measures required by WHATWG (generate an error condition on zero-length spans between G0 redesignations) but also additional measures (i.e. error conditions on ASCII→ASCII, JIS-Roman→JIS-Roman, or cases of ASCII→JIS-Roman and JIS-Roman→ASCII which do not make sense in any obvious encoder design). HarJIT 2021-02-21 15:22:17 +00:00
  • 0775335ae1 Clear html5name on set-endian UTF-32 because they aren't HTML5 encodings. HarJIT 2021-02-21 13:45:27 +00:00
  • 0cebe88242 Altered source comments on UTF-32 since it is not in fact WHATWG-specified. HarJIT 2021-02-21 13:44:29 +00:00
  • e3851600da Make treatment of control codes encountered in DBCS-Host mode more consistent with respect to Eight Ones. HarJIT 2021-02-21 13:40:45 +00:00
  • 13fdc442a1 More meaningful error message for a DBCS-Host sequence truncated by a control or space character. HarJIT 2021-02-21 13:37:22 +00:00
  • c02e83a952 Fix more UTF-16 bugs, and add UTF-32. HarJIT 2021-02-21 11:21:06 +00:00
  • ca19d857cd Correct some mistakes in the UTF-16 codec and ISO-2022-JP-3 handling of NEC Special Characters. HarJIT 2021-02-21 10:52:15 +00:00
  • 0fa7736d11 Use a newer / more comprehensive version of the EBCDIC Johab non-Hangul set (from IBM-1364 rather than IBM-933). HarJIT 2021-02-21 09:07:34 +00:00
  • 8a13b981fe Merge pull request #13 from kuroko-lang/master HarJIT 2021-02-21 09:03:36 +00:00
  • 3c8855ddcf Fix display of inf, nan K. Lange 2021-02-21 09:55:09 +09:00
  • d3ade45d73 Add EUC-JIS-2004 and Shift_JIS-2004. HarJIT 2021-02-20 23:51:50 +00:00
  • c19a6955cb Change encmap→encode, decmap→decode for consistency and readability. HarJIT 2021-02-20 23:19:16 +00:00
  • 91a25f32ba Make extradata.krk (since its data is from disparate sources, unlike dbdata.krk) not glitch out text editors by keeping every line below 4KiB, by replacing some spaces with linefeeds. HarJIT 2021-02-20 21:18:45 +00:00
  • db6af22f94 Add some more labels recognised by Python that can reasonably be associated with already-implemented codecs. HarJIT 2021-02-20 20:20:45 +00:00
  • f12044d7a3 Some housekeeping renames (tools/codecs→tools/codectools, and the webname property to the less ambiguous-in-intent html5name). HarJIT 2021-02-20 18:54:01 +00:00
  • 7672e16e65 Shift EBCDIC out of DBCS-Host state when finalising. HarJIT 2021-02-20 17:17:59 +00:00
  • c56c530880 Merge pull request #12 from kuroko-lang/master HarJIT 2021-02-20 15:40:07 +00:00
  • aacd16b430 Get the ISO-2022-JP-1,2,3,2004 encoders to preferentially encode to the other JIS planes over the NEC extensions. Hopefully. HarJIT 2021-02-20 15:39:14 +00:00
  • 62331d2db3 cp50220 would seem to be the more precise IANA-registered name for what WHATWG calls iso-2022-jp (without accepting cp50220 as an alias). I'm not entirely sure what to do with the labelling of the other encodings since I'm treating their ESC $ B sets the same as in cp50220, and treating their ESC $ @ sets with NEC extensions and IBM character backports. HarJIT 2021-02-20 15:00:31 +00:00
  • ae4a6130e4 IncrementalEncoder and IncrementalDecoder base classes, enforce descent from them as a sanity-checking measure. HarJIT 2021-02-20 14:46:28 +00:00
  • 789960ae9c Use self-explanatory label on both encoder and decoder for the Johabs. HarJIT 2021-02-20 13:44:05 +00:00
  • 31c10e20f0 Add Johab support (both ASCII-based and EBCDIC-based versions). HarJIT 2021-02-20 13:42:54 +00:00
  • 4235c74138 Generate indexes, fixup CSS for mobile K. Lange 2021-02-20 22:31:59 +09:00
  • 379e1846a9 Stringify floats with more digits K Lange 2021-02-20 21:48:47 +09:00
  • a5ff538dc1 Write a bunch more docs K. Lange 2021-02-20 20:44:07 +09:00
  • 6951a6d09d Better approach there (reject attempts to open a GE sequence in the middle of a DBCS character). HarJIT 2021-02-20 10:18:29 +00:00
  • 18df6f3b67 Make EBCDIC encoder/decoder theoretically symmetrical in re use of GE from the DBCS-Host mode. HarJIT 2021-02-20 10:16:44 +00:00
  • 8693947f52 EBCDIC support. Just to Python parity level for the time being, but with theoretical support for multi-byte (DBCS-Host and Graphic Escape) features not leveraged by any subclass yet (and thus untested). HarJIT 2021-02-20 10:03:35 +00:00
  • c5b04d55b1 Natively support @classmethod since we intentionally broke using @property for it K. Lange 2021-02-20 16:04:26 +09:00
  • 76d53eb198 Various doc improvements K. Lange 2021-02-20 15:43:47 +09:00
  • 663de74695 Actually give new property objects a class so we can do things with them directly K. Lange 2021-02-20 15:43:22 +09:00
  • d5d3d721e7 The big documentation system overhaul K. Lange 2021-02-20 14:10:36 +09:00
  • 32b957aff7 Be more usefully descriptive of argument types in function.__args__ K. Lange 2021-02-20 14:08:05 +09:00
  • d55071c291 Fix incorrect assignment of *args name to keywordArgNames K. Lange 2021-02-20 14:07:41 +09:00
  • b484dce5b2 ISO-2022-JP-Ext (as defined by Python) encoder. HarJIT 2021-02-19 21:57:08 +00:00
  • ca13a11da2 Parity-ish with Python's coverage of single-byte non-EBCDIC codes (i.e. add support for all such codes supported by Python and neither included nor aliased to another by WHATWG). HarJIT 2021-02-19 21:39:47 +00:00
  • 96f6572da5 Pull some repeated code into a BaseDecoder superclass. HarJIT 2021-02-19 19:29:43 +00:00
  • baa675f207 Implement KurokoCodecInfo.__repr__. HarJIT 2021-02-19 18:15:32 +00:00
  • ad439d4241 Completely abandon the Python-defined format for getstate/setstate state objects in favour of making them arbitrary objects for simplicity. HarJIT 2021-02-19 17:41:40 +00:00
  • e3ca583afc Merge pull request #11 from kuroko-lang/master HarJIT 2021-02-19 17:26:36 +00:00
  • 298f429c30 Add a very simple embedding demo for doc purposes K. Lange 2021-02-19 21:25:15 +09:00
  • 2af6291834 Minor doxygen comment cleanup K. Lange 2021-02-19 21:24:59 +09:00
  • 35ddf81c88 isinstance() should be able to take a tuple of types K. Lange 2021-02-19 21:24:41 +09:00
  • 8fc9ce128b Add getattr(), kuroko.importmodule() K. Lange 2021-02-19 21:05:23 +09:00
  • e9f99044ea Add decoding (only) of the halfwidth katakana set by escape sequence to ISO-2022-JP-2 (like in the main ISO-2022-JP codec, and per them being leniently accepted by ICU (ICU's ISO-2022-JP-1 = Python's ISO-2022-JP-EXT)), and add distinct decoding of the 78JIS escape to all except the main ISO-2022-JP one (since it would have broken WHATWG compliance in that case). Note that still only JIS_Encoding will ever *generate* a 78JIS escape. HarJIT 2021-02-19 08:46:04 +00:00
  • e53fd82e26 Merge pull request #10 from kuroko-lang/master HarJIT 2021-02-19 08:25:50 +00:00
  • 71272337d9 Also accept (but not generate) \[(H for JIS-Roman in JIS_encoding. HarJIT 2021-02-19 08:22:47 +00:00
  • 3d9530ed19 Doxygen C API stuff is pretty much ready to ship, so here it is... K. Lange 2021-02-19 13:56:54 +09:00
  • f14cc086e0 Just always enable threading outside of Emscripten and Windows K. Lange 2021-02-19 13:32:45 +09:00
  • 79cc3bdac1 More header cleanup, rename some stuff K. Lange 2021-02-19 12:30:39 +09:00
  • 6f1eefe68d Make isObjType a macro, mostly to appease llvm that complains about the static inline usage in an externed function K. Lange 2021-02-19 11:14:40 +09:00
  • d1535de8d2 Eliminate calls to sprintf K Lange 2021-02-19 11:06:07 +09:00
  • 03d917bf40 Doc comments for exceptions.c K. Lange 2021-02-19 08:18:06 +09:00
  • c8ebd3d1a3 Some more labels for JIS_encoding and ISO-2022-JP-2. HarJIT 2021-02-18 22:15:26 +00:00
  • 3d0b141493 Merge pull request #9 from kuroko-lang/master HarJIT 2021-02-18 21:37:43 +00:00
  • bb4ea67c18 Add ISO-2022-JP-3 and ISO-2022-JP-2004 codecs, and add the rest of JP-2 to the all-encompassing jis_encoding codec. HarJIT 2021-02-18 21:34:13 +00:00
  • 718e267e6e Add ISO-2022-JP-1 and ISO-2022-JP-2 codecs. HarJIT 2021-02-18 21:14:48 +00:00
  • b3a4cf5656 Some fixes to the encoding code for ISO-2022-JP-2 style single-shifts, not that they're being utilised in practice by the current encoding roster. HarJIT 2021-02-18 20:40:19 +00:00
  • 324e5c6090 Add experimental JIS_Encoding codec extending the ISO-2022-JP encoding, and fix some of the hooks in the latter now that I can test them. HarJIT 2021-02-18 19:54:06 +00:00
  • 803df0be36 Hook support decoding to Unicode sequences from ISO-2022-JP 2-byte codes. HarJIT 2021-02-18 18:57:57 +00:00
  • 3992182742 Also, hooks for super-shifts (but not Shift Out) on the encoder. HarJIT 2021-02-18 18:36:28 +00:00
  • cbffda1f6d Decoder-only (for now) subclass hooks for Shift Out katakana and ISO-2022-JP-2 super-shift sets. HarJIT 2021-02-18 18:22:26 +00:00
  • f94988f741 Prefix CallFrame to KrkCallFrame since it is in our public API K Lange 2021-02-18 21:54:58 +09:00
  • c567204f65 More doc comments K. Lange 2021-02-18 21:00:27 +09:00