This commit fixed the problem that TCC directly cast each byte in wide string literal to wchar_t, which is wrong when wide string literal contains real wide chars.
It fixed the problem by assuming input charset is UTF-8, and wchar_t stores wide chars in UTF-16 (Windows) or UTF-32 (others).
The UTF-8 decoder is coded according to The Unicode Standard Version 10.
Too simple long parsion.
Take me a long long long time to see my mistake,
Sorry
(long long long wasn't see as an error)
This reverts commit a4cd2805f9.
As long is now a qualifier, and because compare_type will notice
that variables are not the same type, we can't use long as int anymore.
So, I've redefine __SIZE_TYPE__ as unsigned int and __PTRDIFF_TYPE__ as int.
tccgen.c:
doubles need to be aligned, on ARM. The section_reserve()
in init_putv does not do that.
-D ONE_SOURCE: is now the default and not longer needed. Also,
tcc.h now sets the default native target. These both make
compiling tcc simple as "gcc tcc.c -o tcc -ldl" again.
arm-asm.c:
enable pseudo asm also for inline asm
tests/tests2/Makefile:
disable bitfield tests except on windows and x86_64
and don't generate-always
tcc.c:
fix a loop with -dt on errors
configure:
print compiler version (as recognized)
tccpp.c:
actually define symbols for tcc -dt
clear static variables (needed for -dt or libtcc usage)
96_nodata_wanted.c:
use __label__ instead of asm
lib/files:
use native symbols (__i386__ etc.) instead of TCC_TARGET_...
* check that _Generic don't match unsigned char * with char *
this case is usefull as with -funsigned-char, 'char *' are unsigned
* change VT_LONG so it's now a qualifier
VT_LONG are never use for code generation, but only durring parsing state,
in _Generic we need to be able to make diference between
'long' and 'long long'
So VT_LONG is now use as a type qualifier, it's old behaviour is still
here, but we can keep trace of what was a long and what wasn't
* add TOK_CLONG and TOK_CULONG
tcc was directly converting value like '7171L' into TOK_CLLONG or
TOK_CINT depending of the machine architecture.
because of that, we was unable to make diference between a long and a
long long, which doesn't work with _Generic.
So now 7171L is a TOK_CLONG, and we can handle _Generic properly
* check that _Generic can make diference between long and long long
* uncomment "type match twice" as it should now pass tests on any platforms
* add inside_generic global
the point of this variable is to use VT_LONG in comparaison only
when we are evaluating a _Generic.
problem is with my lastest patchs tcc can now make the diference between
a 'long long' and a 'long', but in 64 bit stddef.h typedef uint64_t as
typedef signed long long int int64_t and stdint.h as unsigned long int, so tcc
break when stdint.h and stddef.h are include together.
Another solution woud be to modifie include/stddef.h so it define uint64_t as
unsigned long int when processor is 64 bit, but this could break some
legacy code, so for now, VT_LONG are use only inside generc.
* check that _Generic parse first argument correctly
* check that _Generic evaluate correctly exresion like "f() / 2"
* -dt now with lowercase t
* test snippets now separated by real preprocessor statements
which is valid C also for other compilers
#if defined test_xxx
< test snippet x >
#elif defined test_yyy
< test snippet y >
#elif ...
#endif
* simpler implementation, behaves like -run if no 'test_...' macros
are seen, works with -E too
* for demonstration I combined some of the small tests for errors
and warnings (56..63,74) in "60_errors_and_warnings.c"
Also:
* libtcc.c:
put tcc_preprocess() and tcc_assemble() under the setjmp clause
to let them return to caller after errors. This is for -dt -E.
* tccgen.c:
- get rid of save/restore_parse_state(), macro_ptr is saved
by begin_macro anyway, now line_num too.
- use expr_eq for parsing _Generic's controlling_type
- set nocode_wanted with const_wanted. too, This is to keep
VT_JMP on vtop when parsing preprocessor expressions.
* tccpp.c: tcc -E: suppress trailing whitespace from lines with
comments (that -E removes) such as
NO_GOTPLT_ENTRY,\t /* never generate ... */
The existing variable 'nocode_wanted' is now used to control
output of static data too. So...
(nocode_wanted == 0)
code and data (normal within functions)
(nocode_wanted < 0)
means: no code, but data (global or static data)
(nocode_wanted > 0)
means: no code and no data (code and data suppressed)
(nocode_wanted & 0xC0000000)
means: we're in declaration of static data
Also: new option '-dT' to be used with -run
tcc -dT -run file.c
This will look in file.c for certain comment-boundaries:
/*-* test-xxx: ...some description */
and then for each test below run it from memory. This way
various features and error messages can be tested with one
single file. See 96_nodata_wanted.c for an example.
Also: tccgen.c: one more bitfield fix
tccpp.c:
* #pragma comment(option,"-some-option")
to set commandline option from C code. May work only
for some options.
libtcc.c:
* option "-d1..9": sets a 'g_debug' global variable.
(for development)
tests2/Makefile:
* new make targets: tests2.37 / tests2.37+
run single test from tests2, optionally update .expect
* new variable GEN-ALWAYS to always generate certain .expects
* bitfields test
tccgen.c:
* bitfields: fix a bug and improve slightly more
* _Generic: ignore "type match twice"
The assembler uses the ->sym_scope member to walk up the symbols
until finding a non-automatic symbol. Since reordering the
members of Sym the sym_scope member contains a scope even for local
statics. Formerly the use of asm_label for statics was implicitely
clearing sym_scope, now we have to do that explicitely.
Add a testcase for that, and one I encountered when moving the
clearing of sym_scope too deep into the call chain (into put_extern_sym).
Like returned local variables also labels local to a statement expression
can be returned, and so their symbols must not be immediately freed
(though they need to be removed from the symbol table).
Use 2 level strategy to access packed bitfields cleanly:
1) Allow to override the original declaration type with
an auxilary "access type". This solves cases such as
struct {
...
unsigned f1:1;
};
by using VT_BYTE to access f1.
2) Allow byte-wise split accesses using two new functions
load/store_packed_bf. This solves any cases, also ones
such as
struct __attribute((packed)) _s {
unsigned x : 12;
unsigned char y : 7;
unsigned z : 28;
unsigned a: 3;
unsigned b: 3;
unsigned c: 3;
};
where for field 'z':
- VT_INT access from offset 2 would be unaligned
- VT_LLONG from offset 0 would go past the total
struct size (7)
and for field 'a' because it is in two bytes and
aligned access with VT_SHORT/INT is not possible.
Also, static bitfield initializers are stored byte-wise always.
Also, cleanup the struct_layout function a bit.
tcc.h:
* cleanup struct 'Sym'
* include some 'Attributes' into 'Sym'
* in turn get rid of VT_IM/EXPORT, VT_WEAK
* re-number VT_XXX flags
* replace some 'long' function args by 'int'
tccgen.c:
* refactor parse_btype()
- configure
* use aarch64 instead of arm64
- Makefile
* rename the custom include file to "config-extra.mak"
* Also avoid "rm -r /*" if $(tccdir) is empty
- pp/Makefile
* fix .expect generation with gcc
- tcc.h
* cleanup #defines for _MSC_VER
- tccgen.c:
* fix const-propagation for &,|
* fix anonymous named struct (ms-extension) and enable
-fms-extension by default
- i386-gen.c
* clear VT_DEFSIGN
- x86_64-gen.c/win64:
* fix passing structs in registers
* fix alloca (need to keep "func_scratch" below each alloca area on stack)
(This allows to compile a working gnu-make on win64)
- tccpp.c
* alternative approach to 37999a4fbf
This is to avoid some slowdown with ## token pasting.
* get_tok_str() : return <eof> for TOK_EOF
* -funsigned-char: apply to "string" literals as well
- tccpe/tools.c: -impdef: support both 32 and 64 bit dlls anyway
That means: we do not macro-expand the argument of 'defined'
Also: store the last token position into the TokenString
structure in order to get rid of the tok_last function.
This requires one more change in how macro arguments are expanded:
the standard requires that macro args are expanded before substituting
into the replacement list. This implies expanding them only once
even when they occur multiple times in the list. TCC expanded
them repeatedly in that case. Without __COUNTER__ that's harmless.
So, simply always expand arguments (when used without # and ##) once
and store the resulting tokens.
See added testcase 19.c from a bug report. The problem is reading outside
the arguments buffers, even though we aren't allowed to. The single
can_read_stream variable is not enough, sometimes we need to be able
to pop into outer contexts but not into arbitrarily outside contexts.
The trick is to terminate argument tokens with a EOF (and not just with
0 that makes us pop contexts), and deal with that in the few places
we have to,
This enables some cleanups of the can_read_stream variable use.
Simple implementation, I'm not even sure to respect C standart here,
but it should work with most use case.
This add an case in unary(), and generate TokString depending of _Generic
controlling exression, use begin_macro to "push"
the generated TokString, then call unary() again before exiting the switch
so the just add token are reevaluate again.
This reverts commit d4fe9aba3f.
I was confused by the fact that string literals aren't writable.
Nevertheless the type isn't const. As extension in GCC it's const
with -Wwrite-string, which is exactly what we had before.
GCC also wonders in a comment if it's really a good idea to change
expression types based on warning flags (IMHO it's not), but let's
be compatible. So restore the state from before.
Don't make the standard mandated types of string literals
depends on warning options. Instead make them always const,
but limit the emission of the warning by that option.
any dyn symbols. The if( !s1->static_link ) prevents tcc from
crashing when buiding a program linked to dietlibc.
The section header should not contain the number of local symbols when
the sh_size is null. This makes the header compliant and IDA will not
issue any warnings when an executable is disassembled.
Only warn if the struct has a non-zero size. You can't create objects
of zero-sized structs, but they can be used inside sizeof (e.g.
"sizeof (struct {int :0;})". The warning would always trigger for these,
but as no objects can be created no accesses can ever happen.
If there were more than 6 integer arguments before the ellipsis, or
there were used more than 8 slots used until the ellipsis (e.g. by
a large intermediate struct) we generated wrong code. See testcase.