This fixes the issue
int main() { extern char *x; }
void main1() { extern char *x; }
t2.c:5: error: incompatible types for redefinition of 'x'
(reported by Giovanni Mascellani 2019/07/16)
this is enough to let me link a tcctest.c compiled by GCC
using some current debian sid riscv64 system. It needs
linking against libgcc.a for various floating point TFmode
routines. The result runs.
the uninitialized cumofs was leading to random sizes for
the memset when initializing local structures, potentially
leading to segfaults from it. Only a problem with GNU
designated initializers, which we didn't test very well.
See testcase.
- libtcc.c/tccpp.c: fix -U option for multiple input files
- libtcc: remove decl of tcc_add_crt() for PE
- tcc.h: define __i386__ and __x86_64__ for msvc
- tcc.h: undef __attribute__ for __TINYC__ on gnu/linux platforms
- tccelf.c: disable prepare_dynamic_rel unless x86/x64
- tccpe.c: construct rather than predefine PE section flags
- tccpp.c: (alt.) fix access of dead stack variable after error/longjmp
- x86_64-gen.c: fix func_alloca chain for nocode_wanted
- tccpp.c/tccgen.c: improve file:line info for inline functions
- winapi/winnt.h: correct position for DECLSPEC_ALIGN attribute
- win32/lib/crt: simplify top exception handler (needed for signal)
- arm64-gen.c: remove dprintf left from VT_CMP commit
- tccgen.c: limit binary scan with gcase to > 8 (= smaller code)
- tccgen.c: call save_regs(4) in gen_opl for cmp-ops (see test in tcctest.c)
A more automatic approach to code suppression (aka. nocode_wanted)
The simple rules are:
- Clear 'nocode_wanted' at (im/explicit) label IF it was used
- Set 'nocode_wanted' after unconditional jumps
Also in order to test this then I did add the "function might
return no value" warning, and then to make that work again I
did add the __attribute__((noreturn)).
Also moved the look ahead label check into the type parser
to gain a little speed.
Example:
int a = 1;
void f(void)
{
int a = 2;
{
extern int a; // = 1 !!
....
To get this (more) correctly there is a new function to copy
syms between local to global stacks.
Also, this patch changes the meaning of VT_EXTERN back
to the simpler and IMO more useful notion of
DECLARED but not (yet) DEFINED.
and that for both variables and functions. That is, VT_EXTERN
in tcc doesn't have to do with the keyword 'extern' necessarily.
Also this patch does allow
int x[];
as alias for
extern int x[];
(as do gcc and msvc)
... which IMO are:
1) files don't need a _test suffix because all files in
the directory are tests ;)
2) we test the BEHAVIOR of the program, rather than its
binary bit contents.
Ok, but nobody said a test can't use two files ;)
(where the 104+_ construct is meant to prevent the file
from being picked up by the makefile as a test on its own).
Previously test 104 used a combination of *nix tools and system() calls
to emulate a `sh` script, which required split code paths for windows
due to different shell and different absolute path representation.
Also, it used a hardcoded tcc binary path, didn't set locale for sort.
Now the tools are used from a `sh` script which the program generates
and invokes, tmp files are at CWD and no conversion is required, tcc
path is taken from Makefile (exported), and `sort` uses LC_ALL=C.
there's no need for two new flags in type.t . We just can't use
VT_EXTERN as marker if functions are defined or not (like we can
for objects), and then can simply implement the rules of C99/C11
by not overwriting VT_STATIC/VT_EXTERN at all but rather only
look at them. A function already on the inline list can be
forced by removing the VT_INLINE flag, and then linkage
follows from some combination of VT_STATIC, VT_EXTERN and VT_INLINE.
- add tests for standard conformant inline functions
- implement it
The old tinycc failed to provide a conforming implementation
of non-static inlines. It would expose external symbols where it
shouldn't and hide them where it should expose them.
This commit provides a hopefully comprehensive test suite
for how things should be done. The .expect file can be obtained
by compiling the example c file (embedded in the test)
with a conforming compiler such as gcc, clang or icc and then
printing the exported symbols (e.g., with nm+awk+sort).
(The implementation currently reserves two new VT_ flags.
If anyone can provide an implementation without reserving
two extra flags, please replace mine.)
GCC wouldn't be able to implement this either (due to the separate
phases of compilation and assembly). We could allow it but it
makes not much sense and actively can confuse broken code into
segfaulting TCC. At least we can warn.
Warning exposes a problem in tcctest, and fixing that gives us
an opportunity to also test .pushsection/.popsection and .previous
directive support.
anonymous struct members were somewhat broken as the testcase
demonstrates. The reason is the jumping through hoops to fiddle
with the offsets I once introduced to avoid having to track
a cumulative offset. That's now not necessary anymore and actively
harmful, doing the obvious thing is now better.
see testcase, when the inner array dimension of multi-dimensional
VLAs isn't given TCC was generating invalid vstack accesses.
Those are actually invalid, so just diagnose them.
see testcases. A local 'extern int i' declaration needs to
refer to the global declaration, not to a local one it might
be shadowing. Doesn't seem to happen in the wild very often as
this was broken forever.
in presence of invalid source code we can't rely on the
next token to determine if we have or haven't already parsed
an initializer element, we really have to track it in some separate
state; it's a flag, so merge it with the other two we have (size_only
and first). Also add some syntax checks for situations which
formerly lead to vstack leaks, see the added testcases.
Also added a test yielding a failure with the previous definition,
i.e. when using: (va_end(ap));
The test also checks potentially incorrect va_start() definition.
sometimes abstract decls in parameter lists left the returned name
uninitialized potentially leading to segfaults, like in
int f(int ()) {
return 0;
}
Deal with this.
like on 'enum myenum { L = -1 } L;'. It's a bit tedious as
there are two paths (for global vs local symbols), and because
the scope and enum_val share same storage.
when parsing the type in this cast:
(int (__attribute__(X) *)(int))foo
we ignored the attribute, which matters if it's e.g. a 'stdcall'
attribute on the function pointer. Only this particular placement
was misparsed. Putting the attribute after the '*' or outside the inner
parens worked. This idiom seems to be used on SQLite, perhaps this
fixes a compilation problem on win32 with that.
like qualifier merging the array sizes are merged as well
for the operands of ?:, and they must not statically influence
the types of decls.
Also fix some style problems, spaces aren't stinky :)
char **argv;
_Generic(argv, char**: (void)0);
_Generic(0?(char const*)0:argv[0], char const*: (void)0);
_Generic(argv, char**: (void)0);
would fail because the type of argv would get modified by the
ternary. Now allocate a separate type on the value stack to
prevent this error.
tcc would reject e.g.,
void f(){ struct {_Bool x:_Generic(({0;}),default:1);} my_x; }
with `expected constant`. This patch makes it accept it.
(The patch also makes tcc's _Generic a little more "generic" than that
of gcc and clang in that that tcc now also accepts
`struct {_Bool x:_Generic(({0;}),default:1);} my_x;` in file scope
while gcc and clang don't, but I think there's no harm in that
and gcc and clang might as well accept it in filescope too, given
that they have no problem with
e.g., `/*filescope:*/int x=1, y=2, z=_Generic(x+y, int:3);`)
The type within the cast (int (__attribute__((foo)) *)(void))
was misparsed because of the presence of the attribute (parse_btype
prematurely concluded that (__attribute__() *) is a type.
Also see testcase. This construct is used in sqlite it seems.
linkers don't treat relocations using symindex 0 (undefined)
very well, it can't be misused as indicator for an absolute number.
Just don't bother with special casing this, rather emit an indirect
call/jump right away. ARM64 needs the same (and didn't handle
calls via constant absolute func pointers before).
The testcase as is doesn't fail without the patch, it actually
needs separate compilation (to -fPIC .o file, then to shared lib)
to fail.
misc fixes including:
- tcc.c: fix "tcc -vv" for libtcc1.a on win32/PE
- tccelf.c: fix a crash when GOT has no relocs (witn -nostdlib)
- tccelf.c: fix stab linkage for zero n_strx
- tccgen.c: fix stdcall decoration for array parameters
int __stdcall func(char buf[10]) is _func@4 (was _func@12)
- tccgen.c: fix static variables with nocode/nodata_wanted
see tests2/96_nodata_wanted.c
- tccrun.c: align sections using sh_addralign (for reliable function_alignment)
- tests2/Makefile sort 100 after 99
- win32/include/sys/stat.h fix _stat and _wstat
- x86_64-gen.c: win64/gfunc_call: fix a bug with xmmN register args
previously overwrote valid other xmmN registers eventually
which requires being able to emit an arbitrary number of NOP
instructions, which is also implemented here. For x86 we
could emit other sequences but these are the easiest.
This patch makes tcc ignore them.
Normally (as per the C standard), They should
be only applicable inside parameter arrays
and affect (const/restrict) the pointer the
array gets converted to.
[matz: fix formatting, add volatile handling, add testcase,
add comment about above deficiency]
Code like:
#include <signal.h>
int main() { _Generic(signal, int: 0); }
should fail with
error: type 'extern void (*(int, void (*)(int)))(int)' does not match any association
not
error: type 'extern void *(int)(int, void *(int))' does not match any association
[matz: fix formatting, fix function-to-pointer decay for operands of
_Generic, add testcase for this]
the rules for constant expressions in static initializers are more
relaxed than for integer constant expressions. We need to accept
0.0/0.0 in static initializers (in non-static initializers the potential
exceptions need to be raised though, so no translation-time calculation
then).
tccgen.c:
- fix ldouble asm hack
- fix a VLA problem on Win64 (also x86_64-gen.c)
- patch_type(): make sure that no symbol ever changes
from global to static
tcc.c:
- tcc -vv: print libtcc1.a path also on win32
tccpe.c, tcctools.c:
- use unix LF mode to for .def output files (that is for
creating reproducible output trees)
Makefile:
- suppress some warnings when makeinfo is missing
- call 'which install' only on win32
tests/Makefile:
- change PATH only on WINNT systems (i.e. not if cross-compiling
on linux for win32)
- asm-c-connect.test: slim output and do diff
tccrun.c tccpe.c *-link.c:
- integrate former 'pe_relocate_rva()' into normal relocation
This also fixes linkage of the unwind data on WIN64 for -run
(reported by Janus Lynggaard Thorborg)
tccasm.c, tests/tcctest.c:
- fix dot (sym_index of -1 crashed in put_elf_reloc)
- massage .set a bit (see test)
other:
- #define SECTION_ABS removed
- ST_DATA Section *strtab_section: removed
- put_extern_sym2(): take int section number
Conflicts:
tccelf.c
tccpe.c
Conflicts:
tccelf.c
See testcase (from grischka). If the asm has no .globl,
but there's a (non-static) C definition the symbol should
be exported, even if the first reference comes from asm.
The length suffix for cmovCC isn't necessary as the required register
operands always allow length deduction. But let's be nice to users
and accept them anyway. Do that without blowing up tables, which means
we don't detect invalid suffixes for the given operands, but so be it.
This makes the asm symbols use the same members as the C symbols
for global decls, e.g. using the ELF symbol to hold offset and
section. That allows us to use only one symbol table for C and
asm symbols and to get rid of hacks to synch between them.
We still need some special handling for symbols that come purely
from asm sources.
See testcase. The C and asm symtab are still separate,
but integrated tighter: the asm labels are only synched at file
end, not after each asm snippet (this fixes references from one
to another asm block), the C and asm syms are synched both ways,
so defining things in asm and refering from C, or the other way
around works. In effect this model reflects what happens with
GCC better.
For this the asm labels aren't using the C label namespace anymore,
but their own, which increases the size of each TokenSym by a pointer.
one some systems GCC defaults to PIC/PIE code which is incompatible
with a unannotated asm call to a function (getenv here). TCC doesn't
support these PIC annotations (yet), so play some pre-processor games.
fixes the problem in the testcase. A symbolic reference
from asm, which remains undefined at the end of processing is
always a global reference, not a static (STB_LOCAL) one.
This also affected the linux kernel.
win32/Makefile ("for cygwin") removed
- On cygwin, the normal ./configure && make can be used with either
cygwin's "GCC for Win32 Toolchain"
./configure --cross-prefix=i686-w64-mingw32-
or with an existing tcc:
./configure --cc=<old-tccdir>/tcc.exe
tcctest.c:
- exclude test_high_clobbers() on _WIN64 (does not work)
tests2/95_bitfield.c:
- use 'signed char' for ARM (where default 'char' is unsigned)
tests:
- remove -I "expr" diff option to allow tests with
busybox-diff.
libtcc.c, tcc.c:
- removed -iwithprefix option. It is supposed to be
combined with -iprefix which we don't have either.
tccgen.c:
- fix assignments and return of 'void', as in
void f() {
void *p, *q;
*p = *q:
return *p;
}
This appears to be allowed but should do nothing.
tcc.h, libtcc.c, tccpp.c:
- Revert "Introduce VIP sysinclude paths which are always searched first"
This reverts commit 1d5e386b0a.
The patch was giving tcc's system includes priority over -I which
is not how it should be.
tccelf.c:
- add DT_TEXTREL tag only if text relocations are actually
used (which is likely not the case on x86_64)
- prepare_dynamic_rel(): avoid relocation of unresolved
(weak) symbols
tccrun.c:
- for HAVE_SELINUX, use two mappings to the same (real) file.
(it was so once except the RX mapping wasn't used at all).
tccpe.c:
- fix relocation constant used for x86_64 (by Andrei E. Warentin)
- #ifndef _WIN32 do "chmod 755 ..." to get runnable exes on cygwin.
tccasm.c:
- keep forward asm labels static, otherwise they will endup
in dynsym eventually.
configure, Makefile:
- mingw32: respect ./configure options --bindir --docdir --libdir
- allow overriding tcc when building libtcc1.a and libtcc.def with
make XTCC=<tcc program to use>
- use $(wildcard ...) for install to allow installing just
a cross compiler for example
make cross-arm
make install
- use name <target>-libtcc1.a
build-tcc.bat:
- add options: -clean, -b bindir
tccgen.c:
doubles need to be aligned, on ARM. The section_reserve()
in init_putv does not do that.
-D ONE_SOURCE: is now the default and not longer needed. Also,
tcc.h now sets the default native target. These both make
compiling tcc simple as "gcc tcc.c -o tcc -ldl" again.
arm-asm.c:
enable pseudo asm also for inline asm
tests/tests2/Makefile:
disable bitfield tests except on windows and x86_64
and don't generate-always
tcc.c:
fix a loop with -dt on errors
configure:
print compiler version (as recognized)
tccpp.c:
actually define symbols for tcc -dt
clear static variables (needed for -dt or libtcc usage)
96_nodata_wanted.c:
use __label__ instead of asm
lib/files:
use native symbols (__i386__ etc.) instead of TCC_TARGET_...
* check that _Generic don't match unsigned char * with char *
this case is usefull as with -funsigned-char, 'char *' are unsigned
* change VT_LONG so it's now a qualifier
VT_LONG are never use for code generation, but only durring parsing state,
in _Generic we need to be able to make diference between
'long' and 'long long'
So VT_LONG is now use as a type qualifier, it's old behaviour is still
here, but we can keep trace of what was a long and what wasn't
* add TOK_CLONG and TOK_CULONG
tcc was directly converting value like '7171L' into TOK_CLLONG or
TOK_CINT depending of the machine architecture.
because of that, we was unable to make diference between a long and a
long long, which doesn't work with _Generic.
So now 7171L is a TOK_CLONG, and we can handle _Generic properly
* check that _Generic can make diference between long and long long
* uncomment "type match twice" as it should now pass tests on any platforms
* add inside_generic global
the point of this variable is to use VT_LONG in comparaison only
when we are evaluating a _Generic.
problem is with my lastest patchs tcc can now make the diference between
a 'long long' and a 'long', but in 64 bit stddef.h typedef uint64_t as
typedef signed long long int int64_t and stdint.h as unsigned long int, so tcc
break when stdint.h and stddef.h are include together.
Another solution woud be to modifie include/stddef.h so it define uint64_t as
unsigned long int when processor is 64 bit, but this could break some
legacy code, so for now, VT_LONG are use only inside generc.
* check that _Generic parse first argument correctly
* check that _Generic evaluate correctly exresion like "f() / 2"
* -dt now with lowercase t
* test snippets now separated by real preprocessor statements
which is valid C also for other compilers
#if defined test_xxx
< test snippet x >
#elif defined test_yyy
< test snippet y >
#elif ...
#endif
* simpler implementation, behaves like -run if no 'test_...' macros
are seen, works with -E too
* for demonstration I combined some of the small tests for errors
and warnings (56..63,74) in "60_errors_and_warnings.c"
Also:
* libtcc.c:
put tcc_preprocess() and tcc_assemble() under the setjmp clause
to let them return to caller after errors. This is for -dt -E.
* tccgen.c:
- get rid of save/restore_parse_state(), macro_ptr is saved
by begin_macro anyway, now line_num too.
- use expr_eq for parsing _Generic's controlling_type
- set nocode_wanted with const_wanted. too, This is to keep
VT_JMP on vtop when parsing preprocessor expressions.
* tccpp.c: tcc -E: suppress trailing whitespace from lines with
comments (that -E removes) such as
NO_GOTPLT_ENTRY,\t /* never generate ... */
The existing variable 'nocode_wanted' is now used to control
output of static data too. So...
(nocode_wanted == 0)
code and data (normal within functions)
(nocode_wanted < 0)
means: no code, but data (global or static data)
(nocode_wanted > 0)
means: no code and no data (code and data suppressed)
(nocode_wanted & 0xC0000000)
means: we're in declaration of static data
Also: new option '-dT' to be used with -run
tcc -dT -run file.c
This will look in file.c for certain comment-boundaries:
/*-* test-xxx: ...some description */
and then for each test below run it from memory. This way
various features and error messages can be tested with one
single file. See 96_nodata_wanted.c for an example.
Also: tccgen.c: one more bitfield fix
tccpp.c:
* #pragma comment(option,"-some-option")
to set commandline option from C code. May work only
for some options.
libtcc.c:
* option "-d1..9": sets a 'g_debug' global variable.
(for development)
tests2/Makefile:
* new make targets: tests2.37 / tests2.37+
run single test from tests2, optionally update .expect
* new variable GEN-ALWAYS to always generate certain .expects
* bitfields test
tccgen.c:
* bitfields: fix a bug and improve slightly more
* _Generic: ignore "type match twice"
The assembler uses the ->sym_scope member to walk up the symbols
until finding a non-automatic symbol. Since reordering the
members of Sym the sym_scope member contains a scope even for local
statics. Formerly the use of asm_label for statics was implicitely
clearing sym_scope, now we have to do that explicitely.
Add a testcase for that, and one I encountered when moving the
clearing of sym_scope too deep into the call chain (into put_extern_sym).
Like returned local variables also labels local to a statement expression
can be returned, and so their symbols must not be immediately freed
(though they need to be removed from the symbol table).
Use 2 level strategy to access packed bitfields cleanly:
1) Allow to override the original declaration type with
an auxilary "access type". This solves cases such as
struct {
...
unsigned f1:1;
};
by using VT_BYTE to access f1.
2) Allow byte-wise split accesses using two new functions
load/store_packed_bf. This solves any cases, also ones
such as
struct __attribute((packed)) _s {
unsigned x : 12;
unsigned char y : 7;
unsigned z : 28;
unsigned a: 3;
unsigned b: 3;
unsigned c: 3;
};
where for field 'z':
- VT_INT access from offset 2 would be unaligned
- VT_LLONG from offset 0 would go past the total
struct size (7)
and for field 'a' because it is in two bytes and
aligned access with VT_SHORT/INT is not possible.
Also, static bitfield initializers are stored byte-wise always.
Also, cleanup the struct_layout function a bit.
That means: we do not macro-expand the argument of 'defined'
Also: store the last token position into the TokenString
structure in order to get rid of the tok_last function.
This requires one more change in how macro arguments are expanded:
the standard requires that macro args are expanded before substituting
into the replacement list. This implies expanding them only once
even when they occur multiple times in the list. TCC expanded
them repeatedly in that case. Without __COUNTER__ that's harmless.
So, simply always expand arguments (when used without # and ##) once
and store the resulting tokens.
See added testcase 19.c from a bug report. The problem is reading outside
the arguments buffers, even though we aren't allowed to. The single
can_read_stream variable is not enough, sometimes we need to be able
to pop into outer contexts but not into arbitrarily outside contexts.
The trick is to terminate argument tokens with a EOF (and not just with
0 that makes us pop contexts), and deal with that in the few places
we have to,
This enables some cleanups of the can_read_stream variable use.
If there were more than 6 integer arguments before the ellipsis, or
there were used more than 8 slots used until the ellipsis (e.g. by
a large intermediate struct) we generated wrong code. See testcase.