This requires one more change in how macro arguments are expanded:
the standard requires that macro args are expanded before substituting
into the replacement list. This implies expanding them only once
even when they occur multiple times in the list. TCC expanded
them repeatedly in that case. Without __COUNTER__ that's harmless.
So, simply always expand arguments (when used without # and ##) once
and store the resulting tokens.
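A minimal sketch of why single expansion matters with __COUNTER__ (hypothetical macro, not the actual testcase):
#define TWICE(x) ((x) + (x))
int v = TWICE(__COUNTER__);
/* With the argument expanded once before substitution, both uses see the
   same value: ((0) + (0)).  Re-expanding per occurrence would instead
   yield ((0) + (1)). */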
See added testcase 19.c from a bug report. The problem is reading outside
the argument buffers, even though we aren't allowed to. The single
can_read_stream variable is not enough; sometimes we need to be able
to pop into outer contexts but not into arbitrary outside contexts.
The trick is to terminate argument tokens with an EOF (and not just with
0, which makes us pop contexts), and deal with that in the few places
we have to.
This enables some cleanups of the can_read_stream variable use.
If there were more than 6 integer arguments before the ellipsis, or
more than 8 slots were used up to the ellipsis (e.g. by
a large intermediate struct), we generated wrong code. See testcase
and the sketch below.
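A minimal sketch of the failing shape (hypothetical example, not the actual testcase):
#include <stdarg.h>
#include <stdio.h>

/* 7 integer arguments before the ellipsis: more than the 6 register slots. */
static int sum(int a1, int a2, int a3, int a4, int a5, int a6, int a7, ...)
{
    va_list ap;
    int extra;
    va_start(ap, a7);
    extra = va_arg(ap, int);
    va_end(ap);
    return a1 + a2 + a3 + a4 + a5 + a6 + a7 + extra;
}

int main(void)
{
    printf("%d\n", sum(1, 2, 3, 4, 5, 6, 7, 8));   /* expect 36 */
    return 0;
}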
- configure:
- add --config-uClibc,-musl switch and suggest using
  it if uClibc/musl is detected
- make the warning-options magic clang compatible
- simplify (use $confvars instead of individual options)
- Revert "Remove some unused-parameter lint"
7443db0d5f
rather use -Wno-unused-parameter (or just not -Wextra)
- #ifdef functions that are unused on some targets
- tccgen.c: use PTR_SIZE==8 instead of (X86_64 || ARM64)
- tccpe.c: fix some warnings
- integrate dummy arm-asm better
For integer promotion with, for example, arithmetic or
expr_cond (x ? y : z), integral types need to be promoted
to signed int if they fit.
According to latest standards, this also applies to bit-field
types taking into account their specific width.
In tcc, VT_BITFIELD set means width < original type width;
field widths between 33 and 63 are accordingly promoted to
signed long long.
#include <stdio.h>
struct { unsigned long long ullb:35; } s = { 1 };
#define X (s.ullb - 2)
int main (void)
{
    long long Y = X;
    printf("%d %016llx %016llx\n", X < 0, -X, -Y);
    return 0;
}
Results:
GCC 4.7 : 0 0000000000000001 FFFFFFF800000001
MSVC : 1 0000000000000001 0000000000000001
TCC : 1 0000000000000001 0000000000000001
Also, gcc would promote long long bitfields of size < 32
to int as well. Example:
struct { unsigned long long x:20; } t = { 123 };
/* with gcc: */ printf("%d %d\n", t.x, 456);
/* with tcc: */ printf("%lld %d\n", t.x, 456);
* tccgen: re-allow long double constants for x87 cross
sizeof (long double) may be 12 or 16 depending on host platform
(i386/x86_64 on unix/windows).
Except that it's 8 if the host is windows and tcc was
not compiled with gcc.
* win64: fix builtin_va_start after VT_REF removal
See also a8b83ce43a
* tcctest.c: remove outdated limitation for ll-bitfield test
It always worked; there is no reason why it should not work
in the future.
* libtcc1.c: exclude long double conversion on ARM
* Makefile: remove CFLAGS from link recipes
* lib/Makefile: use target DEFINES as passed from main Makefile
* lib/armflush.c lib/va_list.c: factor out from libtcc1.c
- arm-gen.c: disable "deprecated" warnings for now
'extern int i = 42;' at file scope (but not in function scope!) is
allowed and is a proper definition, even though questionable style;
some compilers warn about this.
See the added testcase. When one used designators like .a.x to initialize
sub-members of members and then didn't initialize all of them, the
required zero-initialization of the other sub-members wasn't done.
The fix also enables tiny code cleanups.
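A small example of the scenario (hypothetical, not the added testcase itself):
#include <stdio.h>

struct Inner { int x, y; };
struct Outer { struct Inner a; int b; };

/* Only .a.x is given; .a.y and .b must still be zero-initialized. */
struct Outer o = { .a.x = 1 };

int main(void)
{
    printf("%d %d %d\n", o.a.x, o.a.y, o.b);   /* expect: 1 0 0 */
    return 0;
}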
See testcase. If an enum has only positive values and fits in N bits,
and it is placed in an N-bit bit-field, then that bit-field must be
treated as unsigned, not signed.
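A sketch of the rule (enum bit-fields are implementation-defined, so this is only illustrative):
enum E { A = 0, B = 3 };              /* only positive values, fits in 2 bits */
struct S { enum E e : 2; };

int check(void)
{
    struct S s = { B };
    return s.e == B;                  /* must be 1; a signed 2-bit field would read -1 */
}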
This invalid function definition:
int f()[] {}
was supposed to be handled, but there was no testcase showing that the
handling actually worked. This fixes it and adds a TCC-only testcase.
Factor code a bit for transforming tokens into SValues. This revealed
a bug in TOK_GET (see testcase), which happened to be harmless before.
So fix that as well.
Our code generation assumes that it can load/store with the
bit-fields base type, so bit_pos/bit_size must be in range for this.
We could change the field's type or adjust offset/bit_pos; we do the
latter.
Checked the lcc testsuite for bitfield stuff (in cq.c and fields.c),
fixed one more error in initializing unnamed members (which have
to be skipped), removed the TODO.
- configure/Makefiles: minor adjustments
- build-tcc.bat: add -static to gcc options
(avoids libgcc_s*.dll dependency with some mingw versions)
- tccpe.c/tcctools.c: eliminate MAX_PATH
(not available for cross compilers)
- tccasm.c: use uint64_t/strtoull in unary()
(unsigned long sometimes is only uint32_t, as always on windows)
- tccgen.c: Revert (f077d16c) "tccgen: gen_cast: cast FLOAT to DOUBLE"
Was a rather experimental, tentative commit, not really necessary
and somewhat ugly too.
- cleanup recent osx support:
- Makefile/libtcc.c: cleanup copy&paste code
- tccpp.c: restore deleted function
Arg substitution leaves a placeholder marker in the stream for
empty arguments. Those need to be skipped when searching for
a fn-like macro invocation in the replacement list itself. See
testcase and the sketch below.
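A sketch of the situation (hypothetical macros, not the actual testcase):
#define ADD(a, b) ((a) + (b))
#define CALL(x) ADD x (1, 2)
int v = CALL();
/* The empty argument leaves a placeholder between ADD and '(' in the
   replacement list; it must be skipped so that ADD is still recognized
   as a function-like macro invocation, giving ((1) + (2)). */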
Commit bb93064 changed the path separator from ':' to ';', which was
likely accidental. While the path separator on Windows is generally ';', the
Makefile clearly expects a posix-y shell, and in such environments the
separator is ':'.
This fixes the test run in MSYS2 and MSYS(1) environments, which was
broken by bb93064.
Supports building cross compilers on the fly without the need
for configure --enable-cross
$ make cross # all compilers
$ make cross-TARGET # only TARGET-compiler & its libtcc1.a
where TARGET is one of
i386 x86_64 i386-win32 x86_64-win32 arm arm64 arm-wince c67
Type 'make help' for more information
The avoidance of mov im32->reg64 wasn't working when reg64 was rax.
While fixing this also fix instructions which had the REX prefix
hardcoded in opcode and so didn't support extended registers which
would have added another REX prefix.
Forgot about it. It allows compiling several
sources (and other .o's) into one single .o file:
tcc -r -o all.o f1.c f2.c f3.S o4.o ...
Also:
- option -fold-struct-init-code removed, no effect anymore
- (tcc_)set_environment() moved to tcc.c
- win32/lib/(win)crt1 minor fix & add dependency
- debug line output for asm (tcc -c -g xxx.S) enabled
- configure/Makefiles: x86-64 -> x86_64 changes
- README: cleanup
usage:
tcc -ar [rcsv] lib files...
tcc -impdef lib.dll [-v] [-o lib.def]
also:
- support more files with -c: tcc -c f1.c f2.c ...
- fix a bug which caused tcc f1.c f2.S to produce no asm
- allow tcc -ar @listfile too
- change prototype: _void_ tcc_set_options(...)
- apply -Wl,-whole-archive when a library is given
  as libxxx.a too (not just for -lxxx)
- lib/Makefile: add (win)crt1_w.o
- crt1.c/_runtmain: return to tcc & only use for UNICODE
  (because it might not be 100% reliable with, for example,
  wildcards: tcc *.c -run ...)
- tccrun.c/tccpe.c: load -run startup_code only if called
from tcc_run(). Otherwise main may not be defined. See
libtcc_test.c
- tests2/Makefile: pass extra options in FLAGS to allow
overriding TCC
Also:
- tccpe.c: support weak attribute. (I first tried to solve
the problem above by using it but then didn't)
Also:
- in tests: generate .expect files only if not yet present,
because
1) some files were adjusted manually
2) switching git branches might change timestamps and
cause unwanted updates
tests/Makefile: fix out-of-tree build issues
Also:
- win64: align(16) MEM_DEBUG user memory
On win64, the jmp_buf in the TCCState structure, which we
allocate with tcc_malloc, needs alignment 16 because the msvcrt
setjmp uses MMX instructions.
- libtcc_test.c: win32/64 need __attribute__((dllimport)) for
extern data objects
- tcctest.c: exclude stuff that gcc does not compile
Except for relocation_test(), the other issues are mostly ASM
related. We should probably check GCC versions, but I have
no idea which mingw/gcc versions support what and which don't.
- lib/Makefile: use tcc to compile libtcc1.a (except on arm,
  which needs arm-asm)
Some more subtle issues with code suppression:
- outputting asms but not their operand setup is broken
- but global asms must always be output
- statement expressions are transparent to code suppression
- vtop can't be transformed from VT_CMP/VT_JMP when nocode_wanted
Also remove .exe files from tests2 if they don't fail.
Also ...
tcctest.c:
- exclude stuff that gcc doesn't compile on windows.
libtcc.c/tccpp.c:
- use unsigned for memory sizes to avoid printf format warnings
- use "file:line: message" to make IDE error parsers happy.
tccgen.c: fix typo
On 32-bit platforms, long long support was sometimes broken. This fixes
code-gen for long long values in switches, disables an x86-64 specific
testcase and avoids an undefined shift amount. It comments out
a bitfield test involving long long bitfields > 32 bit; with GCC layout
they can straddle multiple words and code generation isn't prepared
for this.
When an alignment is explicitly given on the member itself,
or in its type's attributes, then always respect it. Before, it was
only allowed to increase the alignment, but GCC allows it either way.
The linux kernel has some structures that are page aligned,
i.e. 4096. Instead of enlarging the bit fields to specify this,
use the fact that alignment is always a power of two, and store only
the log2 minus 1 of it. The 5 bits are enough to specify an alignment
of 1 << 30.
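A minimal sketch of such an encoding (hypothetical helper names, assuming explicit alignments of at least 2; not TCC's actual code):
#include <assert.h>

static unsigned pack_align(unsigned long align)   /* align: power of two >= 2 */
{
    unsigned l = 0;
    while ((1ul << l) < align)
        l++;
    return l - 1;                                  /* log2 minus 1, fits in 5 bits */
}

static unsigned long unpack_align(unsigned packed)
{
    return 1ul << (packed + 1);
}

int main(void)
{
    assert(unpack_align(pack_align(4096)) == 4096);
    assert(unpack_align(pack_align(1ul << 30)) == (1ul << 30));
    return 0;
}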
Such struct decl:
struct S { char a; int i;} __attribute__((packed));
should be accepted and cause S to be five bytes long (i.e.
the packed attribute should matter). So we can't lay out
the members already during parsing. Split off the offset
and alignment calculation for this.
See testcases. We now support 64-bit case constants, and at the same time
also 64-bit enum constants on L64 platforms (otherwise the Sym struct
isn't large enough for now). The testcase also checks for various
cases where sign/zero extension was confused.
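An illustrative example of a 64-bit case constant (not the actual testcase):
#include <stdio.h>

int main(void)
{
    unsigned long long v = 0x100000001ULL;
    switch (v) {
    case 0x100000001ULL: puts("ok"); break;      /* needs a full 64-bit comparison */
    default:             puts("bad"); break;
    }
    return 0;
}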
See testcase. We must always paste tokens (at least if not
currently substituting a normal argument, which now is only a speed
optimization) but at the same time must not regard a ## token
coming from argument expansion as the token-paste operator, nor
if we constructed a ## token due to pasting itself (that was already
checked by pp/01.c).
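For reference, the two situations look roughly like this (preprocessor-only sketches):
/* A ## coming from an argument must not act as the paste operator: */
#define id(x) x
id(a ## b)              /* -E output should keep "a ## b", not paste to "ab" */

/* A ## constructed by pasting (already covered by pp/01.c; C99 6.10.3.3): */
#define hash_hash # ## #
#define mkstr(a) # a
#define in_between(a) mkstr(a)
#define join(c, d) in_between(c hash_hash d)
char p[] = join(x, y);  /* expands to "x ## y" */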
In certain very specific situations (involving switches
with asms inside dead statement expressions) we could generate
invalid code (clobbering the buffer so much that we generated
invalid instructions). Don't emit the decision table if the
switch itself is dead.
The callee saved registers (among them r12-r15) really need
saving/restoring if mentioned in asm clobbers, even if TCC
itself doesn't use them. E.g. the linux kernel relies on that
in its switch_to() implementation.
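A hedged x86-64 illustration of such a clobber:
long touch_r12(long x)
{
    /* r12 is callee-saved; listing it as a clobber must force the compiler
       to save and restore it around this asm, even if it never allocates it. */
    __asm__ volatile ("movq %0, %%r12" : : "r"(x) : "r12");
    return x;
}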
When initializing members where the initializer needs relocations
and the member is initialized multiple times we can't allow
that to lead to multiple relocations to the same place. The last
one must win.
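A small illustration of the case (hypothetical, not the added testcase):
int a, b;
/* Both initializers need relocations; the last one must win, so s.p
   ends up pointing to b with exactly one effective relocation. */
struct P { int *p; } s = { .p = &a, .p = &b };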
Similar to GCC a local asm register variable enforces the use of a
specified register in asm operands (and doesn't otherwise
matter). Works only if the variable is directly mentioned as
operand. For that we now generally store a backpointer from
an SValue to a Sym when the SValue was the result of unary()
parsing a symbol identifier.
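A rough i386 example of the feature (it works only because the variable itself is the operand):
int inc_in_eax(int v)
{
    register int r __asm__("eax") = v;   /* pins the operand to %eax */
    __asm__ ("incl %0" : "+r"(r));
    return r;
}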
If the destination is an indirect pointer access (which ends up
as VT_LLOCAL) the intermediate pointer must be loaded as VT_PTR,
not as whatever the pointed-to type is.
If a condition is always zero/non-zero we can omit the
then or else code. This is complicated a bit by having to
deal with labels that might make such code reachable without
us yet knowing during parsing.
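An example of the complication with labels:
int f(int x)
{
    if (0) {                /* condition is always zero ...              */
lab:    return 1;           /* ... but the goto below still reaches this */
    }
    if (x)
        goto lab;
    return 0;
}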
In statement expressions we really mustn't emit backward jumps
under nocode_wanted (they will form infinite loops as no expressions
are evaluated). Do-while and explicit loops with gotos weren't
handled.
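A sketch of the previously unhandled shape (a GNU statement expression under a dead branch):
int f(void)
{
    if (0)
        return ({ int i = 0; do { ++i; } while (i < 3); i; });
    return -1;
}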
This happens when e.g. string constants (or other static data)
are passed as operands to inline asm as immediates. The produced
symbol ref wouldn't be found. So tighten the connection between
C and asm-local symbol table even more.
Our preprocessor throws away # line-comments in asm mode.
It also did so inside preprocessor directives, thereby
removing stringification. Parse defines in non-asm mode (but
retain '.' as identifier character inside macro definitions).
The return value of statement expressions might refer to local
symbols, so those can't be popped. The old error message was always
just a band-aid, and since it was disabled for pointer types it
wasn't effective anyway. It also never considered that the
vtop->sym member might refer to such symbols as well (see the
testcase with the local static, which used to segfault).
For fixing this (can be seen better with valgrind and SYM_DEBUG)
simply leave local symbols of stmt exprs on the stack.
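The local-static shape mentioned above looks roughly like this:
int *addr_of_static(void)
{
    return ({ static int s = 42; &s; });   /* the value's sym refers to a local symbol */
}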
The include directive needs to be parsed as pp-tokens, not
as tokens (i.e. no conversion to TOK_STR or TOK_NUM). Also fix
parsing computed includes using quoted strings.
That, as well as "sym = expr", if expr contains symbols.
Slightly tricky because a definition from .set is overridable,
whereas proper definitions aren't.
This doesn't yet allow using this for override tricks from C
and global asm blocks because the symbol tables from C and asm
are separate.
But, like GCC, do warn about changes in signedness. The latter
leads to some changes in gen_assign_cast to not also warn about
unsigned* = int*
(where GCC warns, but only with extra warnings).
This requires correctly handling the REX prefix.
As a bonus we now also support the four 8-bit registers
spl, bpl, sil, dil, which are decoded as ah, ch, dh, bh in non-long mode
(and require a REX prefix as well).
For
union U { struct { int a, b; }; int c; };
union U u = {{ 1, 2, }};
The unnamed first member of union U needs to actually exist in the
structure so initializer parsing isn't confused about the double braces.
That means also the a and b members must be part of _that_, not of
union U directly. Which in turn means we need to do a bit more work
for field lookup.
See the testcase extension for more things that need to work.
Remove dead code and variables. Properly check for unions when
skipping fields in initializers. Make tests2/*.expect depend
on the .c files so they are automatically rebuilt when the latter
change.
E.g. "struct { struct S s; int a;} = { others, 42 };"
if 'others' is also a 'struct S'. Also when the value is a
compound literal. See added testcases.
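Spelled out (hypothetical types, mirroring the added testcases):
struct S { int x, y; };
struct T { struct S s; int a; };

void init_examples(void)
{
    struct S others = { 1, 2 };
    struct T v = { others, 42 };               /* member initialized from a struct value */
    struct T w = { (struct S){ 3, 4 }, 42 };   /* ... or from a compound literal */
    (void)v; (void)w;
}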
This snippet is valid:
void foo(void);
... foo + 42 ...
the function designator is converted to a pointer to function
implicitly. gen_op didn't do that and bailed out.
This must compile:
typedef int arrtype1[];
arrtype1 sinit19 = {1};
arrtype1 sinit20 = {2,3};
and generate two arrays of one resp. two elements. Before the fix
the determined size of the first array was encoded in the type
directly, so sinit20 couldn't be parsed anymore (because arrtype1
was thought to be only one element long).
lar can accept multiple sizes as well (wlx), like lsl. When using
autosize it's important to look at the destination operand first;
when it's a register, that operand determines the size, not the input
operand.
Given this code:
struct __attribute__((...)) Name {...};
TCC was eating "Name", hence generating an anonymous struct.
It also didn't apply any packed attributes to the parsed
members. Both fixed. The testcase also contains a case
that isn't yet handled by TCC (under a BROKEN #define).
In particular, subtracting a defined symbol from the current section
makes the value PC-relative, and .org accepts symbolic expressions
as well, if the symbol is from the current section.
When tokens in macro definitions need cstr_buf inside get_tok_str,
the second might overwrite the first (happens when tokens are
multi-character non-identifiers, see testcase) in macro_is_equal,
failing to diagnose a difference. Use a real local buffer.
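A hypothetical shape of such a redefinition (not necessarily the actual testcase):
#define M (a <= b)
#define M (a >= b)   /* differs only in multi-character operator tokens;
                        the redefinition warning must still fire */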
Now that we can express prefixes with 0x0fxx opcodes, we can correct the
movq mem64->xmm opcode, and restrict the movq xmm->mem64 form to
not invalidly accept mmx.
In particular those that are extensions of existing mmx (or sse1)
instructions by a simple 0x66 prefix. There's one caveat for
x86-64: as we don't yet correctly handle the 0xf3 prefix
the movq mem64->xmm is wrong (tested in asmtest.S). Needs
some refactoring of the instr_type member.
- generate and use SYM@PLT for plt addresses
- get rid of patch_dynsym_undef hack (no idea what it did on FreeBSD)
- use sym_attrs instead of symtab_to_dynsym
- special case for function pointers into .so on i386
- libtcc_test: test tcc_add_symbol with data object
- move target specific code to *-link.c files
- add R_XXX_RELATIVE (needed for PE)
The problem was with tcctest.c:
unsigned set;
__asm__("btsl %1,%0" : "=m"(set) : "Ir"(20) : "cc");
when tcc was compiled with the HAVE_SELINUX option and run with
tcc -run, it would use large addresses far beyond the 32-bit
range, because tcc did not use the pc-relative mode for accessing
'set' in global data memory. In fact the assembler did not
know about %rip at all.
Changes:
- memory operands use (%rax) not (%eax)
- conversion from VT_LLOCAL: use type VT_PTR
- support 'k' modifier
- support %rip register
- support X(%rip) pc-relative addresses
The test in tcctest.c is from Michael Matz.
Previously, in order to perform an ll+ll operation tcc
was trying to 'lexpand' a PTR in gen_opl, which did
not work well. The case:
int printf(const char *, ...);
char t[] = "012345678";
int main(void)
{
    char *data = t;
    unsigned long long r = 4;
    unsigned a = 5;
    unsigned long long b = 12;
    *(unsigned*)(data + r) += a - b;
    printf("data %s\n", data);
    return 0;
}
Also:
- regenerate all tests/pp/*.expect with gcc
- test "insert one space" feature
- test "0x1E-1" in asm mode case
- PARSE_FLAG_SPACES: ignore \f\v\r better
- tcc.h: move some things
tccgen.c:gv(): when loading a long long from an lvalue, it previously
saved all registers, which caused problems in the arm
function-call register parameter preparation, as with
void foo(long long y, int x);
int main(void)
{
    unsigned int *xx[1], x;
    unsigned long long *yy[1], y;
    foo(**yy, **xx);
    return 0;
}
Now only the modified register is saved if necessary,
as in this case where it is used to store the result
of the post-inc:
long long *p, v, **pp;
v = 1;
p = &v;
p[0]++;
printf("another long long spill test : %lld\n", *p);
i386-gen.c:
- found a similar problem with TOK_UMULL caused by the
  vstack juggle in tccgen:gen_opl()
  (bug seen only when using EBX as the 4th register)
This was causing assembler bugs in a tcc compiled by itself
at i386-asm.c:352 when ExprValue.v was changed to uint64_t:
    if (op->e.v == (int8_t)op->e.v)
        op->type |= OP_IM8S;
A general test case:
#include <stdio.h>
int main(int argc, char **argv)
{
    long long ll = 4000;
    int i = (char)ll;
    printf("%d\n", i);
    return 0;
}
Output was "4000", now "-96".
Also: add "asmtest2" as asmtest with tcc compiled by itself
On 2016-08-11 09:24 +0100, Balazs Kezes wrote:
> I think it's just that that copy_params() never restores the spilled
> registers. Maybe it needs some extra code at the end to see if any
> parameters have been spilled to stack and then restore them?
I've spent some time on this and I've found an alternative solution.
Although I'm not entirely sure about it but I've attached a patch
nevertheless.
And while poking at that I've found another problem affecting the
unsigned long long division on arm and I've attached a patch for that
too.
More details in the patches themselves. Please review and consider them
for merging! Thank you!
--
Balazs
[PATCH 1/2] Fix slow unsigned long long division on ARM
The macro AEABI_UXDIVMOD expands to this bit:
#define AEABI_UXDIVMOD(name,type, rettype, typemacro) \
...
while (num >= den) { \
...
while ((q << 1) * den <= num && q * den <= typemacro ## _MAX / 2) \
q <<= 1; \
...
With the current ULONG_MAX version the inner loop goes only until 4
billion so the outer loop will progress very slowly if num is large.
With ULLONG_MAX the inner loop works as expected. The current version is
probably a result of a typo.
The following bash snippet demonstrates the bug:
$ uname -a
Linux eper 4.4.16-2-ARCH #1 Wed Aug 10 20:03:13 MDT 2016 armv6l GNU/Linux
$ cat div.c
int printf(const char *, ...);
int main(void) {
unsigned long long num, denom;
num = 12345678901234567ULL;
denom = 7;
printf("%lld\n", num / denom);
return 0;
}
$ time tcc -run div.c
1763668414462081
real 0m16.291s
user 0m15.860s
sys 0m0.020s
[PATCH 2/2] Fix long long dereference during argument passing on ARMv6
For some reason the code spills the register to the stack. copy_params
in arm-gen.c doesn't expect this so bad code is generated. It's not
entirely clear why the saving part is necessary. It was added in commit
59c35638 with the comment "fixed long long code gen bug" with no further
clarification. Given that tcctest.c passes without this, maybe it's no
longer needed? Let's remove it.
Also add a new testcase just for this. After I've managed to make the
tests compile on a raspberry pi, I get the following diff without this
patch:
--- test.ref 2016-08-22 22:12:43.380000000 +0100
+++ test.out3 2016-08-22 22:12:49.990000000 +0100
@@ -499,7 +499,7 @@
2
1 0 1 0
4886718345
-shift: 9 9 9312
+shift: 291 291 291
shiftc: 36 36 2328
shiftc: 0 0 9998683865088
manyarg_test:
More discussion on this thread:
https://lists.nongnu.org/archive/html/tinycc-devel/2016-08/msg00004.html
Except
- that libtcc1.a is now installed in subdirs i386/ etc.
- the support for arm and arm64
- some of the "Darwin" fixes
- tests are mostly unchanged
Also
- removed the "legacy links for cross compilers" (was total mess)
- removed "out-of-tree" build support (was broken anyway)
- from win32/include/winapi: various .h
The winapi header set cannot be complete no matter what. So
let's have just the minimal set necessary to compile the examples.
- remove CMake support (hard to keep up to date)
- some other files
Also, drop useless changes in win32/lib/(win)crt1.c
fixes 5c35ba66c5
The implementation was consistent within tcc but incompatible
with the ABI (for example library functions such as vprintf).
Also:
- tccpp.c/get_tok_str(): avoid the 'unknown format "%llu"' warning
- x86_64_gen.c/gen_vla_alloc() : fix vstack leak
There were two errors in the arithmetic imm8 instructions. They accept
only REGW, and in case the user writes an xxxb opcode that variant
needs to be rejected as well (it's not automatically rejected by REGW
when the destination is memory).
Two things: negative constants were rejected (e.g. "add $-15,%eax").
Second, the insn order was such that the arithmetic IM8S forms
weren't used (always the IM32 ones). Switching them prefers those
but requires a fix for size calculation in case the opcodes were
OPC_ARITH and OPC_WLX (whose size starts with 1, not zero).
Fix it to actually be able to parse 64bit immediates (enlarge
operand value type). Then, generally there's no need for accepting
IM64 anywhere, except in the 0xba+r mov opcodes, so OP_IM is
unnecessary, as is OPT_IMNO64. Improve the generated code a bit
by preferring the 0xc7 opcode for im32->reg64, instead of the
im64->reg64 form (which we therefore hardcode).
A bag of assembler fixes, to be either compatible with GAS
(e.g. order of 'test' operands), accept more instructions,
count correct foo{bwlq} variants on x86_64, fix modrm/sib bytes
on x86_64 to not use %rip relative addressing mode, to not use
invalid insns in tests/asmtest.S for x86_64.
Result is that now output of GAS and of tcc on tests/asmtest.S
is mostly the same.
Insert a space when it is required to prevent mistokenisation of
the output, and also in a few cases where it is not strictly
required, imitating GCC's behaviour.
* correct -E output for the case ++ + ++ concatenation
do this only for strings expanded from a macro
and only when tcc_state->output_type == TCC_OUTPUT_PREPROCESS
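A preprocessor-only illustration of the space insertion (hypothetical snippet):
#define P ++
a P + P b        /* -E must print "a ++ + ++ b", not let the '+' tokens merge */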
From gcc docs: "You may also specify attributes between the enum, struct or union tag and the name of the type rather than after the closing brace."
Adds `82_attribs_position.c` in `tests/tests2`
made like in pcc
(pcc.ludd.ltu.se/ftp/pub/pcc-docs/pcc-utf8-ver3.pdf)
We treat all chars with the high bit set as alphabetic.
This allows code like
#include <stdio.h>
int Lefèvre=2;
int main() {
printf("Lefèvre=%d\n",Lefèvre);
return 0;
}