The existing variable 'nocode_wanted' is now used to control
output of static data too. So...
(nocode_wanted == 0)
code and data (normal within functions)
(nocode_wanted < 0)
means: no code, but data (global or static data)
(nocode_wanted > 0)
means: no code and no data (code and data suppressed)
(nocode_wanted & 0xC0000000)
means: we're in declaration of static data
Also: new option '-dT' to be used with -run
tcc -dT -run file.c
This will look in file.c for certain comment-boundaries:
/*-* test-xxx: ...some description */
and then for each test below run it from memory. This way
various features and error messages can be tested with one
single file. See 96_nodata_wanted.c for an example.
Also: tccgen.c: one more bitfield fix
tccpp.c:
* #pragma comment(option,"-some-option")
to set commandline option from C code. May work only
for some options.
libtcc.c:
* option "-d1..9": sets a 'g_debug' global variable.
(for development)
tests2/Makefile:
* new make targets: tests2.37 / tests2.37+
run single test from tests2, optionally update .expect
* new variable GEN-ALWAYS to always generate certain .expects
* bitfields test
tccgen.c:
* bitfields: fix a bug and improve slightly more
* _Generic: ignore "type match twice"
The assembler uses the ->sym_scope member to walk up the symbols
until finding a non-automatic symbol. Since reordering the
members of Sym the sym_scope member contains a scope even for local
statics. Formerly the use of asm_label for statics was implicitely
clearing sym_scope, now we have to do that explicitely.
Add a testcase for that, and one I encountered when moving the
clearing of sym_scope too deep into the call chain (into put_extern_sym).
Like returned local variables also labels local to a statement expression
can be returned, and so their symbols must not be immediately freed
(though they need to be removed from the symbol table).
Use 2 level strategy to access packed bitfields cleanly:
1) Allow to override the original declaration type with
an auxilary "access type". This solves cases such as
struct {
...
unsigned f1:1;
};
by using VT_BYTE to access f1.
2) Allow byte-wise split accesses using two new functions
load/store_packed_bf. This solves any cases, also ones
such as
struct __attribute((packed)) _s {
unsigned x : 12;
unsigned char y : 7;
unsigned z : 28;
unsigned a: 3;
unsigned b: 3;
unsigned c: 3;
};
where for field 'z':
- VT_INT access from offset 2 would be unaligned
- VT_LLONG from offset 0 would go past the total
struct size (7)
and for field 'a' because it is in two bytes and
aligned access with VT_SHORT/INT is not possible.
Also, static bitfield initializers are stored byte-wise always.
Also, cleanup the struct_layout function a bit.
tcc.h:
* cleanup struct 'Sym'
* include some 'Attributes' into 'Sym'
* in turn get rid of VT_IM/EXPORT, VT_WEAK
* re-number VT_XXX flags
* replace some 'long' function args by 'int'
tccgen.c:
* refactor parse_btype()
- configure
* use aarch64 instead of arm64
- Makefile
* rename the custom include file to "config-extra.mak"
* Also avoid "rm -r /*" if $(tccdir) is empty
- pp/Makefile
* fix .expect generation with gcc
- tcc.h
* cleanup #defines for _MSC_VER
- tccgen.c:
* fix const-propagation for &,|
* fix anonymous named struct (ms-extension) and enable
-fms-extension by default
- i386-gen.c
* clear VT_DEFSIGN
- x86_64-gen.c/win64:
* fix passing structs in registers
* fix alloca (need to keep "func_scratch" below each alloca area on stack)
(This allows to compile a working gnu-make on win64)
- tccpp.c
* alternative approach to 37999a4fbf
This is to avoid some slowdown with ## token pasting.
* get_tok_str() : return <eof> for TOK_EOF
* -funsigned-char: apply to "string" literals as well
- tccpe/tools.c: -impdef: support both 32 and 64 bit dlls anyway
Simple implementation, I'm not even sure to respect C standart here,
but it should work with most use case.
This add an case in unary(), and generate TokString depending of _Generic
controlling exression, use begin_macro to "push"
the generated TokString, then call unary() again before exiting the switch
so the just add token are reevaluate again.
This reverts commit d4fe9aba3f.
I was confused by the fact that string literals aren't writable.
Nevertheless the type isn't const. As extension in GCC it's const
with -Wwrite-string, which is exactly what we had before.
GCC also wonders in a comment if it's really a good idea to change
expression types based on warning flags (IMHO it's not), but let's
be compatible. So restore the state from before.
Don't make the standard mandated types of string literals
depends on warning options. Instead make them always const,
but limit the emission of the warning by that option.
Only warn if the struct has a non-zero size. You can't create objects
of zero-sized structs, but they can be used inside sizeof (e.g.
"sizeof (struct {int :0;})". The warning would always trigger for these,
but as no objects can be created no accesses can ever happen.
- configure:
- add --config-uClibc,-musl switch and suggest to use
it if uClibc/musl is detected
- make warning options magic clang compatible
- simplify (use $confvars instead of individual options)
- Revert "Remove some unused-parameter lint"
7443db0d5f
rather use -Wno-unused-parameter (or just not -Wextra)
- #ifdef functions that are unused on some targets
- tccgen.c: use PTR_SIZE==8 instead of (X86_64 || ARM64)
- tccpe.c: fix some warnings
- integrate dummy arm-asm better
For integer promotion with for example arithmetics or
expr_cond (x ? y : z), integral types need to be promoted
to signed if they fit.
According to latest standards, this also applies to bit-field
types taking into account their specific width.
In tcc, VT_BITFIELD set means width < original type width
Field-widths between 33 and 63 are promoted to signed long long
accordingly.
struct { unsigned long long ullb:35; } s = { 1 };
#define X (s.ullb - 2)
int main (void)
{
long long Y = X;
printf("%d %016llx %016llx\n", X < 0, -X, -Y);
return 0;
}
Results:
GCC 4.7 : 0 0000000000000001 FFFFFFF800000001
MSVC : 1 0000000000000001 0000000000000001
TCC : 1 0000000000000001 0000000000000001
Also, gcc would promote long long bitfields of size < 32
to int as well. Example:
struct { unsigned long long x:20; } t = { 123 };
/* with gcc: */ printf("%d %d\n", t.x, 456);
/* with tcc: */ printf("%lld %d\n", t.x, 456);
bit_pos + bit_size > type_size * 8
must NEVER happen because the code generator can read/write
only the basic integral types.
Warn if tcc has to break GCC compatibility for that reason
and if -Wgcc-compat is given.
Example:
struct __attribute__((packed)) _s
{
unsigned int x : 12;
unsigned char y : 7;
unsigned int z : 28;
};
Expected (GCC) layout (sizeof struct = 6)
.xxxxxxxx.xxxxyyyy.yyyzzzzz.zzzzzzzz.zzzzzzzz.zzzzzzz0.
But we cannot read/write 'char y'from 2 bytes in memory.
So we have to adjust:
.xxxxxxxx.xxxx0000.yyyyyyyz.zzzzzzzz.zzzzzzzz.zzzzzzzz.zzz00000
Now 'int z' cannot be accessed from 5 bytes. So we arrive
at this (sizeof struct = 7):
.xxxxxxxx.xxxx0000.yyyyyyy0.zzzzzzzz.zzzzzzzz.zzzzzzzz.zzzz0000
Otherwise the bitfield load/store generator needs to be
changed to allow byte-wise accesses.
Also we may touch memory past the struct in some cases
currently. The patch adds a warning for that too.
0: 55 push %ebp
1: 89 e5 mov %esp,%ebp
3: 81 ec 04 00 00 00 sub $0x4,%esp
9: 90 nop
struct __attribute__((packed)) { unsigned x : 5; } b = {0} ;
a: 8b 45 ff mov -0x1(%ebp),%eax
d: 83 e0 e0 and $0xffffffe0,%eax
10: 89 45 ff mov %eax,-0x1(%ebp)
This touches -0x1 ... +0x3(%ebp), hence 3 bytes beyond
stack space. Since the data is not changed, nothing
else happens here.
* tccgen: re-allow long double constants for x87 cross
sizeof (long double) may be 12 or 16 depending on host platform
(i386/x86_64 on unix/windows).
Except that it's 8 if the host is on windows and not gcc
was used to compile tcc.
* win64: fix builtin_va_start after VT_REF removal
See also a8b83ce43a
* tcctest.c: remove outdated limitation for ll-bitfield test
It always worked, there is no reason why it should not work
in future.
* libtcc1.c: exclude long double conversion on ARM
* Makefile: remove CFLAGS from link recipes
* lib/Makefile: use target DEFINES as passed from main Makefile
* lib/armflush.c lib/va_list.c: factor out from libtcc1.c
* arm-gen.c: disable "depreciated" warnings for now
'extern int i = 42;' at file scope (but not in function scope!) is
allowed and is a proper definition, even though questionable style;
some compilers warn about this.
See the added testcase. When one used designators like .a.x to initialize
sub-members of members, and didn't then initialize all of them the
required zero-initialization of the other sub-members wasn't done.
The fix also enables tiny code cleanups.
See testcase. If an enum has only positive values, fits N bits,
and is placed in a N-bit bit-field that bit-fields must be treated
as unsigned, not signed.
This invalid function definition:
int f()[] {}
was tried to be handled but there was no testcase if it actually worked.
This fixes it and adds a TCC only testcase.
introduce common_section (SHN_COMMON), factorize some handling
in decl_initializer_alloc, add section_add and use it to factorize
some code that allocates stuff in sections (at the same time also fixing
harmless bugs re section alignment), use init_putv to emit float consts
into .data from gv() (fixing an XXX).
factor code a bit for transforming tokens into SValues. This revealed
a bug in TOK_GET (see testcase), which happened to be harmless before.
So fix that as well.
The canonical way to describe a local variable that actually holds
the address of an lvalue is VT_LLOCAL. Remove the last user of VT_REF,
and handling of it, thereby freeing a flag for SValue.r.
VT_LLOCAL is a flag on .r, not on type.t. Fixing this requires
minor surgery for compound literals which accidentally happened
to be subsumed by the bogus test.
the second argument can be an arbitrary expression (including
side-effects), not just a constant. This removes the last user
of expr_lor_const and hence also that function (and expr_land_const).
Also the argument to __builtin_constant_p can be only a non-comma
expression (like all functions arguments).
Our code generation assumes that it can load/store with the
bit-fields base type, so bit_pos/bit_size must be in range for this.
We could change the fields type or adjust offset/bit_pos; we do the
latter.
Checked the lcc testsuite for bitfield stuff (in cq.c and fields.c),
fixed one more error in initializing unnamed members (which have
to be skipped), removed the TODO.
- configure/Makefiles: minor adjustments
- build-tcc.bat: add -static to gcc options
(avoids libgcc_s*.dll dependency with some mingw versions)
- tccpe.c/tcctools.c: eliminate MAX_PATH
(not available for cross compilers)
- tccasm.c: use uint64_t/strtoull in unary()
(unsigned long sometimes is only uint32_t, as always on windows)
- tccgen.c: Revert (f077d16c) "tccgen: gen_cast: cast FLOAT to DOUBLE"
Was a rather experimental, tentative commit, not really necessary
and somewhat ugly too.
- cleanup recent osx support:
- Makefile/libtcc.c: cleanup copy&paste code
- tccpp.c: restore deleted function
Also, retain storage qualifiers in type_decl, in particular
also for function pointers. This allows to get rid of this
very early hack in decl()
type.t |= (btype.t & VT_STATIC); /* Retain "static". */
which was to fix the case of
int main() { static int (*foo)(); ...
Also:
- missing __declspec(dllimport) is an error now
- except if the symbol is "_imp__symbol"
- demonstrate export/import of data in the dll example (while
'extern' isn't strictly required with dllimport anymore)
- new function 'patch_storage()' replaces 'weaken_symbol()'
and 'apply_visibility()'
- new function 'update_storage()' applies storage attributes
to Elf symbols.
- put_extern_sym/2 accepts new pseudo section SECTION_COMMON
- add -Wl,-export-all-symbols as alias for -rdynamic
- add -Wl,-subsystem=windows for mingw compatibility
- redefinition of 'sym' error for initialized global data
Forgot about it. It allows to compile several
sources (and other .o's) to one single .o file;
tcc -r -o all.o f1.c f2.c f3.S o4.o ...
Also:
- option -fold-struct-init-code removed, no effect anymore
- (tcc_)set_environment() moved to tcc.c
- win32/lib/(win)crt1 minor fix & add dependency
- debug line output for asm (tcc -c -g xxx.S) enabled
- configure/Makefiles: x86-64 -> x86_64 changes
- README: cleanup
- tccgen.c/tcc.h: allow function declaration after use:
int f() { return g(); }
int g() { return 1; }
may be a warning but not an error
see also 76cb1144ef
- tccgen.c: redundant code related to inline functions removed
(functions used anywhere have sym->c set automatically)
- tccgen.c: make 32bit llop non-equal test portable
(probably not on C67)
- dynarray_add: change prototype to possibly avoid aliasing
problems or at least warnings
- lib/alloca*.S: ".section .note.GNU-stack,"",%progbits" removed
(has no effect)
- tccpe: set SizeOfCode field (for correct upx decompression)
- libtcc.c: fixed alternative -run invocation
tcc "-run -lxxx ..." file.c
(meant to load the library after file).
Also supported now:
tcc files ... options ... -run @ arguments ...
Some code in gen_opl was depending on a gvtst label
which in nocode_wanted mode is not set.
This was causing vstack leaks and crashes with for example
long long ll;
if (0)
return ll - 10 < 0;
Also:
- on windows i386 and x86-64, structures of size <= 8 are
NOT returned in registers if size is not one of 1,2,4,8.
- cleanup: put all tv-push/pop/swap/rot into one place
Some more subtle issues with code suppression:
- outputting asms but not their operand setup is broken
- but global asms must always be output
- statement expressions are transparent to code suppression
- vtop can't be transformed from VT_CMP/VT_JMP when nocode_wanted
Also remove .exe files from tests2 if they don't fail.
Restore ebx from *ebp because alloca might change esp.
Also disable USE_EBX for upcoming release.
Actually the benefit is less than one would expect, it
appears that tcc can't do much with more than 3 registers
except with extensive use of long longs where the disassembly
looks much prettier (and shorter also).
Also: tccgen/expr_cond() : fix wrong gv/save_regs order
Also ...
tcctest.c:
- exclude stuff that gcc doesn't compile on windows.
libtcc.c/tccpp.c:
- use unsigned for memory sizes to avoid printf format warnings
- use "file:line: message" to make IDE error parsers happy.
tccgen.c: fix typo
tccgen.c: remove any 'nocode_wanted' checks, except in
- greloca(), disables output elf symbols and relocs
- get_reg(), will return just the first suitable reg)
- save_regs(), will do nothing
Some minor adjustments were made where nocode_wanted is set.
xxx-gen.c: disable code output directly where it happens
in functions:
- g(), output disabled
- gjmp(), will do nothing
- gtst(), dto.
when an alignment is explicitely given on the member itself,
or on its types attributes then respect it always. Was only
allowed to increase before, but GCC is allowing it.
The linux kernel has some structures that are page aligned,
i.e. 4096. Instead of enlarging the bit fields to specify this,
use the fact that alignment is always power of two, and store only
the log2 minus 1 of it. The 5 bits are enough to specify an alignment
of 1 << 30.
Another corner case:
struct foo6_1
{
char x;
short p:8;
short :0;
short :0;
short p2:8;
char y;
};
In MS layout the second anon :0 bit-field does _not_ adjust size or
alignment of the struct again. The first one does, though.
Bit-fields are layed out differently in visual C, this implements
a compatible mode. Checked against Visual C/C++ 2016.
Unfortunately the GCC implementation of MS layout (behind
-mms-bitfields) actually is different, and hence not compatible
with MS in all cases :-/
Such struct decl:
struct S { char a; int i;} __attribute__((packed));
should be accepted and cause S to be five bytes long (i.e.
the packed attribute should matter). So we can't layout
the members during parsing already. Split off the offset
and alignment calculation for this.
See testcases. We now support 64bit case constants. At the same time
also 64bit enum constants on L64 platforms (otherwise the Sym struct
isn't large enough for now). The testcase also checks for various
cases where sign/zero extension was confused.
In certain very specific situations (involving switches
with asms inside dead statement expressions) we could generate
invalid code (clobbering the buffer so much that we generated
invalid instructions). Don't emit the decision table if the
switch itself is dead.
When intializing members where the initializer needs relocations
and the member is initialized multiple times we can't allow
that to lead to multiple relocations to the same place. The last
one must win.
Similar to GCC a local asm register variable enforces the use of a
specified register in asm operands (and doesn't otherwise
matter). Works only if the variable is directly mentioned as
operand. For that we now generally store a backpointer from
an SValue to a Sym when the SValue was the result of unary()
parsing a symbol identifier.
If a condition is always zero/non-zero we can omit the
then or else code. This is complicated a bit by having to
deal with labels that might make such code reachable without
us yet knowing during parsing.
Not fully thought out. You can't jump inside stmt exprs,
but you can jump out of them. So there's a difference
between undefined but declared labels at the end of stmt
exprs and those defined inside. Additionally it should
also be checked if a label defined inside a stmt expr
was tentatively created as declared from outside.
I'm not prepared doing that right now, so simply revert.
This reverts commit 9160e4cab9147d77840cc44a285031fdb4640cf9.
One can't jump into statement expressions from outside
them, like the following:
int i = ({ label: foo(); 42; });
goto label;
We reject this by making the labels simply not available
outside (GCC has a nicer error message about jumping into
a statement expression).
In statement expression we really mustn't emit backward jumps
under nocode_wanted (they will form infinte loops as no expressions
are evaluated). Do-while and explicit loop with gotos weren't
handled.
The return value of statement expressions might refer to local
symbols, so those can't be popped. The old error message always
was just a band-aid, and since disabling it for pointer types it
wasn't effective anyway. It also never considered that also the
vtop->sym member might have referred to such symbols (see the
testcase with the local static, that used to segfault).
For fixing this (can be seen better with valgrind and SYM_DEBUG)
simply leave local symbols of stmt exprs on the stack.
But like GCC do warn about changes in signedness. The latter
leads to some changes in gen_assign_cast to not also warn about
unsigned* = int*
(where GCC warns, but only with extra warnings).
For
union U { struct {int a,b}; int c; };
union U u = {{ 1, 2, }};
The unnamed first member of union U needs to actually exist in the
structure so initializer parsing isn't confused about the double braces.
That means also the a and b members must be part of _that_, not of
union U directly. Which in turn means we need to do a bit more work
for field lookup.
See the testcase extension for more things that need to work.
Remove dead code and variables. Properly check for unions when
skipping fields in initializers. Make tests2/*.expect depend
on the .c files so they are automatically rebuilt when the latter
change.
E.g. "struct { struct S s; int a;} = { others, 42 };"
if 'others' is also a 'struct S'. Also when the value is a
compound literal. See added testcases.
Start reimplementing the whole initializer handling to be
conforming to ISO C. This patch just reimplements current
functionality to prepare for further changes, all tests pass.
This snippet is valid:
void foo(void);
... foo + 42 ...
the function designator is converted to pointer to function
implicitely. gen_op didn't do that and bailed out.
This must compile:
typedef int arrtype1[];
arrtype1 sinit19 = {1};
arrtype1 sinit20 = {2,3};
and generate two arrays of one resp. two elements. Before the fix
the determined size of the first array was encoded in the type
directly, so sinit20 couldn't be parsed anymore (because arrtype1
was thought to be only one element long).
Given this code:
struct __attribute__((...)) Name {...};
TCC was eating "Name", hence generating an anonymous struct.
It also didn't apply any packed attributes to the parsed
members. Both fixed. The testcase also contains a case
that isn't yet handled by TCC (under a BROKEN #define).
add_elf_sym is a confusing name because it is not clear what the
function does compared to put_elf_sym. As a matter of fact, put_elf_sym
also adds a symbol in a symbol table. Besides, "add_elf_sym" fails to
convey that the function can be used to update a symbol (for instance
its value). "set_elf_sym" seems like a more appropriate name: it will
set a symbol to a given set of properties (value, size, etc.) and create
a new one if non exist for that name as one would expect.
With the last improvements to lexpand it's now harmful
to use on native 64bit platforms when not necessary. For gv_dup
it's not necessary there. It can still be used with really
transforming a 64bit value into two 32bit ones.
Previously, long longs were 'lexpand'ed into two registers
always.
Now, it expands
- constants into two constants (lo-part, hi-part)
- variables into two lvalues with offset+4 for the hi-part.
This makes long long operations look a bit nicer.
Also: don't apply i386 'inc/dec' optimization if carry
generation is wanted.
gen_cast() failed to truncate long long's if they
were unsigned, which was causing mess on the vstack.
There was a similar bug here
tccgen: 32bits: fix PTR +/- long long
ed15cddacd
Both were not visible until this patch
tccgen: arm/i386: save_reg_upstack
b691585785
I'd still assume that this patch is correct per se.
Also:
- remove 2x !nocode_wanted (we are already under a general
"else if (!nocode_wanted)" clause above).
__GNUC__ nowadays as macro seems to mean the "GNU C dialect"
rather than the compiler itself. See also
http://gcc.gnu.org/ml/gcc/2008-07/msg00026.html
This patch will probably cause problems of various kinds but
maybe we should try nonetheless.
Previously in order to perform a ll+ll operation tcc
was trying to 'lexpand' PTR in gen_opl which did
not work well. The case:
int printf(const char *, ...);
char t[] = "012345678";
int main(void)
{
char *data = t;
unsigned long long r = 4;
unsigned a = 5;
unsigned long long b = 12;
*(unsigned*)(data + r) += a - b;
printf("data %s\n", data);
return 0;
}
The back end functions gen_op(comparison) and gtst() might allocate
registers so case_reg should be left on the value stack while they
are called and set again afterwards.
This bug fix was first applied as ff3f9aa (20 Feb 2015), but the fix
was reverted by fc0fc6a (21 Sep 2016, "switch: collect case ranges
first, then generate code"). Here the fix is updated for the new code.
Also:
- regenerate all tests/pp/*.expect with gcc
- test "insert one space" feature
- test "0x1E-1" in asm mode case
- PARSE_FLAG_SPACES: ignore \f\v\r better
- tcc.h: move some things
We need to preserve the type of the pointer to the structure, f.ex.
when a global structure is returned.
This is not a perfect solution. Registers loaded in the first iteration
might be overwritten in a following iteration as the register is no
longer on vtop. This is not a problem for ARM32 as gfunc_sret returns
a maximum of 1 in the integer case.
Makefile :
- do not 'uninstall' peoples /usr/local/doc entirely
libtcc.c :
- MEM_DEBUG : IDE-friendly output "file:line: ..."
- always ELF for objects
tccgen.c :
- fix memory leak in new switch code
- move static 'in_sizeof' out of function
profiling :
- define 'static' to empty
resolve_sym() :
- replace by dlsym()
win32/64: fix R_XXX_RELATIVE fixme
- was fixed for i386 already in
8e4d64be2f
- do not -Lsystemdir if compiling to .o
tccgen.c:gv() when loading long long from lvalue, before
was saving all registers which caused problems in the arm
function call register parameter preparation, as with
void foo(long long y, int x);
int main(void)
{
unsigned int *xx[1], x;
unsigned long long *yy[1], y;
foo(**yy, **xx);
return 0;
}
Now only the modified register is saved if necessary,
as in this case where it is used to store the result
of the post-inc:
long long *p, v, **pp;
v = 1;
p = &v;
p[0]++;
printf("another long long spill test : %lld\n", *p);
i386-gen.c :
- found a similar problem with TOK_UMULL caused by the
vstack juggle in tccgen:gen_opl()
(bug seen only when using EBX as 4th register)
"make test" crashes without that "save_regs()".
This partially reverts
commit 49d3118621.
Found another solution: In a 2nd pass Just look if
any of the argument registers has been saved again,
and restore if so.
This was causing assembler bugs in a tcc compiled by itself
at i386-asm.c:352 when ExprValue.v was changed to uint64_t:
if (op->e.v == (int8_t)op->e.v)
op->type |= OP_IM8S;
A general test case:
#include <stdio.h>
int main(int argc, char **argv)
{
long long ll = 4000;
int i = (char)ll;
printf("%d\n", i);
return 0;
}
Output was "4000", now "-96".
Also: add "asmtest2" as asmtest with tcc compiled by itself
On 2016-08-11 09:24 +0100, Balazs Kezes wrote:
> I think it's just that that copy_params() never restores the spilled
> registers. Maybe it needs some extra code at the end to see if any
> parameters have been spilled to stack and then restore them?
I've spent some time on this and I've found an alternative solution.
Although I'm not entirely sure about it but I've attached a patch
nevertheless.
And while poking at that I've found another problem affecting the
unsigned long long division on arm and I've attached a patch for that
too.
More details in the patches themselves. Please review and consider them
for merging! Thank you!
--
Balazs
[PATCH 1/2] Fix slow unsigned long long division on ARM
The macro AEABI_UXDIVMOD expands to this bit:
#define AEABI_UXDIVMOD(name,type, rettype, typemacro) \
...
while (num >= den) { \
...
while ((q << 1) * den <= num && q * den <= typemacro ## _MAX / 2) \
q <<= 1; \
...
With the current ULONG_MAX version the inner loop goes only until 4
billion so the outer loop will progress very slowly if num is large.
With ULLONG_MAX the inner loop works as expected. The current version is
probably a result of a typo.
The following bash snippet demonstrates the bug:
$ uname -a
Linux eper 4.4.16-2-ARCH #1 Wed Aug 10 20:03:13 MDT 2016 armv6l GNU/Linux
$ cat div.c
int printf(const char *, ...);
int main(void) {
unsigned long long num, denom;
num = 12345678901234567ULL;
denom = 7;
printf("%lld\n", num / denom);
return 0;
}
$ time tcc -run div.c
1763668414462081
real 0m16.291s
user 0m15.860s
sys 0m0.020s
[PATCH 2/2] Fix long long dereference during argument passing on ARMv6
For some reason the code spills the register to the stack. copy_params
in arm-gen.c doesn't expect this so bad code is generated. It's not
entirely clear why the saving part is necessary. It was added in commit
59c35638 with the comment "fixed long long code gen bug" with no further
clarification. Given that tcctest.c passes without this, maybe it's no
longer needed? Let's remove it.
Also add a new testcase just for this. After I've managed to make the
tests compile on a raspberry pi, I get the following diff without this
patch:
--- test.ref 2016-08-22 22:12:43.380000000 +0100
+++ test.out3 2016-08-22 22:12:49.990000000 +0100
@@ -499,7 +499,7 @@
2
1 0 1 0
4886718345
-shift: 9 9 9312
+shift: 291 291 291
shiftc: 36 36 2328
shiftc: 0 0 9998683865088
manyarg_test:
More discussion on this thread:
https://lists.nongnu.org/archive/html/tinycc-devel/2016-08/msg00004.html
- "utf8 in identifiers"
from 936819a1b9
- CValue: remove member str.data_allocated
- make tiny allocator private to tccpp
- allocate macro_stack objects on heap
because otherwise it could crash after error/setjmp
in preprocess_delete():end_macro()
- mov "TinyAlloc" defs to tccpp.c
- define_push: take int* str again
fixes 5c35ba66c5
Implementation was consistent within tcc but incompatible
with the ABI (for example library functions vprintf etc)
Also:
- tccpp.c/get_tok_str() : avoid "unknown format "%llu" warning
- x86_64_gen.c/gen_vla_alloc() : fix vstack leak
The case below previously was causing an assertion failure
in the target specific generator.
It probably is not incorrect not to allow this even if
gcc does.
struct S { long b; };
void f(struct S *x)
{
struct S y[1] = { *x };
}
The check for structs was too late and on amd64 and aarch64 could
lead to accepting and then asserting with code like:
struct S {...} s;
char *c = (char*)0x10 - s;
... for fast redeclaration checks
Also, check function parameters too:
void foo(int a) { int a; ... }
Also, try to fix struct/union/enum's on different scopes:
{ struct xxx { int x; };
{ struct xxx { int y; }; ... }}
and some (probably not all) combination with incomplete
declarations "struct xxx;"
Replaces 2bfedb1867
and 07d896c8e5
Fixes cf95ac399c
A constant expression removed from the loop.
If subroutine have 50000+ local variables, then currently
compilation of such code takes obly 15 sec. Was 2 min.
gcc-4.1.2 compiles such code in 7 sec. pcc -- 3.44 min.
A test generator:
#include <stdio.h>
int main() {
puts("#include <stdio.h>"); puts("int main()"); puts("{");
for (int i = 0; i < 50000; ++i) printf("int X%d = 1;\n", i);
for (int i = 0; i < 50000; ++i) puts("scanf(\"%d\", &X0);");
puts("}");
return 0;
}
don't catch redefinition for local vars. With this option on
tcc accepts the following code:
int main()
{
int a = 0;
long a = 0;
}
But if you shure there is no problem with your local variables,
then a compilation speed can be improved if you have a lots of
the local variables (50000+)