On 2016-08-11 09:24 +0100, Balazs Kezes wrote:
> I think it's just that that copy_params() never restores the spilled
> registers. Maybe it needs some extra code at the end to see if any
> parameters have been spilled to stack and then restore them?
I've spent some time on this and I've found an alternative solution.
Although I'm not entirely sure about it but I've attached a patch
nevertheless.
And while poking at that I've found another problem affecting the
unsigned long long division on arm and I've attached a patch for that
too.
More details in the patches themselves. Please review and consider them
for merging! Thank you!
--
Balazs
[PATCH 1/2] Fix slow unsigned long long division on ARM
The macro AEABI_UXDIVMOD expands to this bit:
#define AEABI_UXDIVMOD(name,type, rettype, typemacro) \
...
while (num >= den) { \
...
while ((q << 1) * den <= num && q * den <= typemacro ## _MAX / 2) \
q <<= 1; \
...
With the current ULONG_MAX version the inner loop goes only until 4
billion so the outer loop will progress very slowly if num is large.
With ULLONG_MAX the inner loop works as expected. The current version is
probably a result of a typo.
The following bash snippet demonstrates the bug:
$ uname -a
Linux eper 4.4.16-2-ARCH #1 Wed Aug 10 20:03:13 MDT 2016 armv6l GNU/Linux
$ cat div.c
int printf(const char *, ...);
int main(void) {
unsigned long long num, denom;
num = 12345678901234567ULL;
denom = 7;
printf("%lld\n", num / denom);
return 0;
}
$ time tcc -run div.c
1763668414462081
real 0m16.291s
user 0m15.860s
sys 0m0.020s
[PATCH 2/2] Fix long long dereference during argument passing on ARMv6
For some reason the code spills the register to the stack. copy_params
in arm-gen.c doesn't expect this so bad code is generated. It's not
entirely clear why the saving part is necessary. It was added in commit
59c35638 with the comment "fixed long long code gen bug" with no further
clarification. Given that tcctest.c passes without this, maybe it's no
longer needed? Let's remove it.
Also add a new testcase just for this. After I've managed to make the
tests compile on a raspberry pi, I get the following diff without this
patch:
--- test.ref 2016-08-22 22:12:43.380000000 +0100
+++ test.out3 2016-08-22 22:12:49.990000000 +0100
@@ -499,7 +499,7 @@
2
1 0 1 0
4886718345
-shift: 9 9 9312
+shift: 291 291 291
shiftc: 36 36 2328
shiftc: 0 0 9998683865088
manyarg_test:
More discussion on this thread:
https://lists.nongnu.org/archive/html/tinycc-devel/2016-08/msg00004.html
Except
- that libtcc1.a is now installed in subdirs i386/ etc.
- the support for arm and arm64
- some of the "Darwin" fixes
- tests are mosly unchanged
Also
- removed the "legacy links for cross compilers" (was total mess)
- removed "out-of-tree" build support (was broken anyway)
-- Not a fix
This reverts commit 089ce6235c.
Revert "handle a -s option by executing sstrip/strip program"
-- related, not a fix.
This reverts commit 5cd4393a54.
- from win32/include/winapi: various .h
The winapi header set cannot be complete no matter what. So
lets have just the minimal set necessary to compile the examples.
- remove CMake support (hard to keep up to date)
- some other files
Also, drop useless changes in win32/lib/(win)crt1.c
- "utf8 in identifiers"
from 936819a1b9
- CValue: remove member str.data_allocated
- make tiny allocator private to tccpp
- allocate macro_stack objects on heap
because otherwise it could crash after error/setjmp
in preprocess_delete():end_macro()
- mov "TinyAlloc" defs to tccpp.c
- define_push: take int* str again
Also:
- allow more than one item per line
- respect "quoted items" and escaped quotes \"
(also for LIBTCCAPI tcc_setoptions)
- cleanup some copy & paste
after several "fixes" and "improvements"
b3782c3cf55fb57bead4
feature did not work at all
- Use 'once' flag, not 'ifndef_macro'
- Ignore filename letter case on _WIN32
- Increment global pp_once for each compilation
- would parse linker args in two different places
- would mess up "tcc -v ..." output:
tcc -v test.c
-> test.c
+> test.c
- would use function "tcc_load_alacarte()" to do the contrary of
what its name suggests.
This reverts commit 19a169ceb8.
fixes 5c35ba66c5
Implementation was consistent within tcc but incompatible
with the ABI (for example library functions vprintf etc)
Also:
- tccpp.c/get_tok_str() : avoid "unknown format "%llu" warning
- x86_64_gen.c/gen_vla_alloc() : fix vstack leak
_alloca is not part of msvcrt (and therefore not found if used), and tcc has
an internal implementation for alloca for x86[_64] since d778bde7 - initally
as _alloca and later changed to alloca. Use it instead.
- Syntax is now much closer to gnu ar, but still supports whatever was
supported before, with the following exceptions (which gnu ar has too):
- lib is now mandatory (was optional and defaulted to ar_test.a before).
- Path cannot start with '-' (but ./-myfile.o is OK).
- Unlike gnu ar, modes are still optional (as before).
- Now supports also (like gnu ar):
- First argument as options doesn't have to start with '-', later options do.
- Now supports mode v (verbose) with same output format as gnu ar.
- Any names for lib/objs (were limited to .a/.o - broke cmake on windows).
- Now explicitly fail on options which would be destructive for the user.
- Now doesn't get confused by options between file arguments.
- Still ignores other unknown options - as before.
- Now doesn't read out-of-bounds if an option is one char.
- As a result, cmake for windows can now use tiny_libmaker as ar, and
configure can also detect tiny_libmaker as a valid ar (both couldn't before).
Ignoring all options could previously cause to misinterpret the mode in a
destructive way, e.g. if the user wanted to do something with an existing
archive (such as p - print, or x - extract, etc), then it would instead just
delete (re-create) the archive.
Modes which can be destructive if ignored now explicitly fail. These include
[habdioptxN]. Note that 'h' can be ignored, but this way we also implicitly
print the usage for -h/--help.
The .a/.o name limitations previously resulted in complete failure on some
cases, such as cmake on windows which uses <filename>.obj and <libname>.lib .
Fixed: e.g. 'tiny_libmaker r x.a x.o' was reading out of bounds [-1] for 'r'.
The case below previously was causing an assertion failure
in the target specific generator.
It probably is not incorrect not to allow this even if
gcc does.
struct S { long b; };
void f(struct S *x)
{
struct S y[1] = { *x };
}