Jsut for testing. It works for me (don't break anything)
Small fixes for x86_64-gen.c in "tccpp: fix issues, add tests"
are dropped in flavor of this patch.
Pip Cet:
Okay, here's a first patch that fixes the problem (but I've found
another bug, yet unfixed, in the process), though it's not
particularly pretty code (I tried hard to keep the changes to the
minimum necessary). If we decide to actually get rid of VT_QLONG and
VT_QFLOAT (please, can we?), there are some further simplifications in
tccgen.c that might offset some of the cost of this patch.
The idea is that an integer is no longer enough to describe how an
argument is stored in registers. There are a number of possibilities
(none, integer register, two integer registers, float register, two
float registers, integer register plus float register, float register
plus integer register), and instead of enumerating them I've
introduced a RegArgs type that stores the offsets for each of our
registers (for the other architectures, it's simply an int specifying
the number of registers). If someone strongly prefers an enum, we
could do that instead, but I believe this is a place where keeping
things general is worth it, because this way it should be doable to
add SSE or AVX support.
There is one line in the patch that looks suspicious:
} else {
addr = (addr + align - 1) & -align;
param_addr = addr;
addr += size;
- sse_param_index += reg_count;
}
break;
However, this actually fixes one half of a bug we have when calling a
function with eight double arguments "interrupted" by a two-double
structure after the seventh double argument:
f(double,double,double,double,double,double,double,struct { double
x,y; },double);
In this case, the last argument should be passed in %xmm7. This patch
fixes the problem in gfunc_prolog, but not the corresponding problem
in gfunc_call, which I'll try tackling next.
* fix some macro expansion issues
* add some pp tests in tests/pp
* improved tcc -E output for better diff'ability
* remove -dD feature (quirky code, exotic feature,
didn't work well)
Based partially on ideas / researches from PipCet
Some issues remain with VA_ARGS macros (if used in a
rather tricky way).
Also, to keep it simple, the pp doesn't automtically
add any extra spaces to separate tokens which otherwise
would form wrong tokens if re-read from tcc -E output
(such as '+' '=') GCC does that, other compilers don't.
* cleanups
- #line 01 "file" / # 01 "file" processing
- #pragma comment(lib,"foo")
- tcc -E: forward some pragmas to output (pack, comment(lib))
- fix macro parameter list parsing mess from
a3fc543459a715d7143d
(some coffee might help, next time ;)
- introduce TOK_PPSTR - to have character constants as
written in the file (similar to TOK_PPNUM)
- allow '\' appear in macros
- new functions begin/end_macro to:
- fix switching macro levels during expansion
- allow unget_tok to unget more than one tok
- slight speedup by using bitflags in isidnum_table
Also:
- x86_64.c : fix decl after statements
- i386-gen,c : fix a vstack leak with VLA on windows
- configure/Makefile : build on windows (MSYS) was broken
- tcc_warning: fflush stderr to keep output order (win32)
The old code assumed that if an argument doesn't fit into the available
registers, none of the subsequent arguments do, either. But that's
wrong: passing 7 doubles, then a two-double struct, then another double
should generate code that passes the 9th argument in the 8th register
and the two-double struct on the stack. We now do so.
However, this patch does not yet fix the function calling code to do the
right thing in the same case.
With the x86_64 Linux ELF ABI, we're currently failing two of these
three tests, which have been disabled for now. The problem is mixed
structures such as struct { double x; char c; }, which the x86_64 ABI
specifies are to be passed/returned in one integer register and one SSE
register; our current approach, marking the structure as VT_QLONG or
VT_QFLOAT, fails in this case.
(It's possible to fix this by getting rid of VT_QLONG and VT_QFLOAT
entirely as at https://github.com/pipcet/tinycc, but the changes aren't
properly isolated at present. Anyway, there might be a less disruptive
fix.)
I ran into an issue playing with tinycc, and tracked it down to a rather
weird assumption in the function calling code. This breaks only when
varargs and float/double arguments are combined, I think, and only when
calling GCC-generated (or non-TinyCC, at least) code. The problem is we
sometimes generate code like this:
804a468: 4c 89 d9 mov %r11,%rcx
804a46b: b8 01 00 00 00 mov $0x1,%eax
804a470: 48 8b 45 c0 mov -0x40(%rbp),%rax
804a474: 4c 8b 18 mov (%rax),%r11
804a477: 41 ff d3 callq *%r11
for a function call. Note how $eax is first set to the correct value,
then clobbered when we try to load the function pointer into R11. With
the patch, the code generated is:
804a468: 4c 89 d9 mov %r11,%rcx
804a46b: b8 01 00 00 00 mov $0x1,%eax
804a470: 4c 8b 5d c0 mov -0x40(%rbp),%r11
804a474: 4d 8b 1b mov (%r11),%r11
804a477: 41 ff d3 callq *%r11
which is correct.
This becomes an issue when get_reg(RC_INT) is modified not always to
return %rax after a save_regs(0), because then another register (%ecx,
say) is clobbered, and the function passed an invalid argument.
A rather convoluted test case that generates the above code is
included. Please note that the test will not cause a failure because
TinyCC code ignores the %rax argument, but it will cause incorrect
behavior when combined with GCC code, which might wrongly fail to save
XMM registers and cause data corruption.
stdarg_test in abitest.c relies on a sum of some parameters made by both
the caller and the callee to reach the same result. However, the
variables used to store the temporary result of the additions are not
initialized to 0, leading to uncertainty as to the results. This commit
add this needed initialization.
long double arguments require 16-byte alignment on the stack, which
requires adjustment when the the stack offset is not an evven number of
8-byte words.
I've had to introduce the XMM1 register to get the calling convention
to work properly, unfortunately this has broken a fair bit of code
which assumes that only XMM0 is used.
There are probably still issues on x86-64 I've missed.
I've added a few new tests to abitest, which fail (2x long long and 2x double
in a struct should be passed in registers).
abitest now passes; however test1-3 fail in init_test. All other tests
pass. I need to re-test Win32 and Linux-x86.
I've added a dummy implementation of gfunc_sret to c67-gen.c so it
should now compile, and I think it should behave as before I created
gfunc_sret.
I expect that Linux-x86 is probably fine. All other architectures
except ARM are definitely broken since I haven't yet implemented
gfunc_sret for these, although replicating the current behaviour
should be straightforward.
Only one test so far, which fails on Windows (with MinGW as the native
compiler - I've tested the MinGW output against MSVC and it appears the
two are compatible).
I've also had to modify tcc.h so that tcc_set_lib_path can point to the
directory containing libtcc1.a on Windows to make the libtcc dependent
tests work. I'm not sure this is the right way to fix this problem.