Commit Graph

11 Commits

Author SHA1 Message Date
Michael Matz 0b3c8360a0 macos: section syms, strtab, runtime
uncovered by the backtrace/boundcheck tests:

* handle STT_SECTION symbols
* call tcc_add_runtime (to get the bcheck.o/bt-exe.o files added)
* add .stab strtab into segments (we should probably add all stab
  syms to the output LC_SYMTAB eventually, but right now TCC uses
  32 bit stabs, while mach-o uses 32/64bit stabs
2020-06-20 22:14:56 +02:00
Michael Matz 71b0634168 Add find_c_sym and friends
for handling leading underscores when looking up symbols.
Necessary on MacOS, as there C symbols have a '_' prepended.
get_sym_addr (replacing get_elf_sym_addr) gets an argument to
specify if bare/raw/ELF symbols should be looked up or if decorated
C symbols should be looked up.  That reflects into tcc_get_symbol.
tcc_add_symbol is _not_ yet changed, but probably should be.
2020-06-20 22:12:02 +02:00
Michael Matz cc11750bda macos: Support .init_array and .fini_array
called different on Mach-O, but same functionality (list of pointers
to init and fini functions).
2020-06-20 22:12:02 +02:00
Michael Matz 51f15d9971 macos: Deal with leading underscore on Mach-O
all C/C++/ObjC symbols in symbols tables have a leading underscore
in Mach-O.  Within TCC there's some confusion with tcc_add_symbol
(not adding it) and tcc_get_elf_symbol (not expecting it), and
resolve_syms (using dlsym, which doesn't expect it) and -run support.
But this sort of works.
2020-06-20 22:12:02 +02:00
Michael Matz 1ca209dad0 macos: set LC_MAIN entrypoint correctly
to the file offset of 'main' not simply to the start of TEXT.
2020-06-20 22:12:02 +02:00
Michael Matz 84c3fecf5e macos: Support external functions
these are resolved non-lazy for now.  We only need to generate
the jump stub (using the GOT slot that will be initialized due
to the non-lazy pointer marking, like with data symbols).  On
x86-64 we don't even need special marking of these stubs (with
S_SYMBOL_STUBS and associated additional indirect symbol entries),
as that's only used on i386 (where the stubs are self-modifying).

So, this now works:

extern int _printf(const char*, ...);
int _start(void)
{
  _printf("hello\n");
  return 0;
}
2020-06-20 22:12:02 +02:00
Michael Matz d82e64c163 macos: handle undefined symbols a bit
at least data symbols coming from dylibs can be used now, as in the
below.  Note in the example that optind is defined in libc (really in
libsystem_c.dylib, reexported from libSystem.B.dylib):

static int loc;
extern int _optind;
int _start(void)
{
  _optind = 0;
  loc = 42 + _optind;
  return loc - 42;
}
2020-06-20 22:12:02 +02:00
Michael Matz fab8787b23 macos: Fix GOT access to local defined symbols
if a GOT slot is required (due to codegen, indicated by
presence of some relocation types), then it needs to contain
the address of the wanted symbol, also when it's local and defined,
i.e. not overridable.  For simplicity we use a GOT slot for that as
well (other storage would require rewriting relocs and symbols,
as resolving of GOT relocs is hardwired to be based on s1->got).
But that means we need to fill in its indirect symbol mapping slot as
well, for which Mach-O provides a mean to say "not symbol based,
resolved locally".  So this fixes this testcase:

static int loc;
int _start(void)
{
  loc = 42;
  return loc - 42;
}

(where our codegen currently uses a GOT-based access for the write
by accident)
2020-06-20 22:12:02 +02:00
Michael Matz 6ebd463021 macos: dynamic symbol table and __got
this now sorts the symbols properly (local, global defined, undefined;
the latter two by name), marks the three ranges within LC_DYSYMTAB,
generates a __got section (non-lazy pointers) and slots for
relocations which need them, and the indirect symbol mapping for
them.

This doesn't yet deal with undefined symbols.  But it means compared to
last example now this also works, i.e. read access to _global_ data:

% cat simple3.c
int loc = 42;
int _start(void)
{
  return loc - 42;
}
2020-06-20 22:12:02 +02:00
Michael Matz 1320d85742 macos: Create symtab
this creates a proper LC_SYMTAB, with reasonable entries.  It's
not sorted, so not usable for LC_DYSYMTAB.  But 'nm -x -no-sort'
allows to see us some useful info.

This also relocates sections and symbols, so now this example
works as well (i.e. read access to static local data):

% cat simple2.c
static int loc = 42;
int main(void)
{
  return loc - 42;
}
2020-06-20 22:09:21 +02:00
Michael Matz 94066765ed macos: First cut at generating Mach-O executables
this does generate a working executable for a very simple
example input, e.g. this:

% cat simple.c
int main(void)
{
  return 0;
}
% ./tcc -B. -c simple.c
% ./tcc -nostdlib -B. simple.o -lc
% ./a.out && echo okay
okay

(the -lc is actually not necessary right now, see below).  This
has many limitations:

* no symbol table, hence no calls to external functions from
  e.g. libc, aka libSystemB
* no proper entry point (should be main, but is hardcoded to first
  real .text address)
* libSystemB is hardcoded, no other libs are supported (but again
  no external calls anyway)
* generated Mach-O executable is in old format: neither LC_DYLD_INFO
  no export tries for symbols are created (no symbol table at all!)
* the __LINKEDIT segment is faked and empty, as dyld doesn't like
  it empty even if no symbols point into it
* same with __DATA, dyld wants a non-empty writable segment which
  we enforce with useless data
* no relocations, hence no function call stubs (lazy or not) are
  generated
* hardcodes some other constants as well
2020-06-20 22:09:21 +02:00