Commit Graph

10 Commits

Author SHA1 Message Date
Roy
d815896d4c bcheck: there is no unistd.h in win32. 2012-12-10 09:51:49 +08:00
Kirill Smelkov
dbeb4faf21 lib/bcheck: Fix code typo in __bound_delete_region()
We were calling get_page() with t2 index which is not correct, since
get_page() operate on t1 indices. The bug is here from day-1, from
60f781c4 (first version of bounds checker) and show as a crash in
__bound_delete_region() at program exit:

    $ ./tcc   -B. -DTCC_TARGET_I386 -DCONFIG_MULTIARCHDIR=\"i386-linux-gnu\" -b -run -DONE_SOURCE \
      ./tcc.c -B. -DTCC_TARGET_I386 -DCONFIG_MULTIARCHDIR=\"i386-linux-gnu\"    -run -DONE_SOURCE \
      ./tcc.c -B. -run tests/tcctest.c

    (lot's of correct output from tcctest)
    Runtime error: dereferencing invalid pointer
    at 0xa7c21cc4 __bound_delete_region()
    by (nil) ???
    Segmentation fault

The fix is simple - last page should be get through t1_end, like it is
done in __bound_new_region().

After this patch, tcc is being able to compile itself with -b, then
compile itself again and run tcctest with correct output. Tests follow.
2012-12-09 19:33:47 +04:00
Kirill Smelkov
efd9d92b7c lib/bcheck: Don't assume heap goes right after bss
At startup __bound_init() wants to mark malloc zone as invalid memory,
so that any access to memory on heap, not allocated through malloc be
invalid. Other pages are initialized as empty regions, access to which
is not treated as invalid by bounds-checking.

The problem is code incorrectly assumed that heap goes right after bss,
and that is not correct for two cases:

    1) if we are running from `tcc -b -run`, program text data and bss
       will be already in malloced memory, possibly in mmaped region
       insead of heap, and marking memory as invalid from _end
       will not cover heap and probably wrongly mark correct regions.

    2) if address space randomization is turned on, again heap does not
       start from _end, and we'll mark as invalid something else instead
       of malloc area.

For example with the following diagnostic patch ...

    diff --git a/tcc.c b/tcc.c
    index 5dd5725..31c46e8 100644
    --- a/tcc.c
    +++ b/tcc.c
    @@ -479,6 +479,8 @@ static int parse_args(TCCState *s, int argc, char **argv)
         return optind;
     }

    +extern int _etext, _edata, _end;
    +
     int main(int argc, char **argv)
     {
         int i;
    @@ -487,6 +489,18 @@ int main(int argc, char **argv)
         int64_t start_time = 0;
         const char *default_file = NULL;

    +    void *brk;
    +
    +    brk = sbrk(0);
    +
    +    fprintf(stderr, "\n>>> TCC\n\n");
    +    fprintf(stderr, "etext:\t%10p\n",  &_etext);
    +    fprintf(stderr, "edata:\t%10p\n",  &_edata);
    +    fprintf(stderr, "end:\t%10p\n",    &_end);
    +    fprintf(stderr, "brk:\t%10p\n",    brk);
    +    fprintf(stderr, "stack:\t%10p\n",  &brk);
    +
    +    fprintf(stderr, "&errno: %p\n", &errno);
         s = tcc_new();

         output_type = TCC_OUTPUT_EXE;

    diff --git a/tccrun.c b/tccrun.c
    index 531f46a..25ed30a 100644
    --- a/tccrun.c
    +++ b/tccrun.c
    @@ -91,6 +91,8 @@ LIBTCCAPI int tcc_run(TCCState *s1, int argc, char **argv)
         int (*prog_main)(int, char **);
         int ret;

    +    fprintf(stderr, "\n\ntcc_run() ...\n\n");
    +
         if (tcc_relocate(s1, TCC_RELOCATE_AUTO) < 0)
             return -1;

    diff --git a/lib/bcheck.c b/lib/bcheck.c
    index ea5b233..8b26a5f 100644
    --- a/lib/bcheck.c
    +++ b/lib/bcheck.c
    @@ -296,6 +326,8 @@ static void mark_invalid(unsigned long addr, unsigned long size)
         start = addr;
         end = addr + size;

    +    fprintf(stderr, "mark_invalid  %10p - %10p\n", (void *)addr, (void *)end);
    +
         t2_start = (start + BOUND_T3_SIZE - 1) >> BOUND_T3_BITS;
         if (end != 0)
             t2_end = end >> BOUND_T3_BITS;

... Look how memory is laid out for `tcc -b -run ...`:

    $ ./tcc -B. -b -DTCC_TARGET_I386 -DCONFIG_MULTIARCHDIR=\"i386-linux-gnu\"  -run   \
        -DONE_SOURCE ./tcc.c -B. -c x.c

    >>> TCC

    etext:   0x8065477
    edata:   0x8070220
    end:     0x807a95c
    brk:     0x807b000
    stack:  0xaffff0f0
    &errno: 0xa7e25688

    tcc_run() ...

    mark_invalid  0xfff80000 -      (nil)
    mark_invalid  0xa7c31d98 - 0xafc31d98

    >>> TCC

    etext:  0xa7c22767
    edata:  0xa7c2759c
    end:    0xa7c31d98
    brk:     0x8211000
    stack:  0xafffeff0
    &errno: 0xa7e25688
    Runtime error: dereferencing invalid pointer
    ./tccpp.c:1953: at 0xa7beebdf parse_number() (included from ./libtcc.c, ./tcc.c)
    ./tccpp.c:3003: by 0xa7bf0708 next() (included from ./libtcc.c, ./tcc.c)
    ./tccgen.c:4465: by 0xa7bfe348 block() (included from ./libtcc.c, ./tcc.c)
    ./tccgen.c:4440: by 0xa7bfe212 block() (included from ./libtcc.c, ./tcc.c)
    ./tccgen.c:5529: by 0xa7c01929 gen_function() (included from ./libtcc.c, ./tcc.c)
    ./tccgen.c:5767: by 0xa7c02602 decl0() (included from ./libtcc.c, ./tcc.c)

The second mark_invalid goes right after in-memory-compiled program's
_end, and oops, that's not where malloc zone is (starts from brk), and oops
again, mark_invalid covers e.g. errno. Then compiled tcc is crasshing by
bcheck on errno access:

    1776 static void parse_number(const char *p)
    1777 {
    1778     int b, t, shift, frac_bits, s, exp_val, ch;
         ...
    1951             *q = '\0';
    1952             t = toup(ch);
    1953             errno = 0;

The solution here is to use sbrk(0) as approximation for the program
break start instead of &_end:

    - if we are a separately compiled program, __bound_init() runs early,
      and sbrk(0) should be equal or very near to start_brk (in case other
      constructors malloc something), or

    - if we are running from under `tcc -b -run`, sbrk(0) will return
      start of heap portion which is under this program control, and not
      mark as invalid earlier allocated memory.

With this patch `tcc -b -run tcc.c ...` succeeds compiling above
small-test program (diagnostic patch is still applied too):

    $ ./tcc -B. -b -DTCC_TARGET_I386 -DCONFIG_MULTIARCHDIR=\"i386-linux-gnu\"  -run   \
        -DONE_SOURCE ./tcc.c -B. -c x.c

    >>> TCC

    etext:   0x8065477
    edata:   0x8070220
    end:     0x807a95c
    brk:     0x807b000
    stack:  0xaffff0f0
    &errno: 0xa7e25688

    tcc_run() ...

    mark_invalid  0xfff80000 -      (nil)
    mark_invalid   0x8211000 - 0x10211000

    >>> TCC

    etext:  0xa7c22777
    edata:  0xa7c275ac
    end:    0xa7c31da8
    brk:     0x8211000
    stack:  0xafffeff0
    &errno: 0xa7e25688

    (completes ok)

but running `tcc -b -run tcc.c -run tests/tcctest.c` sigsegv's - that's
the plot for the next patch.
2012-12-09 19:05:36 +04:00
Kirill Smelkov
cffb7af9f9 lib/bcheck: Prevent __bound_local_new / __bound_local_delete from being miscompiled
On i386 and gcc-4.7 I found that __bound_local_new was miscompiled -
look:

    #ifdef __i386__
    /* return the frame pointer of the caller */
    #define GET_CALLER_FP(fp)\
    {\
        unsigned long *fp1;\
        __asm__ __volatile__ ("movl %%ebp,%0" :"=g" (fp1));\
        fp = fp1[0];\
    }
    #endif

    /* called when entering a function to add all the local regions */
    void FASTCALL __bound_local_new(void *p1)
    {
        unsigned long addr, size, fp, *p = p1;
        GET_CALLER_FP(fp);
        for(;;) {
            addr = p[0];
            if (addr == 0)
                break;
            addr += fp;
            size = p[1];
            p += 2;
            __bound_new_region((void *)addr, size);
        }
    }

    __bound_local_new:
    .LFB40:
            .cfi_startproc
            pushl   %esi
            .cfi_def_cfa_offset 8
            .cfi_offset 6, -8
            pushl   %ebx
            .cfi_def_cfa_offset 12
            .cfi_offset 3, -12
            subl    $8, %esp            // NOTE prologue does not touch %ebp
            .cfi_def_cfa_offset 20
    #APP
    # 235 "lib/bcheck.c" 1
            movl %ebp,%edx              // %ebp -> fp1
    # 0 "" 2
    #NO_APP
            movl    (%edx), %esi        // fp1[0] -> fp
            movl    (%eax), %edx
            movl    %eax, %ebx
            testl   %edx, %edx
            je      .L167
            .p2align 2,,3
    .L173:
            movl    4(%ebx), %eax
            addl    $8, %ebx
            movl    %eax, 4(%esp)
            addl    %esi, %edx
            movl    %edx, (%esp)
            call    __bound_new_region
            movl    (%ebx), %edx
            testl   %edx, %edx
            jne     .L173
    .L167:
            addl    $8, %esp
            .cfi_def_cfa_offset 12
            popl    %ebx
            .cfi_restore 3
            .cfi_def_cfa_offset 8
            popl    %esi
            .cfi_restore 6
            .cfi_def_cfa_offset 4
            ret

here GET_CALLER_FP() assumed that its using function setups it's stack
frame, i.e. first save, then set %ebp to stack frame start, and then it
has to do perform two lookups: 1) to get current stack frame through
%ebp, and 2) get caller stack frame through (%ebp).

And here is the problem: gcc decided not to setup %ebp for
__bound_local_new and in such case GET_CALLER_FP actually becomes
GET_CALLER_CALLER_FP and oops, wrong regions are registered in bcheck
tables...

The solution is to stop using hand written assembly and rely on gcc's
__builtin_frame_address(1) to get callers frame stack(*). I think for the
builtin gcc should generate correct code, independent of whether it
decides or not to omit frame pointer in using function - it knows it.

(*) judging by gcc history, __builtin_frame_address was there almost
    from the beginning - at least it is present in 1992 as seen from the
    following commit:

    http://gcc.gnu.org/git/?p=gcc.git;a=commit;h=be07f7bdbac76d87d3006c89855491504d5d6202

    so we can rely on it being supported by all versions of gcc.

In my environment the assembly of __bound_local_new changes as follows:

    diff --git a/bcheck0.s b/bcheck1.s
    index 4c02a5f..ef68918 100644
    --- a/bcheck0.s
    +++ b/bcheck1.s
    @@ -1409,20 +1409,17 @@ __bound_init:
     __bound_local_new:
     .LFB40:
            .cfi_startproc
    -       pushl   %esi
    +       pushl   %ebp                // NOTE prologue saves %ebp ...
            .cfi_def_cfa_offset 8
    -       .cfi_offset 6, -8
    +       .cfi_offset 5, -8
    +       movl    %esp, %ebp          // ... and reset it to local stack frame
    +       .cfi_def_cfa_register 5
    +       pushl   %esi
            pushl   %ebx
    -       .cfi_def_cfa_offset 12
    -       .cfi_offset 3, -12
            subl    $8, %esp
    -       .cfi_def_cfa_offset 20
    -#APP
    -# 235 "lib/bcheck.c" 1
    -       movl %ebp,%edx
    -# 0 "" 2
    -#NO_APP
    -       movl    (%edx), %esi
    +       .cfi_offset 6, -12
    +       .cfi_offset 3, -16
    +       movl    0(%ebp), %esi       // stkframe -> stkframe.parent -> fp
            movl    (%eax), %edx
            movl    %eax, %ebx
            testl   %edx, %edx
    @@ -1440,13 +1437,13 @@ __bound_local_new:
            jne     .L173
     .L167:
            addl    $8, %esp
    -       .cfi_def_cfa_offset 12
            popl    %ebx
            .cfi_restore 3
    -       .cfi_def_cfa_offset 8
            popl    %esi
            .cfi_restore 6
    -       .cfi_def_cfa_offset 4
    +       popl    %ebp
    +       .cfi_restore 5
    +       .cfi_def_cfa 4, 4
            ret
            .cfi_endproc

i.e. now it compiles correctly.

Though I do not have x86_64 to test, my guess is that
__builtin_frame_address(1) should work there too. If not - please revert
only x86_64 part of the patch. Thanks.

Cc: Michael Matz <matz@suse.de>
2012-11-13 22:17:58 +04:00
Kirill Smelkov
646b51833f lib/bcheck: Prevent libc_malloc/libc_free etc from being miscompiled
On i386 and gcc-4.7 I found that libc_malloc was miscompiled - look:

static void *libc_malloc(size_t size)
{
    void *ptr;
    restore_malloc_hooks();     // __malloc_hook = saved_malloc_hook
    ptr = malloc(size);
    install_malloc_hooks();     // saved_malloc_hook = __malloc_hook, __malloc_hook = __bound_malloc
    return ptr;
}

	.type	libc_malloc, @function
libc_malloc:
.LFB56:
	.cfi_startproc
	pushl	%edx
	.cfi_def_cfa_offset 8
	movl	%eax, (%esp)
	call	malloc
	movl	$__bound_malloc, __malloc_hook
	movl	$__bound_free, __free_hook
	movl	$__bound_realloc, __realloc_hook
	movl	$__bound_memalign, __memalign_hook
	popl	%ecx
	.cfi_def_cfa_offset 4
	ret

Here gcc inlined both restore_malloc_hooks() and install_malloc_hooks()
and decided that

    saved_malloc_hook -> __malloc_hook -> saved_malloc_hook

stores are not needed and could be ommitted. Only it did not know
__molloc_hook affects malloc()...

So add compiler barrier to both install and restore hooks functions and
be done with it - the code is now ok:

    diff --git a/bcheck0.s b/bcheck1.s
    index 5f50293..4c02a5f 100644
    --- a/bcheck0.s
    +++ b/bcheck1.s
    @@ -42,8 +42,24 @@ libc_malloc:
            .cfi_startproc
            pushl   %edx
            .cfi_def_cfa_offset 8
    +       movl    saved_malloc_hook, %edx
    +       movl    %edx, __malloc_hook
    +       movl    saved_free_hook, %edx
    +       movl    %edx, __free_hook
    +       movl    saved_realloc_hook, %edx
    +       movl    %edx, __realloc_hook
    +       movl    saved_memalign_hook, %edx
    +       movl    %edx, __memalign_hook
            movl    %eax, (%esp)
            call    malloc
    +       movl    __malloc_hook, %edx
    +       movl    %edx, saved_malloc_hook
    +       movl    __free_hook, %edx
    +       movl    %edx, saved_free_hook
    +       movl    __realloc_hook, %edx
    +       movl    %edx, saved_realloc_hook
    +       movl    __memalign_hook, %edx
    +       movl    %edx, saved_memalign_hook
            movl    $__bound_malloc, __malloc_hook
            movl    $__bound_free, __free_hook
            movl    $__bound_realloc, __realloc_hook

For barrier I use

    __asm__ __volatile__ ("": : : "memory")

which is used as compiler barrier by Linux kernel, and mentioned in gcc
docs and in wikipedia [1].

Without this patch any program compiled with tcc -b crashes in startup
because of infinite recursion in libc_malloc.

[1] http://en.wikipedia.org/wiki/Memory_ordering#Compiler_memory_barrier
2012-11-13 22:17:51 +04:00
Michael Matz
b068e29df7 x86_64: Implement GET_CALLER_FP
TCC always uses %rbp frames, so we can use that one.
2012-04-18 20:57:13 +02:00
Thomas Preud'homme
776364f395 Add support for __FreeBSD_kernel__ kernel
Add support for kfreebsd-i386 and kfreebsd-amd64 Debian arch with
thanks to Pierre Chifflier <chifflier@cpe.fr>.
2010-09-10 21:09:07 +02:00
grischka
7fa712e00c win32: enable bounds checker & exception handler
exception handler borrowed from k1w1. Thanks.
2009-12-19 22:22:43 +01:00
grischka
0085c648f6 bcheck: restore malloc hooks when done 2009-07-18 21:54:47 +02:00
grischka
ea5e81bd6a new subdirs: include, lib, tests 2009-04-18 15:08:03 +02:00