Arm / Thumb Interworking
========================

The Cygnus GNU Pro Toolkit for the ARM7T processor supports function
calls between code compiled for the ARM instruction set and code
compiled for the Thumb instruction set and vice versa. This document
describes how that interworking support operates and explains the
command line switches that should be used in order to produce working
programs.

Note: The Cygnus GNU Pro Toolkit does not support switching between
compiling for the ARM instruction set and the Thumb instruction set
on anything other than a per file basis. There are in fact two
completely separate compilers, one that produces ARM assembler
instructions and one that produces Thumb assembler instructions. The
two compilers share the same assembler, linker and so on.


1. Explicit interworking support for C and C++ files
====================================================

By default, if a file is compiled without any special command line
switches then the code produced will not support interworking.
Provided that a program is made up entirely from object files and
libraries produced in this way and which contain either exclusively
ARM instructions or exclusively Thumb instructions then this will not
matter and a working executable will be created. If an attempt is
made to link together mixed ARM and Thumb object files and libraries,
then warning messages will be produced by the linker and a non-working
executable will be created.

In order to produce code which does support interworking it should be
compiled with the

        -mthumb-interwork

command line option. Provided that a program is made up entirely from
object files and libraries built with this command line switch a
working executable will be produced, even if both ARM and Thumb
instructions are used by the various components of the program. (No
warning messages will be produced by the linker either.)

Note that specifying -mthumb-interwork does result in slightly larger,
slower code being produced. This is why interworking support must be
specifically enabled by a switch.


2. Explicit interworking support for assembler files
====================================================

If assembler files are to be included in an interworking program
then the following rules must be obeyed:

  * Any externally visible functions must return by using the BX
    instruction.

  * Normal function calls can just use the BL instruction. The
    linker will automatically insert code to switch between ARM
    and Thumb modes as necessary.

  * Calls via function pointers should use the BX instruction if
    the call is made in ARM mode:

        .code 32
        mov lr, pc
        bx rX

    This code sequence will not work in Thumb mode however, since
    the mov instruction will not set the bottom bit of the lr
    register. A branch-and-link to the _call_via_rX functions
    should be used instead:

        .code 16
        bl _call_via_rX

    where rX is replaced by the name of the register containing
    the function address.

  * All externally visible functions which should be entered in
    Thumb mode must have the .thumb_func pseudo op specified just
    before their entry point, e.g.:

                .code 16
                .global function
                .thumb_func
        function:
                ...start of function....

  * All assembler files must be assembled with the switch
    -mthumb-interwork specified on the command line. (If the file
    is assembled by calling gcc it will automatically pass on the
    -mthumb-interwork switch to the assembler, provided that it
    was specified on the gcc command line in the first place.)


3. Support for old, non-interworking aware code.
================================================

If it is necessary to link together code produced by an older,
non-interworking aware compiler, or code produced by the new compiler
but without the -mthumb-interwork command line switch specified, then
there are two command line switches that can be used to support this.

The switch

        -mcaller-super-interworking

will allow calls via function pointers in Thumb mode to work,
regardless of whether the function pointer points to old,
non-interworking aware code or not. Specifying this switch does
produce slightly slower code however.

Note: There is no switch to allow calls via function pointers in ARM
mode to be handled specially. Calls via function pointers from
interworking aware ARM code to non-interworking aware ARM code work
without any special considerations by the compiler. Calls via
function pointers from interworking aware ARM code to non-interworking
aware Thumb code however will not work. (Actually under some
circumstances they may work, but there are no guarantees.) This is
because only the new compiler is able to produce Thumb code, and this
compiler already has a command line switch to produce interworking
aware code.


The switch

        -mcallee-super-interworking

will allow non-interworking aware ARM or Thumb code to call Thumb
functions, either directly or via function pointers. Specifying this
switch does produce slightly larger, slower code however.

Note: There is no switch to allow non-interworking aware ARM or Thumb
code to call ARM functions. There is no need for any special handling
of calls from non-interworking aware ARM code to interworking aware
ARM functions; they just work normally. Calls from non-interworking
aware Thumb functions to ARM code, however, will not work. There is no
option to support this, since it is always possible to recompile the
Thumb code to be interworking aware.

As an alternative to the command line switch
-mcallee-super-interworking, which affects all externally visible
functions in a file, it is possible to specify an attribute or
declspec for individual functions, indicating that that particular
function should support being called by non-interworking aware code.
The function should be defined like this:

        int function (void) __attribute__((interfacearm));

        int function (void)
        {
            ... body of function ...
        }

or

        int function (void) __declspec(interfacearm);

        int function (void)
        {
            ... body of function ...
        }



4. Interworking support in dlltool
==================================

Currently there is no interworking support in dlltool. This may be a
future enhancement.



5. How interworking support works
=================================

Switching between the ARM and Thumb instruction sets is accomplished
via the BX instruction which takes as an argument a register name.
Control is transferred to the address held in this register (with the
bottom bit masked out), and if the bottom bit is set, then Thumb
instruction processing is enabled, otherwise ARM instruction
processing is enabled.

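The effect of BX on a register value can be modelled in a few lines
of C. This is only an illustrative sketch, not code from the toolkit:
the names insn_set, bx_result, bx and bx_examples are made up for the
example, and addresses are treated as plain 32-bit integers.

```c
#include <assert.h>
#include <stdint.h>

/* Instruction set selected by a BX.  The names are illustrative. */
enum insn_set { INSN_ARM, INSN_THUMB };

struct bx_result {
    uint32_t target;      /* address at which execution continues */
    enum insn_set state;  /* instruction set in use from there on */
};

/* Model of "bx rX": bit 0 of the register value selects the
   instruction set and is masked out of the branch target. */
struct bx_result bx(uint32_t reg)
{
    struct bx_result r;
    r.state  = (reg & 1u) ? INSN_THUMB : INSN_ARM;
    r.target = reg & ~(uint32_t)1;
    return r;
}

/* A few checks that document the behaviour. */
void bx_examples(void)
{
    assert(bx(0x8001u).state  == INSN_THUMB);  /* bit 0 set: Thumb  */
    assert(bx(0x8001u).target == 0x8000u);     /* bit 0 masked out  */
    assert(bx(0x8000u).state  == INSN_ARM);    /* bit 0 clear: ARM  */
}
```

Calling bx_examples() from a test program exercises the three cases.
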
When the -mthumb-interwork command line switch is specified, gcc
arranges for all functions to return to their caller by using the BX
instruction. Thus provided that the return address has the bottom bit
correctly initialised to indicate the instruction set of the caller,
correct operation will ensue.

When a function is called explicitly (rather than via a function
pointer), the compiler generates a BL instruction to do this. The
Thumb version of the BL instruction has the special property of
setting the bottom bit of the LR register after it has stored the
return address into it, so that a future BX instruction will correctly
return to the instruction after the BL instruction, in Thumb mode.

The BL instruction does not change modes itself however, so if an ARM
function is calling a Thumb function, or vice versa, it is necessary
to generate some extra instructions to handle this. This is done in
the linker when it is storing the address of the referenced function
into the BL instruction. If the BL instruction is an ARM style BL
instruction, but the referenced function is a Thumb function, then the
linker automatically generates a calling stub that converts from ARM
mode to Thumb mode, puts the address of this stub into the BL
instruction, and puts the address of the referenced function into the
stub. Similarly if the BL instruction is a Thumb BL instruction, and
the referenced function is an ARM function, the linker generates a
stub which converts from Thumb to ARM mode, puts the address of this
stub into the BL instruction, and the address of the referenced
function into the stub.

This is why it is necessary to mark Thumb functions with the
.thumb_func pseudo op when creating assembler files. This pseudo op
allows the assembler to distinguish between ARM functions and Thumb
functions. (The Thumb version of GCC automatically generates these
pseudo ops for any Thumb functions that it generates.)

Calls via function pointers work differently. Whenever the address of
a function is taken, the linker examines the type of the function
being referenced. If the function is a Thumb function, then it sets
the bottom bit of the address. Technically this makes the address
incorrect, since it is now one byte into the start of the function,
but this is never a problem because:

   a. with interworking enabled all calls via function pointer
      are done using the BX instruction and this ignores the
      bottom bit when computing where to go to.

   b. the linker will always set the bottom bit when the address
      of the function is taken, so it is never possible to take
      the address of the function in two different places and
      then compare them and find that they are not equal.

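Points a. and b. can be checked with a small C model. This is a
sketch only: function_address and bx_destination are hypothetical
helper names, and the linker's behaviour is reduced to setting bit 0
for Thumb functions as described above.

```c
#include <assert.h>
#include <stdint.h>

/* Value produced when the address of a function is taken:
   Thumb functions get bit 0 set, ARM functions do not. */
uint32_t function_address(uint32_t entry, int is_thumb)
{
    assert((entry & 1u) == 0);  /* entry points are at least
                                   halfword aligned */
    return is_thumb ? (entry | 1u) : entry;
}

/* Address that a BX to that value actually branches to:
   bit 0 is ignored when computing the destination. */
uint32_t bx_destination(uint32_t value)
{
    return value & ~(uint32_t)1;
}
```

Taking the address of the same Thumb function twice gives the same
value both times, and a BX to that value still lands on the real
entry point.
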
As already mentioned any call via a function pointer will use the BX
instruction (provided that interworking is enabled). The only problem
with this is computing the return address for the return from the
called function. For ARM code this can easily be done by the code
sequence:

        mov lr, pc
        bx rX

(where rX is the name of the register containing the function
pointer). This code does not work for the Thumb instruction set,
since the MOV instruction will not set the bottom bit of the LR
register, so that when the called function returns, it will return in
ARM mode not Thumb mode. Instead the compiler generates this
sequence:

        bl _call_via_rX

(again where rX is the name of the register containing the function
pointer). The special _call_via_rX functions look like this:

                .thumb_func
        _call_via_r0:
                bx r0
                nop

The BL instruction ensures that the correct return address is stored
in the LR register and then the BX instruction jumps to the address
stored in the function pointer, switching modes if necessary.


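The return addresses involved can be worked through with a small C
model. It is a sketch under stated assumptions, not toolkit code: it
assumes the classic ARM7T behaviour where reading PC yields the
address of the current instruction plus 8 in ARM state and plus 4 in
Thumb state, and the function names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* ARM state: "mov lr, pc" at address a, with "bx rX" at a + 4.
   PC reads as a + 8, so LR holds the address of the instruction
   after the BX, with bit 0 clear (return in ARM state). */
uint32_t lr_after_arm_mov(uint32_t a)
{
    assert((a & 3u) == 0);      /* ARM instructions are word aligned */
    return a + 8;
}

/* Thumb state: the same pair at a and a + 2.  PC reads as a + 4,
   which is the right address, but bit 0 is clear, so a callee
   returning with BX would come back in ARM state. */
uint32_t lr_after_thumb_mov(uint32_t a)
{
    assert((a & 1u) == 0);      /* Thumb instructions are halfword
                                   aligned */
    return a + 4;
}

/* Thumb state: "bl _call_via_rX" at a.  The BL occupies four bytes
   and sets bit 0 of LR, so the callee returns in Thumb state. */
uint32_t lr_after_thumb_bl(uint32_t a)
{
    assert((a & 1u) == 0);
    return (a + 4) | 1u;
}
```

The only difference between the two Thumb cases is bit 0 of LR, which
is exactly what a later BX uses to choose the instruction set.
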
6. How caller-super-interworking support works
==============================================

When the -mcaller-super-interworking command line switch is specified
it changes the code produced by the Thumb compiler so that all calls
via function pointers (including virtual function calls) now go via a
different stub function. The code to call via a function pointer now
looks like this:

        bl _interwork_call_via_r0

Note: The compiler does not insist that r0 be used to hold the
function address. Any register will do, and there is a suite of stub
functions, one for each possible register. The stub functions look
like this:

                .code 16
                .thumb_func
        _interwork_call_via_r0:
                bx pc
                nop

                .code 32
                tst r0, #1
                stmeqdb r13!, {lr}
                adreq lr, _arm_return
                bx r0

The stub first switches to ARM mode, since it is a lot easier to
perform the necessary operations using ARM instructions. It then
tests the bottom bit of the register containing the address of the
function to be called. If this bottom bit is set then the function
being called uses Thumb instructions and the BX instruction to come
will switch back into Thumb mode before calling this function. (Note
that it does not matter how this called function chooses to return to
its caller, since both the caller and callee are Thumb functions, and
no mode switching is necessary.) If the function being called is an
ARM mode function however, the stub pushes the return address (with
its bottom bit set) onto the stack, replaces the return address with
the address of a piece of code called '_arm_return' and then
performs a BX instruction to call the function.

The '_arm_return' code looks like this:

                .code 32
        _arm_return:
                ldmia r13!, {r12}
                bx r12
                .code 16


It simply retrieves the return address from the stack, and then
performs a BX operation to return to the caller and switch back into
Thumb mode.


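The decision made by the ARM half of the stub can be sketched in C.
This is an illustrative model only: struct call_state and
interwork_call_via are hypothetical names, and the stack is reduced
to a single saved value.

```c
#include <assert.h>
#include <stdint.h>

/* State seen by the callee after the stub has run. */
struct call_state {
    uint32_t lr;      /* link register on entry to the callee */
    int      pushed;  /* non-zero if the Thumb return address was
                         saved on the stack */
    uint32_t saved;   /* the saved value, valid when pushed != 0 */
};

/* Model of the ARM part of _interwork_call_via_rX.
   fn          value of the function pointer register
   thumb_lr    return address set by the caller's Thumb BL
   arm_return  address of the _arm_return code */
struct call_state interwork_call_via(uint32_t fn, uint32_t thumb_lr,
                                     uint32_t arm_return)
{
    struct call_state s;
    s.lr = thumb_lr;
    s.pushed = 0;
    s.saved = 0;

    if ((fn & 1u) == 0) {        /* tst r0, #1: EQ means ARM callee */
        s.pushed = 1;            /* stmeqdb r13!, {lr}              */
        s.saved = thumb_lr;
        s.lr = arm_return;       /* adreq lr, _arm_return           */
    }
    return s;                    /* "bx r0" follows in the stub     */
}
```

For a Thumb callee nothing is changed. For an ARM callee the original
return address, bottom bit included, is what _arm_return later pops
and passes to BX.
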
7. How callee-super-interworking support works
==============================================

When -mcallee-super-interworking is specified on the command line the
Thumb compiler behaves as if every externally visible function that it
compiles has had the (interfacearm) attribute specified for it. What
this attribute does is to put a special, ARM mode header onto the
function which forces a switch into Thumb mode:

  without __attribute__((interfacearm)):

                .code 16
                .thumb_func
        function:
                ... start of function ...

  with __attribute__((interfacearm)):

                .code 32
        function:
                orr r12, pc, #1
                bx r12

                .code 16
                .thumb_func
        .real_start_of_function:

                ... start of function ...

Note that since the function now expects to be entered in ARM mode, it
no longer has the .thumb_func pseudo op specified for its name.
Instead the pseudo op is attached to a new label .real_start_of_<name>
(where <name> is the name of the function) which indicates the start
of the Thumb code. This does have the interesting side effect that
if this function is now called from a Thumb mode piece of code
outside of the current file, the linker will generate a calling stub
to switch from Thumb mode into ARM mode, and then this is immediately
overridden by the function's header which switches back into Thumb
mode.

In addition the (interfacearm) attribute also forces the function to
return by using the BX instruction, even if it has not been compiled
with the -mthumb-interwork command line flag, so that the correct mode
will be restored upon exit from the function.


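The arithmetic behind the two-instruction ARM header can be checked
with a short C model. This is a sketch, not toolkit code: it assumes
that PC reads as the address of the current instruction plus 8 in ARM
state, and the function names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Address of the Thumb code that follows the ARM header: the header
   is two 4-byte ARM instructions ("orr" and "bx"). */
uint32_t real_start_of(uint32_t header)
{
    assert((header & 3u) == 0);   /* ARM code is word aligned */
    return header + 8;
}

/* Value left in r12 by "orr r12, pc, #1" at the start of the header:
   PC reads as header + 8, and bit 0 is set to request Thumb state. */
uint32_t header_r12(uint32_t header)
{
    assert((header & 3u) == 0);
    return (header + 8) | 1u;
}
```

So the "bx r12" that follows branches to the first Thumb instruction
after the header and switches to Thumb state in one step.
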
8. Some examples
================

Given this test file:

        int func (void) { return 1; }

        int call (int (* ptr)(void)) { return ptr (); }

The following varying pieces of assembler are produced depending upon
the command line options used:

no options:

        @ Generated by gcc cygnus-2.91.07 980205 (gcc-2.8.0 release) for ARM/pe
                .code 16
                .text
                .globl _func
                .thumb_func
        _func:
                mov r0, #1
                bx lr

                .globl _call
                .thumb_func
        _call:
                push {lr}
                bl __call_via_r0
                pop {pc}

Note how the two functions have different exit sequences. In
particular call() uses pop {pc} to return. This would not work if the
caller was in ARM mode.

If -mthumb-interwork is specified on the command line:

        @ Generated by gcc cygnus-2.91.07 980205 (gcc-2.8.0 release) for ARM/pe
                .code 16
                .text
                .globl _func
                .thumb_func
        _func:
                mov r0, #1
                bx lr

                .globl _call
                .thumb_func
        _call:
                push {lr}
                bl __call_via_r0
                pop {r1}
                bx r1

This time both functions return by using the BX instruction. This
means that call() is now two bytes longer and several cycles slower
than the version that is not interworking enabled.

If -mcaller-super-interworking is specified:

        @ Generated by gcc cygnus-2.91.07 980205 (gcc-2.8.0 release) for ARM/pe
                .code 16
                .text
                .globl _func
                .thumb_func
        _func:
                mov r0, #1
                bx lr

                .globl _call
                .thumb_func
        _call:
                push {lr}
                bl __interwork_call_via_r0
                pop {pc}

This is very similar to the first (non-interworking) version, except
that a different stub is used to call via the function pointer. Note
that the assembly code for call() is not interworking aware, and so
should not be called from ARM code.

If -mcallee-super-interworking is specified:

        @ Generated by gcc cygnus-2.91.07 980205 (gcc-2.8.0 release) for ARM/pe
                .code 16
                .text
                .globl _func
                .code 32
        _func:
                orr r12, pc, #1
                bx r12
                .code 16
                .globl .real_start_of_func
                .thumb_func
        .real_start_of_func:
                mov r0, #1
                bx lr

                .globl _call
                .code 32
        _call:
                orr r12, pc, #1
                bx r12
                .code 16
                .globl .real_start_of_call
                .thumb_func
        .real_start_of_call:
                push {lr}
                bl __call_via_r0
                pop {r1}
                bx r1

Now both functions have an ARM coded prologue, and both functions
return by using the BX instruction. These functions are therefore
interworking aware and can safely be called from ARM code. The code
for the call() function is now 10 bytes longer than the original,
non-interworking aware version (18 bytes rather than 8), an increase
of 125%.

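The size figures quoted for call() can be reproduced by adding up
instruction sizes. The following C sketch does that arithmetic; it
assumes 4 bytes per ARM instruction, 2 bytes per Thumb instruction
and 4 bytes for the Thumb BL pair, and its names are illustrative
only.

```c
#include <assert.h>

enum {
    ARM_INSN   = 4,  /* every ARM instruction               */
    THUMB_INSN = 2,  /* an ordinary Thumb instruction       */
    THUMB_BL   = 4   /* Thumb BL is a pair of halfwords     */
};

/* call() with no options: push {lr}; bl; pop {pc} */
unsigned call_size_plain(void)
{
    return THUMB_INSN + THUMB_BL + THUMB_INSN;
}

/* call() with -mthumb-interwork: push {lr}; bl; pop {r1}; bx r1 */
unsigned call_size_interwork(void)
{
    return THUMB_INSN + THUMB_BL + THUMB_INSN + THUMB_INSN;
}

/* call() with -mcallee-super-interworking: the ARM header
   (orr; bx) followed by the interworking Thumb body. */
unsigned call_size_callee_super(void)
{
    return 2 * ARM_INSN + call_size_interwork();
}
```

This gives 8, 10 and 18 bytes respectively, matching the "two bytes
longer" and "10 bytes longer" figures in the text.
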
If the source code is slightly altered so that only the call function
has an (interfacearm) attribute:

        int func (void) { return 1; }
        int call () __attribute__((interfacearm));
        int call (int (* ptr)(void)) { return ptr (); }
        int main (void) { return printf ("result: %d\n", call (func)); }

then this code is produced (with no command line switches):

        @ Generated by gcc cygnus-2.91.07 980205 (gcc-2.8.0 release) for ARM/pe
                .code 16
                .text
                .globl _func
                .thumb_func
        _func:
                mov r0, #1
                bx lr

                .globl _call
                .code 32
        _call:
                orr r12, pc, #1
                bx r12
                .code 16
                .globl .real_start_of_call
                .thumb_func
        .real_start_of_call:
                push {lr}
                bl __call_via_r0
                pop {r1}
                bx r1

                .globl _main
                .thumb_func
        _main:
                push {r4, lr}
                bl ___gccmain
                ldr r4, .L4
                ldr r0, .L4+4
                bl _call
                add r1, r0, #0
                add r0, r4, #0
                bl _printf
                pop {r4, pc}
        .L4:
                .word .LC0
                .word _func

                .section .rdata
        .LC0:
                .ascii "result: %d\n\000"

So now only call() can be called from non-interworking aware ARM code.
When this program is assembled, the assembler detects the fact that
main() is calling call() in Thumb mode, and so automatically adjusts
the BL instruction to point to the real start of call():

        Disassembly of section .text:

        00000028 <_main>:
          28:   b530        b530    push {r4, r5, lr}
          2a:   fffef7ff    f7ff    bl 2a <_main+0x2>
          2e:   4d06        4d06    ldr r5, [pc, #24] (48 <.L7>)
          30:   ffe8f7ff    f7ff    bl 4 <_doit>
          34:   1c04        1c04    add r4, r0, #0
          36:   4805        4805    ldr r0, [pc, #20] (4c <.L7+0x4>)
          38:   fff0f7ff    f7ff    bl 1c <.real_start_of_call>
          3c:   1824        1824    add r4, r4, r0
          3e:   1c28        1c28    add r0, r5, #0
          40:   1c21        1c21    add r1, r4, #0
          42:   fffef7ff    f7ff    bl 42 <_main+0x1a>
          46:   bd30        bd30    pop {r4, r5, pc}