dyncall library - C foreign function interface


Calling Conventions

Before we go any further

It is important to understand that this section isn’t a general purpose description of the present calling conventions. It merely explains the calling conventions for the parameter/return types supported by dyncall (not for e.g. unsupported types like SIMD data types (__m64, __m128, __m128i, __m128d), etc.).
We strongly advise the reader not to use this document as a general purpose calling convention reference.

x86 Calling Conventions

Overview

On this processor, a word is defined to be 16 bits in size, a dword 32 bits and a qword 64 bits.
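As a small illustration of this terminology, the three sizes can be pinned down in C with fixed-width types. This is only a sketch: the type names `x86_word`, `x86_dword` and `x86_qword` are hypothetical and not part of dyncall.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical type names mirroring the x86 terminology used above. */
typedef uint16_t x86_word;   /* word:  16 bits */
typedef uint32_t x86_dword;  /* dword: 32 bits */
typedef uint64_t x86_qword;  /* qword: 64 bits */

/* Compile-time checks that the sizes match the definitions. */
static_assert(sizeof(x86_word)  * CHAR_BIT == 16, "word must be 16 bits");
static_assert(sizeof(x86_dword) * CHAR_BIT == 32, "dword must be 32 bits");
static_assert(sizeof(x86_qword) * CHAR_BIT == 64, "qword must be 64 bits");
```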

There are numerous different calling conventions on the x86 processor architecture, like cdecl [8], MS fastcall [10], GNU fastcall [11], Borland fastcall [12], Watcom fastcall [13], Win32 stdcall [9], MS thiscall [14], GNU thiscall [15], the pascal calling convention [16] and a cdecl-like version for Plan9 [17] (dubbed plan9call by us), etc.

Name              # of regs   # regs to  push order     cleanup by  64bit args
                  for params  preserve                              via regs?

cdecl             0           4          right-to-left  caller      -
MS fastcall       2           4          right-to-left  callee      Y
GNU fastcall      2           4          right-to-left  callee      N
Borland fastcall  3           4          left-to-right  callee      N
Watcom fastcall   4           2-6        right-to-left  callee      N
win32 stdcall     0           4          right-to-left  callee      -
MS thiscall       1           4          right-to-left  callee      N
GNU thiscall      0           4          right-to-left  caller      -
pascal            0           4          left-to-right  callee      -
plan9call         0           0          right-to-left  caller      -
Table 10: short x86 calling convention comparison
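To show how the comparison table can be read, the following sketch encodes its register count and cleanup columns as data and derives how many bytes of arguments end up on the stack. It assumes that every argument is a 32-bit integer or pointer (4 bytes each); the names `x86_conv`, `x86_conv_find` and `x86_stack_arg_bytes` are hypothetical and not part of dyncall.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One row of the comparison table (subset of its columns). */
struct x86_conv {
    const char *name;
    int reg_params;      /* # of registers used for parameters */
    int callee_cleanup;  /* 1: callee cleans up the stack, 0: caller does */
};

static const struct x86_conv x86_convs[] = {
    { "cdecl",            0, 0 },
    { "MS fastcall",      2, 1 },
    { "GNU fastcall",     2, 1 },
    { "Borland fastcall", 3, 1 },
    { "Watcom fastcall",  4, 1 },
    { "win32 stdcall",    0, 1 },
    { "MS thiscall",      1, 1 },
    { "GNU thiscall",     0, 0 },
    { "pascal",           0, 1 },
    { "plan9call",        0, 0 },
};

/* Look up a convention by name; returns NULL if it is not in the table. */
static const struct x86_conv *x86_conv_find(const char *name)
{
    size_t i;
    for (i = 0; i < sizeof x86_convs / sizeof x86_convs[0]; ++i)
        if (strcmp(x86_convs[i].name, name) == 0)
            return &x86_convs[i];
    return NULL;
}

/* Bytes of stack taken by n_args arguments, ASSUMING 4 bytes per argument
   and that the first reg_params arguments travel in registers. */
static int x86_stack_arg_bytes(const struct x86_conv *c, int n_args)
{
    int on_stack;
    assert(c != NULL && n_args >= 0);
    on_stack = n_args > c->reg_params ? n_args - c->reg_params : 0;
    return on_stack * 4;
}
```

For a callee-cleanup convention, the value returned by `x86_stack_arg_bytes` is the amount the callee removes from the stack on return; for a caller-cleanup convention, the caller adjusts the stack pointer by that amount after the call.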
dyncall support

Currently cdecl, stdcall, fastcall (MS and GNU), thiscall (MS and GNU) and plan9call are supported.
Dyncall can also be used to issue syscalls on Linux and *BSD by using the syscall number as target parameter and selecting the correct mode.

cdecl

Registers and register usage
Name Brief description


eax scratch, return value
ebx preserve
ecx scratch
edx scratch, return value
esi preserve
edi preserve
ebp preserve
esp stack pointer
st0 scratch, floating point return value
st1-st7 scratch
Table 11: Register usage on x86 cdecl calling convention

Parameter passing

Return values

Stack layout

Stack directly after function prolog:

  | ...                                  |
  |--------------------------------------|
  | register save area                   | \
  | local data                           |  |
  | parameter area (stack parameters):   |  |
  |   arg n-1                            |  | caller’s frame
  |   ...                                |  |
  |   arg 0                              |  |
  | return address                       | /
  |--------------------------------------|
  | register save area                   | \
  | local data                           |  | current frame
  | parameter area                       | /
  |--------------------------------------|
  | ...                                  |
Figure 1: Stack layout on x86 cdecl calling convention

MS fastcall

Registers and register usage
Name Brief description


eax scratch, return value
ebx preserve
ecx scratch, parameter 0
edx scratch, parameter 1, return value
esi preserve
edi preserve
ebp preserve
esp stack pointer
st0 scratch, floating point return value
st1-st7 scratch
Table 12: Register usage on x86 fastcall (MS) calling convention

Parameter passing

Return values

Stack layout

Stack directly after function prolog:

  | ...                                  |
  |--------------------------------------|
  | register save area                   | \
  | local data                           |  |
  | parameter area (stack parameters):   |  |
  |   last arg                           |  | caller’s frame
  |   ...                                |  |
  |   first arg passed via stack         |  |
  | return address                       | /
  |--------------------------------------|
  | register save area                   | \
  | local data                           |  | current frame
  | parameter area                       | /
  |--------------------------------------|
  | ...                                  |
Figure 2: Stack layout on x86 fastcall (MS) calling convention

GNU fastcall

Registers and register usage
Name Brief description


eax scratch, return value
ebx preserve
ecx scratch, parameter 0
edx scratch, parameter 1, return value
esi preserve
edi preserve
ebp preserve
esp stack pointer
st0 scratch, floating point return value
st1-st7 scratch
Table 13: Register usage on x86 fastcall (GNU) calling convention

Parameter passing

Return values

Stack layout

Stack directly after function prolog:

  | ...                                  |
  |--------------------------------------|
  | register save area                   | \
  | local data                           |  |
  | parameter area (stack parameters):   |  |
  |   last arg                           |  | caller’s frame
  |   ...                                |  |
  |   first arg passed via stack         |  |
  | return address                       | /
  |--------------------------------------|
  | register save area                   | \
  | local data                           |  | current frame
  | parameter area                       | /
  |--------------------------------------|
  | ...                                  |
Figure 3: Stack layout on x86 fastcall (GNU) calling convention

Borland fastcall

Also called register convention by Borland.

Registers and register usage

Name Brief description


eax scratch, parameter 0, return value
ebx preserve
ecx scratch, parameter 2
edx scratch, parameter 1, return value
esi preserve
edi preserve
ebp preserve
esp stack pointer
st0 scratch, floating point return value
st1-st7 scratch
Table 14: Register usage on x86 fastcall (Borland) calling convention

Parameter passing

Return values

Stack layout

Stack directly after function prolog:

  | ...                                  |
  |--------------------------------------|
  | register save area                   | \
  | local data                           |  |
  | parameter area (stack parameters):   |  |
  |   first arg passed via stack         |  | caller’s frame
  |   ...                                |  |
  |   last arg                           |  |
  | return address                       | /
  |--------------------------------------|
  | register save area                   | \
  | local data                           |  | current frame
  | parameter area                       | /
  |--------------------------------------|
  | ...                                  |
Figure 4: Stack layout on x86 fastcall (Borland) calling convention

Watcom fastcall

Registers and register usage
Name Brief description


eax scratch, parameter 0, return value
ebx scratch when used for parameter, otherwise preserve, parameter 2
ecx scratch when used for parameter, otherwise preserve, parameter 3
edx scratch when used for parameter, otherwise preserve, parameter 1, return value
esi scratch when used for return pointer, otherwise preserve
edi preserve
ebp preserve
esp stack pointer
st0 scratch, floating point return value
st1-st7 scratch
Table 15: Register usage on x86 fastcall (Watcom) calling convention

Parameter passing

Return values

Stack layout

Stack directly after function prolog:

  | ...                                  |
  |--------------------------------------|
  | register save area                   | \
  | local data                           |  |
  | parameter area (stack parameters):   |  |
  |   last arg                           |  | caller’s frame
  |   ...                                |  |
  |   first arg passed via stack         |  |
  | return address                       | /
  |--------------------------------------|
  | register save area                   | \
  | local data                           |  | current frame
  | parameter area                       | /
  |--------------------------------------|
  | ...                                  |
Figure 5: Stack layout on x86 fastcall (Watcom) calling convention

win32 stdcall

Registers and register usage
Name Brief description


eax scratch, return value
ebx preserve
ecx scratch
edx scratch, return value
esi preserve
edi preserve
ebp preserve
esp stack pointer
st0 scratch, floating point return value
st1-st7 scratch
Table 16: Register usage on x86 stdcall calling convention

Parameter passing

Return values

Stack layout

Stack directly after function prolog:

  | ...                                  |
  |--------------------------------------|
  | register save area                   | \
  | local data                           |  |
  | parameter area (stack parameters):   |  |
  |   arg n-1                            |  | caller’s frame
  |   ...                                |  |
  |   arg 0                              |  |
  | return address                       | /
  |--------------------------------------|
  | register save area                   | \
  | local data                           |  | current frame
  | parameter area                       | /
  |--------------------------------------|
  | ...                                  |
Figure 6: Stack layout on x86 stdcall calling convention

MS thiscall

Registers and register usage
Name Brief description


eax scratch, return value
ebx preserve
ecx scratch, parameter 0
edx scratch, return value
esi preserve
edi preserve
ebp preserve
esp stack pointer
st0 scratch, floating point return value
st1-st7 scratch
Table 17: Register usage on x86 thiscall (MS) calling convention

Parameter passing

Return values

Stack layout

Stack directly after function prolog:

  | ...                                  |
  |--------------------------------------|
  | register save area                   | \
  | local data                           |  |
  | parameter area (stack parameters):   |  |
  |   arg n-1                            |  | caller’s frame
  |   ...                                |  |
  |   arg 1                              |  |
  | return address                       | /
  |--------------------------------------|
  | register save area                   | \
  | local data                           |  | current frame
  | parameter area                       | /
  |--------------------------------------|
  | ...                                  |
Figure 7: Stack layout on x86 thiscall (MS) calling convention

GNU thiscall

This is equivalent to the cdecl calling convention, with the first parameter being the this pointer.
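To make the equivalence concrete, a member function can be pictured as a plain cdecl function that receives the object pointer as its first argument. The sketch below is plain C; `struct counter` and `counter_add` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical object type standing in for a C++ class. */
struct counter {
    int value;
};

/* Stand-in for a member function `int counter::add(int n)`: the this
   pointer is simply passed as the first parameter, like any other cdecl
   argument. */
static int counter_add(struct counter *self, int n)
{
    assert(self != NULL);
    self->value += n;
    return self->value;
}
```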

pascal

The best known uses of the pascal calling convention are the 16-bit OS/2 APIs, Microsoft Windows 3.x and Borland Delphi 1.x. It is a variation of stdcall; however, arguments are passed left-to-right. Since this calling convention is for 16-bit APIs, it is not discussed in further detail here.

plan9call

Registers and register usage
Name Brief description


eax scratch, return value
ebx scratch
ecx scratch
edx scratch
esi scratch
edi scratch
ebp scratch
esp stack pointer
st0 scratch, floating point return value
st1-st7 scratch
Table 18: Register usage on x86 plan9call calling convention

Parameter passing

Return values

Stack layout

Note there is no register save area at all.

Stack directly after function prolog:

  | ...                                  |
  |--------------------------------------|
  | local data                           | \
  | parameter area (stack parameters):   |  |
  |   arg n-1                            |  | caller’s frame
  |   ...                                |  |
  |   arg 0                              |  |
  | return address                       | /
  |--------------------------------------|
  | local data                           | \ current frame
  | parameter area                       | /
  |--------------------------------------|
  | ...                                  |
Figure 8: Stack layout on x86 plan9call calling convention

Linux syscalls

Parameter passing

*BSD syscalls

Parameter passing

x64 Calling Conventions

Overview

The x64 (64bit) architecture designed by AMD is based on Intel’s x86 (32bit) architecture, supporting it natively. It is sometimes referred to as x86-64, AMD64, or, cloned by Intel, EM64T or Intel64.
On this processor, a word is defined to be 16 bits in size, a dword 32 bits and a qword 64 bits. Note that this is due to historical reasons (terminology didn’t change with the introduction of 32 and 64 bit processors).
The x64 calling convention for MS Windows [25] differs from the System V x64 calling convention [26] used by Linux/*BSD/... Note that this is not the only difference between these operating systems. The 64-bit programming model used by 64-bit Windows is LLP64, meaning that the C types int and long remain 32 bits in size, whereas long long and pointers are 64 bits. Under Linux/*BSD/... it is LP64, where long is 64 bits as well.
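The two programming models can be told apart from the sizes of a few C types. The following is a minimal sketch; the enum and the function names `data_model_of` and `current_data_model` are hypothetical and not part of dyncall.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification of the two 64-bit programming models. */
enum data_model { DM_LLP64, DM_LP64, DM_OTHER };

/* Classify a data model from the sizes (in bytes) of int, long and
   pointers. Anything that is neither LLP64 nor LP64 yields DM_OTHER. */
static enum data_model data_model_of(size_t int_sz, size_t long_sz, size_t ptr_sz)
{
    if (int_sz == 4 && long_sz == 4 && ptr_sz == 8)
        return DM_LLP64;  /* 64-bit Windows: long stays 32 bits */
    if (int_sz == 4 && long_sz == 8 && ptr_sz == 8)
        return DM_LP64;   /* Linux, *BSD, ...: long is 64 bits */
    return DM_OTHER;
}

/* Data model of the platform this code is compiled for. */
static enum data_model current_data_model(void)
{
    return data_model_of(sizeof(int), sizeof(long), sizeof(void *));
}
```

Code that passes a `long` through a foreign function interface has to take this difference into account, since the same C declaration occupies 4 bytes under LLP64 and 8 bytes under LP64.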

Compared to the x86 architecture, the 64-bit versions of the registers are called rax, rbx, etc. Furthermore, there are eight new general purpose registers r8-r15.

dyncall support

Currently, the MS Windows and System V calling conventions are supported.
Dyncall can also be used to issue syscalls on System V platforms by using the syscall number as target parameter and selecting the correct mode.

MS Windows

Registers and register usage
Name Brief description


rax scratch, return value
rbx permanent
rcx scratch, parameter 0 if integer or pointer
rdx scratch, parameter 1 if integer or pointer
rdi permanent
rsi permanent
rbp permanent, may be used as frame pointer
rsp stack pointer
r8-r9 scratch, parameter 2 and 3 if integer or pointer
r10-r11 scratch, permanent if required by caller (used for syscall/sysret)
r12-r15 permanent
xmm0 scratch, floating point parameter 0, floating point return value
xmm1-xmm3 scratch, floating point parameters 1-3
xmm4-xmm5 scratch, permanent if required by caller
xmm6-xmm15 permanent
Table 19: Register usage on x64 MS Windows platform

Parameter passing

Return values

Stack layout

Stack frame is always 16-byte aligned. Stack directly after function prolog:

  | ...                                  |
  |--------------------------------------|
  | register save area                   | \
  | local data                           |  |
  | parameter area:                      |  |
  |   stack parameters:                  |  |
  |     arg n-1                          |  |
  |     ...                              |  |
  |     arg 4                            |  | caller’s frame
  |   spill area:                        |  |
  |     r9 or xmm3                       |  |
  |     r8 or xmm2                       |  |
  |     rdx or xmm1                      |  |
  |     rcx or xmm0                      |  |
  | return address                       | /
  |--------------------------------------|
  | register save area                   | \
  | local data                           |  | current frame
  | parameter area                       | /
  |--------------------------------------|
  | ...                                  |
Figure 9: Stack layout on x64 Microsoft platform
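The register table and the figure above can be summarized in a few lines of C. This is a sketch covering only integer, pointer and floating point scalars of at most 8 bytes; the function names are hypothetical, and rounding the parameter area to 16 bytes is a simplification of the frame alignment rule.

```c
#include <assert.h>
#include <stddef.h>

/* Register used for the argument at position `index` (0-based): the
   position selects the slot, the type selects between the integer and the
   floating point register of that slot. Returns NULL if the argument is
   passed on the stack. */
static const char *win64_arg_register(int index, int is_float)
{
    static const char *const int_regs[4]   = { "rcx",  "rdx",  "r8",   "r9"   };
    static const char *const float_regs[4] = { "xmm0", "xmm1", "xmm2", "xmm3" };
    assert(index >= 0);
    if (index >= 4)
        return NULL;
    return is_float ? float_regs[index] : int_regs[index];
}

/* Size in bytes of the parameter area for n_args arguments, ASSUMING
   8 bytes per argument: the four spill slots are always present, and the
   total is rounded up to a multiple of 16. */
static size_t win64_param_area_size(size_t n_args)
{
    size_t slots = n_args < 4 ? 4 : n_args;
    return (slots * 8 + 15) & ~(size_t)15;
}
```

Note that a floating point argument in position 1 goes to xmm1, not xmm0: the register index follows the argument position, which is why each spill slot in the figure is labelled with both an integer and a floating point register.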

System V (Linux / *BSD / MacOS X)

Registers and register usage
Name Brief description


rax scratch, return value, special use for varargs (in al, see below)
rbx permanent
rcx scratch, parameter 3 if integer or pointer
rdx scratch, parameter 2 if integer or pointer, return value
rdi scratch, parameter 0 if integer or pointer
rsi scratch, parameter 1 if integer or pointer
rbp permanent, may be used as frame pointer
rsp stack pointer
r8-r9 scratch, parameter 4 and 5 if integer or pointer
r10-r11 scratch
r12-r15 permanent
xmm0-xmm1 scratch, floating point parameters 0-1, floating point return value
xmm2-xmm7 scratch, floating point parameters 2-7
xmm8-xmm15 scratch
st0-st1 scratch, 16 byte floating point return value
st2-st7 scratch
Table 20: Register usage on x64 System V (Linux/*BSD)

Parameter passing

Return values

Stack layout

Stack frame is always 16-byte aligned. A 128-byte zone beyond the location pointed to by the stack pointer is referred to as the "red zone"; it is considered reserved and is not modified by signal or interrupt handlers (useful for temporary data that does not need to be preserved across calls, and for optimizing leaf functions). Stack directly after function prolog:

  | ...                                  |
  |--------------------------------------|
  | register save area                   | \
  | local data (with padding)            |  |
  | parameter area (stack parameters):   |  |
  |   arg n-1                            |  | caller’s frame
  |   ...                                |  |
  |   arg 6                              |  |
  | return address                       | /
  |--------------------------------------|
  | register save area                   | \
  | local data                           |  | current frame
  | parameter area                       | /
  |--------------------------------------|
  | ...                                  |
Figure 10: Stack layout on x64 System V (Linux/*BSD)
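In contrast to the MS Windows convention, integer/pointer arguments and floating point arguments are counted independently here, as the parameter numbers in Table 20 suggest. The following sketch assigns registers to a sequence of scalar arguments under that rule; it ignores aggregates, varargs and long double, and the names `sysv_state` and `sysv_next_arg` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Running state while assigning arguments from left to right. */
struct sysv_state {
    int next_int;    /* next free integer register slot (0..6) */
    int next_float;  /* next free floating point slot   (0..8) */
};

/* Assign the next argument and return its register name, or NULL if it
   goes onto the stack. */
static const char *sysv_next_arg(struct sysv_state *st, int is_float)
{
    static const char *const int_regs[6] =
        { "rdi", "rsi", "rdx", "rcx", "r8", "r9" };
    static const char *const float_regs[8] =
        { "xmm0", "xmm1", "xmm2", "xmm3", "xmm4", "xmm5", "xmm6", "xmm7" };
    assert(st != NULL);
    if (is_float)
        return st->next_float < 8 ? float_regs[st->next_float++] : NULL;
    return st->next_int < 6 ? int_regs[st->next_int++] : NULL;
}
```

For a hypothetical function taking (int, double, int), this yields rdi, xmm0 and rsi, whereas the MS Windows convention would use rcx, xmm1 and r8 for the same signature.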

System V syscalls

Parameter passing

PowerPC (32bit) Calling Conventions

Overview

dyncall support

Dyncall and dyncallback are supported for PowerPC (32bit) Big Endian (MSB), for Darwin’s and System V’s calling convention.
Dyncall can also be used to issue syscalls by using the syscall number as target parameter and selecting the correct mode.

Mac OS X/Darwin

Registers and register usage
Name Brief description


gpr0 scratch
gpr1 stack pointer
gpr2 scratch
gpr3,gpr4 return value, parameter 0 and 1 for integer or pointer, scratch
gpr5-gpr10 parameter 2-7 for integer or pointer parameters, scratch
gpr11 preserve
gpr12 branch target for dynamic code generation
gpr13-31 preserve
fpr0 scratch
fpr1 floating point return value, floating point parameter 0 (always double precision)
fpr2-fpr13 floating point parameters 1-12 (always double precision)
fpr14-fpr31 preserve
v0-v1 scratch
v2-v13 vector parameters
v14-v19 scratch
v20-v31 preserve
lr link-register, scratch
ctr count-register, scratch
cr0-cr7 conditional register fields, each 4-bit wide (cr0-cr1 and cr5-cr7 are scratch)
Table 21: Register usage on Darwin PowerPC 32-Bit

Parameter passing

Return values

Stack layout

Stack frame is always 16-byte aligned. Prolog opens frame with additional, fixed space for a linkage area, to hold a number of values (not all of them are required to be saved, though). Stack directly after function prolog:

  | ...                                  |
  |--------------------------------------|
  | register save area                   | \
  | local data                           |  |
  | parameter area:                      |  |
  |   stack parameters:                  |  |
  |     last arg                         |  |
  |     ...                              |  |
  |     9th word of arg data             |  |
  |   spill area (as needed):            |  |
  |     gpr10                            |  | caller’s frame
  |     ...                              |  |
  |     gpr3                             |  |
  | linkage area:                        |  |
  |   reserved                           |  |
  |   reserved                           |  |
  |   reserved                           |  |
  |   return address (callee saved)      |  |
  |   condition reg (callee saved)       |  |
  |   parent stack frame pointer         | /
  |--------------------------------------|
  | register save area                   | \
  | local data                           |  | current frame
  | parameter area                       |  |
  | linkage area                         | /
  |--------------------------------------|
  | ...                                  |
Figure 11: Stack layout on ppc32 Darwin

System V PPC 32-bit

Status

Registers and register usage
Name Brief description


r0 scratch
r1 stack pointer, preserve
r2 system-reserved
r3-r4 parameter passing and return value, scratch
r5-r10 parameter passing, scratch
r11-r12 scratch
r13 small data area pointer register
r14-r30 local variables, preserve
r31 used for local variables or environment pointer, preserve
f0 scratch
f1 parameter passing and return value, scratch
f2-f8 parameter passing, scratch
f9-13 scratch
f14-f31 local variables, preserve
cr0-cr7 conditional register fields, each 4-bit wide (cr0-cr1 and cr5-cr7 are scratch)
lr link register, scratch
ctr count register, scratch
xer fixed-point exception register, scratch
fpscr floating-point Status and Control Register
Table 22: Register usage on System V ABI PowerPC Processor

Parameter passing

Return values

Stack layout

Stack frame is always 16-byte aligned. Stack directly after function prolog:

  | ...                                  |
  |--------------------------------------|
  | register save area                   | \
  | local data                           |  |
  | parameter area (stack parameters):   |  |
  |   last arg                           |  |
  |   ...                                |  | caller’s frame
  |   first arg passed via stack         |  |
  | return address (callee saved)        |  |
  | parent stack frame pointer           | /
  |--------------------------------------|
  | register save area                   | \
  | local data                           |  | current frame
  | parameter area                       | /
  |--------------------------------------|
  | ...                                  |
Figure 12: Stack layout on System V ABI for PowerPC 32-bit calling convention

System V PPC 32-bit / Linux Standard Base version

This is in essence the same as the System V PPC 32-bit calling convention, but differs for aggregate return values:

System V syscalls

Parameter passing

PowerPC (64bit) Calling Conventions

Overview

dyncall support

Dyncall and dyncallback are supported for PowerPC (64bit) Big Endian and Little Endian ELF ABIs on System V systems. Mac OS X is not supported.
Dyncall can also be used to issue syscalls by using the syscall number as target parameter and selecting the correct mode.

PPC64 ELF ABI

Registers and register usage
Name Brief description


gpr0 scratch
gpr1 stack pointer
gpr2 TOC base ptr (offset table and data for position independent code), scratch
gpr3 return value, parameter 0 for integer or pointer, scratch
gpr4-gpr10 parameter 1-7 for integer or pointer parameters, scratch
gpr11 env pointer if needed, scratch
gpr12 used for exception handling and glink code, scratch
gpr13 used for system thread ID, preserve
gpr14-31 preserve
fpr0 scratch
fpr1-fpr4 floating point return value, floating point parameter 0-3 (always double precision)
fpr5-fpr13 floating point parameters 4-12 (always double precision)
fpr14-fpr31 preserve
v0-v1 scratch
v2-v13 vector parameters
v14-v19 scratch
v20-v31 preserve
lr link-register, scratch
ctr count-register, scratch
xer fixed point exception register, scratch
fpscr floating point status and control register, scratch
cr0-cr7 conditional register fields, each 4-bit wide (cr0-cr1 and cr5-cr7 are scratch)
Table 23: Register usage on PowerPC 64-Bit ELF ABI

Parameter passing

Return values

Stack layout

Stack frame is always 16-byte aligned. Stack directly after function prolog:

  | ...                                  |
  |--------------------------------------|
  | register save area                   | \
  | local data                           |  |
  | parameter area:                      |  |
  |   stack parameters:                  |  |
  |     last arg                         |  |
  |     ...                              |  |
  |     arg 8                            |  |
  |   spill area (as needed):            |  |
  |     gpr10                            |  | caller’s frame
  |     ...                              |  |
  |     gpr3                             |  |
  | linkage area:                        |  |
  |   TOC ptr reg                        |  |
  |   reserved                           |  |
  |   reserved                           |  |
  |   return address (callee saved)      |  |
  |   condition reg (callee saved)       |  |
  |   parent stack frame pointer         | /
  |--------------------------------------|
  | register save area                   | \
  | local data                           |  | current frame
  | parameter area                       |  |
  | linkage area                         | /
  |--------------------------------------|
  | ...                                  |
Figure 13: Stack layout on ppc64 ELF ABI

System V syscalls

Parameter passing

ARM32 Calling Conventions

Overview

The ARM32 family of processors is based on the Advanced RISC Machines (ARM) processor architecture (32 bit RISC). The word size is 32 bits (and the programming model is ILP32).
Basically, this family of microprocessors can be run in 2 major modes:

Mode Description


ARM 32bit instruction set
THUMB compressed instruction set using 16bit wide instruction encoding


For more details, take a look at the ARM-THUMB Procedure Call Standard (ATPCS) [18], the Procedure Call Standard for the ARM Architecture (AAPCS) [19], as well as Debian’s ARM EABI port [23] and hard-float [24] wiki pages.

dyncall support

Currently, the dyncall library supports the ARM and THUMB modes of the ARM32 family (ATPCS [18], EABI [23], the ARM hard-float (armhf) [24] variant, as well as Apple’s calling convention based on the ATPCS), excluding manually triggered ARM-THUMB interworking calls.
Also supported is armhf, a calling convention with register support to pass floating point numbers. FPA and the VFP (scalar mode) procedure call standards, as well as some instruction sets accelerating DSP and multimedia applications like the ARM Jazelle Technology (direct Java bytecode execution, providing acceleration for some bytecodes while calling software code for others), etc., are not supported by the dyncall library.

ATPCS ARM mode

Registers and register usage

In ARM mode, the ARM32 processor has sixteen 32 bit general purpose registers, namely r0-r15:

Name Alias Brief description



r0 a1 parameter 0, scratch, return value
r1 a2 parameter 1, scratch, return value
r2,r3 a3,a4 parameters 2 and 3, scratch
r4-r9 v1-v6 permanent
r10 sl permanent
r11 fp frame pointer, permanent
r12 ip scratch
r13 sp stack pointer, permanent
r14 lr link register, permanent
r15 pc program counter (note: due to pipeline, r15 points to 2 instructions ahead)
Table 24: Register usage on arm32

Parameter passing

Return values

Stack layout

Stack directly after function prolog:

  | ...                                  |
  |--------------------------------------|
  | register save area                   | \
  | local data                           |  |
  | parameter area:                      |  |
  |   stack parameters:                  |  | caller’s frame
  |     last arg                         |  |
  |     ...                              |  |
  |     5th word of arg data             | /
  |--------------------------------------|
  |   spill area (if needed):            | \
  |     r3                               |  |
  |     r2                               |  |
  |     r1                               |  |
  |     r0                               |  | current frame
  | register save area                   |  |
  |   (with return address)              |  |
  | local data                           |  |
  | parameter area                       | /
  |--------------------------------------|
  | ...                                  |
Figure 14: Stack layout on arm32

ATPCS THUMB mode

Status

Registers and register usage

In THUMB mode, the ARM32 processor family supports eight 32 bit general purpose registers r0-r7 and access to high order registers r8-r15:

Name Alias Brief description



r0 a1 parameter 0, scratch, return value
r1 a2 parameter 1, scratch, return value
r2,r3 a3,a4 parameters 2 and 3, scratch
r4-r6 v1-v3 permanent
r7 v4 frame pointer, permanent
r8-r11 v5-v8 permanent
r12 ip scratch
r13 sp stack pointer, permanent
r14 lr link register, permanent
r15 pc program counter (note: due to pipeline, r15 points to 2 instructions ahead)
Table 25: Register usage on arm32 thumb mode

Parameter passing

Return values

Stack layout

Stack directly after function prolog:

  | ...                                  |
  |--------------------------------------|
  | register save area                   | \
  | local data                           |  |
  | parameter area:                      |  |
  |   stack parameters:                  |  | caller’s frame
  |     last arg                         |  |
  |     ...                              |  |
  |     5th word of arg data             | /
  |--------------------------------------|
  |   spill area (if needed):            | \
  |     r3                               |  |
  |     r2                               |  |
  |     r1                               |  |
  |     r0                               |  | current frame
  | register save area                   |  |
  |   (with return address)              |  |
  | local data                           |  |
  | parameter area                       | /
  |--------------------------------------|
  | ...                                  |
Figure 15: Stack layout on arm32 thumb mode

EABI (ARM and THUMB mode)

The ARM EABI is very similar to the ABI outlined in the ARM-THUMB procedure call standard (ATPCS) [18]. However, the EABI requires the stack to be 8-byte aligned at function entries, and it also requires 64-bit parameters to be aligned: on 8-byte boundaries when passed on the stack, and on 2-register boundaries (i.e. starting at an even-numbered register) when passed via registers. In order to achieve such an alignment, a register might have to be skipped for parameters passed via registers, or 4 bytes on the stack for parameters passed via the stack. Refer to the Debian ARM EABI port wiki for more information [23].
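The skipping rule can be illustrated with a small simulation. The sketch below only handles 4-byte and 8-byte integer arguments, assumes the stack argument area itself starts at an 8-byte aligned address, and uses hypothetical names (`eabi_state`, `eabi_slot`, `eabi_assign`); it is not how dyncall implements the convention.

```c
#include <assert.h>
#include <stddef.h>

/* Where one argument ended up. */
struct eabi_slot {
    int reg;           /* first core register (0..3), or -1 if on the stack */
    int stack_offset;  /* byte offset in the stack argument area, or -1 */
};

/* Running state while assigning arguments from left to right. */
struct eabi_state {
    int next_reg;      /* next free core register, 4 means none left */
    int next_stack;    /* next free byte offset in the stack area */
};

/* Assign one integer argument of `size` bytes (4 or 8 only). */
static struct eabi_slot eabi_assign(struct eabi_state *st, int size)
{
    struct eabi_slot slot = { -1, -1 };
    assert(st != NULL && (size == 4 || size == 8));

    if (size == 8 && (st->next_reg & 1))
        st->next_reg++;                 /* skip a register to reach an even one */

    if (st->next_reg + size / 4 <= 4) {
        slot.reg = st->next_reg;        /* r0..r3, or a pair r0:r1 / r2:r3 */
        st->next_reg += size / 4;
        return slot;
    }

    st->next_reg = 4;                   /* no more register arguments */
    if (size == 8)
        st->next_stack = (st->next_stack + 7) & ~7;  /* skip 4 bytes if needed */
    slot.stack_offset = st->next_stack;
    st->next_stack += size;
    return slot;
}
```

For a hypothetical signature (int, long long, int, long long), the first argument lands in r0, the second skips r1 and occupies r2:r3, the third goes to stack offset 0, and the fourth skips 4 bytes and is placed at stack offset 8.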

ARM on Apple’s iOS (Darwin) Platform (ARM and THUMB mode)

iOS runs on ARMv6 (iOS 2.0) and ARMv7 (iOS 3.0) architectures. Both ARM and THUMB modes are available; code is usually compiled in THUMB mode.

Register usage

Name Alias Brief description



r0 parameter 0, scratch, return value
r1 parameter 1, scratch, return value
r2,r3 parameters 2 and 3, scratch
r4-r6 permanent
r7 frame pointer, permanent
r8 permanent
r9 permanent (iOS 2.0) / scratch (since iOS 3.0)
r10-r11 permanent
r12 scratch, intra-procedure scratch register (IP) used by dynamic linker
r13 sp stack pointer, permanent
r14 lr link register, permanent
r15 pc program counter (note: due to pipeline, r15 points to 2 instructions ahead)
cpsr program status register
d0-d7 scratch, aliases s0-s15, on ARMv7 also as q0-q3; not accessible from Thumb mode on ARMv6
d8-d15 permanent, aliases s16-s31, on ARMv7 also as q4-q7; not accessible from Thumb mode on ARMv6
d16-d31 only available in ARMv7, aliases q8-q15
fpscr VFP status register
Table 26: Register usage on ARM Apple iOS
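The note on r15 in the table can be made concrete with a little arithmetic. The sketch below is a simplified model, assuming 4-byte instructions in ARM mode and 2-byte instructions in classic THUMB mode; the names CpuMode and pc_as_read are hypothetical, and special cases (such as the word-aligned pc used by some THUMB instructions) are ignored.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { MODE_ARM, MODE_THUMB } CpuMode;  /* hypothetical names */

/* Value seen when the instruction at instr_addr reads r15 (simplified). */
static uint32_t pc_as_read(uint32_t instr_addr, CpuMode mode)
{
    uint32_t instr_size = (mode == MODE_ARM) ? 4u : 2u;  /* bytes per instruction */
    assert(instr_addr % instr_size == 0);                /* instructions are aligned */
    return instr_addr + 2u * instr_size;                 /* two instructions ahead */
}
```

Under these assumptions, an ARM mode instruction at 0x8000 reads 0x8008 from r15, whereas a THUMB mode instruction at the same address reads 0x8004.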
Parameter passing and Return values

The ABI is based on the AAPCS but with the following important differences:

Stack layout

Stack directly after function prolog:

...
caller’s frame:
    register save area
    local data
    parameter area:
        stack parameters:
            last arg
            ...
            5th word of arg data
current frame:
    parameter area (continued):
        spill area (if needed):
            r3
            r2
            r1
            r0
    register save area (with return address)
    local data
    parameter area:
        ...
Figure 16: Stack layout on arm32 (Apple)

ARM hard float (armhf)

Most Debian-based Linux systems on ARMv7 (or ARMv6 with FPU) platforms use a calling convention referred to as armhf, which uses sixteen 32-bit floating point registers of the FPU provided by the VFPv3-D16 extension to the ARM architecture. Refer to the Debian wiki for more information [24].

Code is little-endian; the rest is similar to EABI (8-byte aligned stack, etc.).

Register usage

Name Alias Brief description



r0 a1 parameter 0, scratch, non floating point return value
r1 a2 parameter 1, scratch, non floating point return value
r2,r3 a3,a4 parameters 2 and 3, scratch
r4-r9 v1-v6 permanent
r10 sl permanent
r11 fp frame pointer, permanent
r12 ip scratch, intra-procedure scratch register (IP) used by dynamic linker
r13 sp stack pointer, permanent
r14 lr link register, permanent
r15 pc program counter (note: due to pipeline, r15 points to 2 instructions ahead)
cpsr program status register
s0 floating point argument, floating point return value, single precision
d0 floating point argument, floating point return value, double precision, aliases s0-s1
s1-s15 floating point arguments, single precision
d1-d7 aliases s2-s15, floating point arguments, double precision
fpscr VFP status register
Table 27: Register usage on armhf
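Because each double precision register overlays two single precision registers (d0 aliases s0-s1, and so on), allocating float and double arguments interacts in a non-obvious way. The following sketch models this with a bitmap of s0-s15. It follows the allocation rule of the AAPCS VFP variant as we understand it (lowest free register of the right type, which allows a later float to back-fill a skipped register); this rule is an assumption not spelled out in this section, and the names VfpState, alloc_float and alloc_double are hypothetical.

```c
#include <assert.h>

/* Hypothetical model: bit i of 'used' is set when s<i> is taken. */
typedef struct {
    unsigned used;   /* bitmap over s0..s15 */
    int exhausted;   /* set once a float argument had to go to the stack */
} VfpState;

/* Returns the s register number for a float, or -1 for "on the stack". */
static int alloc_float(VfpState *st)
{
    assert(st != 0);
    if (!st->exhausted) {
        for (int s = 0; s < 16; ++s) {
            if (!(st->used & (1u << s))) {
                st->used |= 1u << s;
                return s;
            }
        }
    }
    st->exhausted = 1;
    return -1;
}

/* Returns the d register number for a double, or -1 for "on the stack". */
static int alloc_double(VfpState *st)
{
    assert(st != 0);
    if (!st->exhausted) {
        for (int d = 0; d < 8; ++d) {
            unsigned pair = 3u << (2 * d);   /* d<n> overlays s<2n>, s<2n+1> */
            if (!(st->used & pair)) {
                st->used |= pair;
                return d;
            }
        }
    }
    st->exhausted = 1;
    return -1;
}
```

For the argument sequence (float, double, float), this model yields s0, d1 and s1: the double cannot use d0 because s0 is taken, and the second float back-fills s1.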
Parameter passing

Return values

Stack layout

Stack directly after function prolog:

...
caller’s frame:
    register save area
    local data
    parameter area:
        stack parameters:
            last arg
            ...
            first arg passed via stack
current frame:
    parameter area (continued):
        spill area (if needed):
            r3
            r2
            r1
            r0
    register save area (with return address)
    local data
    parameter area:
        ...
Figure 17: Stack layout on arm32 armhf

Architectures

The ARM architecture family contains several revisions with different capabilities and extensions (such as thumb-interworking, more vector registers, ...). The following table sums up the most important properties of the various architecture standards from a calling convention perspective.

Arch | Platforms | Details
ARMv4 | |
ARMv4T | ARM 7, ARM 9, Neo FreeRunner (OpenMoko) |
ARMv5 | ARM 9E | BLX instruction available
ARMv6 | | No vector registers available in thumb
ARMv7 | iPod touch, iPhone 3GS/4, Raspberry Pi 2 | VFP, armhf convention on some platforms
ARMv8 | iPhone 6 and higher | 64bit support
Table 28: Overview of ARM Architecture, Platforms and Details

ARM64 Calling Conventions

Overview

ARMv8 introduced the AArch64 calling convention. ARM64 chips can run in 64-bit or 32-bit mode, but not within the same process. Interworking is therefore only possible between processes (inter-process).
The word size is defined to be 32 bits, a dword 64 bits. Note that this is due to historical reasons (terminology didn’t change from ARM32).
For more details, take a look at the Procedure Call Standard for the ARM 64-bit Architecture [20].
dyncall support

The dyncall library supports the ARM 64-bit AArch64 PCS ABI, as well as Apple’s and Microsoft’s conventions which are derived from it, for both calls and callbacks.

AAPCS64 Calling Convention

Registers and register usage

ARM64 features thirty-one 64-bit general purpose registers, r0-r30, which are referred to as x0-x30 for 64-bit access or w0-w30 for 32-bit access (with the upper bits either cleared or sign extended on load).
Also, there is sp/xzr/wzr, a register with restricted use: it acts as the stack pointer (sp) in instructions dealing with the stack and as a hardware zero register (xzr/wzr) in all other instructions. pc is the program counter. Additionally, there are thirty-two 128-bit registers v0-v31, used as SIMD and floating point registers and referred to as q0-q31, d0-d31 or s0-s31, depending on their use (in contrast to AArch32, those do not overlap multiple narrower registers):

Name Brief description


x0-x7 parameters, scratch, return value
x8 indirect result location pointer
x9-x15 scratch
x16 permanent in some cases, can have special function (IP0), see doc
x17 permanent in some cases, can have special function (IP1), see doc
x18 reserved as platform register, advised not to be used for handwritten, portable asm, see doc
x19-x28 permanent
x29 permanent, frame pointer
x30 permanent, link register
sp permanent, stack pointer
pc program counter
v0-v7 scratch, float parameters, return value
v8-v15 lower 64 bits are permanent, upper 64 bits are scratch
v16-v31 scratch
xzr zero register, always zero
Table 29: Register usage on arm64
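The relation between the 64-bit x view and the 32-bit w view of a general purpose register (upper bits cleared or sign extended) can be modeled in portable C. This is a sketch only; the function names are hypothetical and the code models the register contents, it does not access real registers.

```c
#include <assert.h>
#include <stdint.h>

/* Writing a value through the w view clears bits 63..32 of the x view. */
static uint64_t x_after_w_write(uint32_t w_value)
{
    return (uint64_t)w_value;
}

/* A sign-extending 32-bit load copies bit 31 into bits 63..32. */
static uint64_t x_after_signed_load(int32_t value)
{
    return (uint64_t)(int64_t)value;
}

/* Reading the w view returns the low 32 bits of the x view. */
static uint32_t w_view(uint64_t x_value)
{
    return (uint32_t)(x_value & 0xFFFFFFFFu);
}

/* Small self-check tying the three helpers together. */
static void extension_example(void)
{
    uint64_t x = x_after_signed_load(-1);
    assert(x == UINT64_MAX);                       /* all 64 bits set */
    assert(w_view(x) == UINT32_MAX);               /* low half only */
    assert(x_after_w_write(w_view(x)) == 0x00000000FFFFFFFFull);
}
```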
Parameter passing

Return values

Stack layout

Stack directly after function prolog:

...
caller’s frame:
    register save area
    local data
    parameter area:
        stack parameters:
            arg n-1
            ...
            arg 8
current frame:
    parameter area (continued):
        spill area (if needed):
            x7
            ...
            x? (first unnamed reg)
            q7
            ...
            q0
    register save area (with return address)
    local data
    parameter area:
        ...
Figure 18: Stack layout on arm64

Apple’s ARM64 Function Calling Convention

Overview

Apple’s ARM64 calling convention is based on the AAPCS64 standard but diverges from it in some ways. Only the differences are listed here; for more details, take a look at Apple’s official documentation [21].

Microsoft’s ARM64 Function Calling Convention

Overview

Microsoft’s ARM64 calling convention is based on the AAPCS64 standard but diverges from it for variadic functions. Only the differences are listed here; for more details, take a look at Microsoft’s official documentation [22].

MIPS32 Calling Conventions

Overview

Multiple revisions of the MIPS Instruction set exist, namely MIPS I, MIPS II, MIPS III, MIPS IV, MIPS32 and MIPS64. Nowadays, MIPS32 and MIPS64 are the main ones used for 32-bit and 64-bit instruction sets, respectively.
Given MIPS processors are often used for embedded devices, several add-on extensions exist for the MIPS family, for example:

MIPS-3D
simple floating-point SIMD instructions dedicated to common 3D tasks.
MDMX
(MaDMaX) more extensive integer SIMD instruction set using 64 bit floating-point registers.
MIPS16e
adds compression to the instruction stream to make programs take up less room (allegedly a response to the THUMB instruction set of the ARM architecture).
MIPS MT
multithreading additions to the system similar to HyperThreading.

Unfortunately, there is actually no such thing as “The MIPS Calling Convention”. Many possible conventions are used by many different environments, such as O32[38], O64[39], N32[40], N64[40], EABI[41] and NUBI[42].
dyncall support

For MIPS 32-bit architectures, dyncall currently supports the widely-used O32 calling convention (for all four combinations of big/little-endian, and soft/hard-float targets), as well as EABI (little-endian/hard-float, which is used on the Homebrew SDK for the Playstation Portable). dyncall currently does not support MIPS16e (contrary to the like-minded ARM-THUMB, which is supported). Both calls and callbacks are supported.

MIPS EABI 32-bit Calling Convention

Register usage
Name Alias Brief description



$0 $zero hardware zero, scratch
$1 $at assembler temporary, scratch
$2-$3 $v0-$v1 integer results, scratch
$4-$11 $a0-$a7 integer arguments, or double precision float arguments, scratch
$12-$15,$24 $t4-$t7,$t8 integer temporaries, scratch
$25 $t9 integer temporary, address of callee for PIC calls (by convention), scratch
$16-$23 $s0-$s7 preserve
$26,$27 $kt0,$kt1 reserved for kernel
$28 $gp global pointer, preserve
$29 $sp stack pointer, preserve
$30 $s8/$fp frame pointer (some assemblers name it $fp), preserve
$31 $ra return address, preserve
hi, lo multiply/divide special registers
$f0,$f2 float results, scratch
$f1,$f3,$f4-$f11,$f20-$f23 float temporaries, scratch
$f12-$f19 single precision float arguments, scratch
Table 30: Register usage on MIPS32 EABI calling convention
Parameter passing

Return values

Stack layout

Stack directly after function prolog:

...
caller’s frame:
    register save area
    local data
    parameter area:
        stack parameters:
            last arg
            ...
            first arg passed via stack
current frame:
    parameter area (continued):
        spill area (if needed):
            $a7
            ...
            $a? (first unnamed reg)
    register save area (with return address)
    local data
    parameter area:
        ...
Figure 19: Stack layout on MIPS EABI 32-bit calling convention

MIPS O32 32-bit Calling Convention

Register usage
Name Alias Brief description



$0 $zero hardware zero
$1 $at assembler temporary
$2-$3 $v0-$v1 return value (only integer on hard-float targets), scratch
$4-$7 $a0-$a3 first arguments (only integer on hard-float targets), scratch
$8-$15,$24 $t0-$t7,$t8 temporaries, scratch
$25 $t9 temporary, holds address of called function for PIC calls (by convention)
$16-$23 $s0-$s7 preserved
$26,$27 $k0,$k1 reserved for kernel
$28 $gp global pointer, preserved by caller
$29 $sp stack pointer, preserve
$30 $s8/$fp frame pointer (some assemblers name it $fp), preserve
$31 $ra return address, preserve
hi, lo multiply/divide special registers
$f0-$f3 only on hard-float targets: float return value, scratch
$f4-$f11,$f16-$f19 only on hard-float targets: float temporaries, scratch
$f12-$f15 only on hard-float targets: first floating point arguments, scratch
$f20-$f31 only on hard-float targets: preserved
Table 31: Register usage on MIPS O32 calling convention
Parameter passing

Return values

Stack layout

Stack directly after function prolog:

...
caller’s frame:
    register save area (with return address)
    local data (and padding)
    parameter area:
        padding (if needed)
        stack parameters:
            last arg
            ...
            first arg passed via stack
        spill area:
            $a3
            $a2
            $a1
            $a0
current frame:
    register save area
    local data
    parameter area:
        ...
Figure 20: Stack layout on MIPS O32 calling convention
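Unlike the other layouts, the spill area in Figure 20 is not marked "if needed", and the parameter area may contain padding. A minimal sketch of how the size of the caller's parameter area could be computed is shown below. It assumes a 16-byte spill area for $a0-$a3 that is always present and an 8-byte stack alignment; both constants and the function name o32_arg_area_size are our own labels, not taken from dyncall.

```c
#include <assert.h>
#include <stddef.h>

enum {
    O32_SPILL_BYTES = 16,  /* assumption: room for $a0-$a3 is always reserved */
    O32_STACK_ALIGN = 8    /* assumption: parameter area is 8-byte aligned */
};

/* Size in bytes of the parameter area for arg_bytes bytes of arguments. */
static size_t o32_arg_area_size(size_t arg_bytes)
{
    size_t size;
    assert(arg_bytes % 4 == 0);   /* arguments occupy whole 32-bit words */

    /* Never smaller than the spill area, even for fewer than four words. */
    size = arg_bytes < O32_SPILL_BYTES ? (size_t)O32_SPILL_BYTES : arg_bytes;

    /* Round up to the alignment; the difference is the padding. */
    return (size + O32_STACK_ALIGN - 1) & ~(size_t)(O32_STACK_ALIGN - 1);
}
```

With these assumptions, a call passing a single int still reserves 16 bytes, and a call passing five words (20 bytes) reserves 24 bytes, 4 of which are padding.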

MIPS64 Calling Conventions

Overview

There are two main ABIs in use for MIPS64 chips, N64[40] and N32[40]. Both are basically the same, except that N32 uses ILP32 as programming model (32-bit pointers and long integers), whereas N64 uses LP64 (64-bit pointers and long integers). All registers of a MIPS64 chip are considered to be 64-bit wide, even for the N32 calling convention.
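The difference between the two programming models comes down to the sizes of long and pointer types. The following sketch classifies a set of type sizes accordingly; the names DataModel, classify_model and this_build_model are hypothetical, and the code is plain portable C rather than anything MIPS-specific.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { MODEL_ILP32, MODEL_LP64, MODEL_OTHER } DataModel;  /* hypothetical */

/* Classify a programming model from the sizes (in bytes) of int, long, pointer. */
static DataModel classify_model(size_t int_size, size_t long_size, size_t ptr_size)
{
    assert(int_size > 0 && long_size > 0 && ptr_size > 0);
    if (int_size == 4 && long_size == 4 && ptr_size == 4)
        return MODEL_ILP32;   /* as used by N32 */
    if (int_size == 4 && long_size == 8 && ptr_size == 8)
        return MODEL_LP64;    /* as used by N64 */
    return MODEL_OTHER;
}

/* Model of the toolchain this file is compiled with. */
static DataModel this_build_model(void)
{
    return classify_model(sizeof(int), sizeof(long), sizeof(void *));
}
```

Compiled for an N32 target, this_build_model() would be expected to report MODEL_ILP32, and for an N64 target MODEL_LP64, even though the registers are 64 bits wide in both cases.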
The word size is defined to be 32 bits, a dword 64 bits. Note that this is due to historical reasons (terminology didn’t change from MIPS32).
Other than that, there are corresponding 64-bit versions of other MIPS32 ABIs, e.g. the EABI[41] and O64[39].
dyncall support

For MIPS 64-bit machines, dyncall supports the N64 calling convention for calls and callbacks (for all four combinations of big/little-endian, and soft/hard-float targets). The N32 calling convention might work - it used to, but hasn’t been tested recently.

MIPS N64 Calling Convention

Register usage
Name Alias Brief description



$0 $zero hardware zero
$1 $at assembler temporary, scratch
$2-$3 $v0-$v1 return value (only integers on hard-float targets), scratch
$4-$11 $a0-$a7 first arguments (only integers on hard-float targets), scratch
$12-$15,$24 $t4-$t7,$t8 temporaries, scratch
$25 $t9 temporary, address of callee for all PIC calls (by convention), scratch
$16-$23 $s0-$s7 preserve
$26,$27 $kt0,$kt1 reserved for kernel
$28 $gp global pointer, preserve
$29 $sp stack pointer, preserve
$30 $s8 frame pointer, preserve
$31 $ra return address, preserve
hi, lo multiply/divide special registers
$f0,$f2 only on hard-float targets: float return values, scratch
$f1,$f3,$f4-$f11 only on hard-float targets: float temporaries, scratch
$f12-$f19 only on hard-float targets: float arguments, scratch
$f20-$f23 only on hard-float targets: float temporaries, scratch
$f24-$f31 only on hard-float targets: preserved
Table 32: Register usage on MIPS N64 calling convention
Parameter passing

Return values

Stack layout

Stack directly after function prolog:

...
caller’s frame:
    register save area
    local data
    parameter area:
        stack parameters:
            arg n-1
            ...
            arg 8
current frame:
    parameter area (continued):
        spill area (if needed):
            $a7
            ...
            $a? (first unnamed reg)
    register save area (with return address)
    local data
    parameter area:
        ...
Figure 21: Stack layout on MIPS N64 calling convention

MIPS N32 Calling Convention

Despite what one might think given the name, this is a MIPS 64-bit calling convention. As mentioned in the overview of this chapter, it is nearly identical to the N64 one, the differences being:

SPARC Calling Conventions

Overview

The SPARC family of processors is based on the SPARC instruction set architecture, which comes in basically three revisions, V7, V8[29][27] and V9[30][28]. The former two are 32-bit whereas the latter refers to the 64-bit SPARC architecture (see next chapter). SPARC uses big endian byte order.
The word size is defined to be 32 bits.
dyncall support

dyncall fully supports the SPARC 32-bit instruction set (V7 and V8), for calls and callbacks.

SPARC (32-bit) Calling Convention

Register usage
Name Alias Brief description



%g0 %r0 Read-only, hardwired to 0
%g1-%g7 %r1-%r7 Global
%o0,%o1 and %i0,%i1 %r8,%r9 and %r24,%r25 Output and input argument registers, return value
%o2-%o5 and %i2-%i5 %r10-%r13 and %r26-%r29 Output and input argument registers
%o6 and %i6 %r14 and %r30, %sp and %fp Stack and frame pointer
%o7 and %i7 %r15 and %r31 Return address (caller writes to o7, callee uses i7)
%l0-%l7 %r16-%r23 preserve
%f0,%f1 Floating point return value
%f2-%f31 scratch
Table 33: Register usage on sparc calling convention
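The pairing of output and input registers in the table (the caller writes %o registers, the callee sees them as %i registers) is a consequence of SPARC's overlapping register windows. The following is a simplified model of that overlap, not dyncall code: the type RegWindows, the helper names and the window count of 8 are assumptions made for illustration, globals are left out, and window overflow/underflow traps are not modeled.

```c
#include <assert.h>
#include <stdint.h>

enum { NWINDOWS = 8, REGS_PER_WINDOW = 16 };   /* hypothetical sizes */

typedef struct {
    uint32_t file[NWINDOWS * REGS_PER_WINDOW]; /* 8 outs + 8 locals per window */
    int cwp;                                   /* current window pointer */
} RegWindows;

/* %o<n> of the current window. */
static uint32_t *out_reg(RegWindows *rw, int n)
{
    assert(n >= 0 && n < 8);
    return &rw->file[rw->cwp * REGS_PER_WINDOW + n];
}

/* %l<n> of the current window, private to it. */
static uint32_t *local_reg(RegWindows *rw, int n)
{
    assert(n >= 0 && n < 8);
    return &rw->file[rw->cwp * REGS_PER_WINDOW + 8 + n];
}

/* %i<n> of the current window, the same storage as the caller's %o<n>. */
static uint32_t *in_reg(RegWindows *rw, int n)
{
    assert(n >= 0 && n < 8);
    return &rw->file[((rw->cwp + 1) % NWINDOWS) * REGS_PER_WINDOW + n];
}

/* Model of 'save' in a function prolog: switch to a fresh window. */
static void window_save(RegWindows *rw)
{
    rw->cwp = (rw->cwp + NWINDOWS - 1) % NWINDOWS;
}

/* Model of 'restore' on return: switch back to the caller's window. */
static void window_restore(RegWindows *rw)
{
    rw->cwp = (rw->cwp + 1) % NWINDOWS;
}
```

In this model, a value the caller stores through out_reg(rw, 0) is visible to the callee through in_reg(rw, 0) after window_save, and a return value written to in_reg(rw, 0) shows up in the caller's %o0 after window_restore.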
Parameter passing

Return values

Stack layout

Stack directly after function prolog:

...
caller’s frame:
    local data (and padding)
    parameter area:
        stack parameters:
            arg n-1
            ...
            7th word of arg data
        spill area:
            %o5
            ...
            %o0
        struct/union return pointer
current frame:
    register save area (%i* and %l*)
    local data (and padding)
    parameter area:
        ...
Figure 22: Stack layout on sparc32 calling convention

SPARC64 Calling Conventions

Overview

The SPARC family of processors is based on the SPARC instruction set architecture, which comes in basically three revisions, V7, V8[29][27][31] and V9[30][28][31]. The former two are 32-bit (see previous chapter) whereas the latter refers to the 64-bit SPARC architecture. SPARC uses big endian byte order; V9 also supports little endian byte order, but only for data access, not for instruction access.
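To illustrate what the two byte orders mean for data access, the following sketch decodes the same four bytes both ways. The helper names load_be32 and load_le32 are hypothetical, and the code is portable C that works on any host, not SPARC-specific.

```c
#include <assert.h>
#include <stdint.h>

/* Interpret 4 bytes as a big endian 32-bit word (most significant byte first). */
static uint32_t load_be32(const unsigned char *p)
{
    assert(p != 0);
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Interpret 4 bytes as a little endian 32-bit word (least significant byte first). */
static uint32_t load_le32(const unsigned char *p)
{
    assert(p != 0);
    return ((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16) |
           ((uint32_t)p[1] << 8)  |  (uint32_t)p[0];
}
```

For the byte sequence 0x12 0x34 0x56 0x78, load_be32 yields 0x12345678 and load_le32 yields 0x78563412.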

There are two proposals, one from Sun and one from Hal, which disagree on how to handle some aspects of this calling convention.
dyncall support

dyncall fully supports the SPARC 64-bit instruction set (V9), for calls and callbacks.

SPARC (64-bit) Calling Convention

Name Alias Brief description



%g0 %r0 Read-only, hardwired to 0
%g1-%g7 %r1-%r7 Global
%o0-%o3 and %i0-%i3 %r8-%r11 and %r24-%r27 Output and input argument registers, return value
%o4,%o5 and %i4,%i5 %r12,%r13 and %r28,%r29 Output and input argument registers
%o6 and %i6 %r14 and %r30, %sp and %fp Stack and frame pointer (NOTE, offset with a BIAS of 2047)
%o7 and %i7 %r15 and %r31 Return address (caller writes to o7, callee uses i7)
%l0-%l7 %r16-%r23 preserve
%d0,%d2,%d4,%d6 scratch, Floating point arguments, return value
%d8,%d10,...,%d14 scratch, Floating point arguments
%d16,%d18,...,%d30 scratch (preserve for Hal), Floating point arguments
%d32,%d34,...,%d62 scratch (preserve for Hal)
Table 34: Register usage on sparc64 calling convention
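The BIAS mentioned for %sp and %fp means that the register value is not the stack address itself. The sketch below shows the conversion in both directions, assuming the bias of 2047 from the table and a 16-byte aligned stack address; the alignment and the helper names are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

enum { SPARC64_STACK_BIAS = 2047 };

/* Address actually referred to by a biased %sp or %fp value. */
static uint64_t biased_to_address(uint64_t reg_value)
{
    return reg_value + SPARC64_STACK_BIAS;
}

/* Register value to store for a given (assumed 16-byte aligned) stack address. */
static uint64_t address_to_biased(uint64_t address)
{
    assert(address % 16 == 0);
    return address - SPARC64_STACK_BIAS;
}
```

Under these assumptions, a frame located at address 0x10000800 is represented by the register value 0x10000001, which explains why %sp and %fp typically hold odd values on sparc64.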
Parameter passing

Return values

Stack layout

Stack directly after function prolog:

...
caller’s frame:
    local data (and padding)
    parameter area:
        stack parameters:
            arg n-1
            ...
            arg 6
        spill area:
            %o5
            ...
            %o0
current frame:
    register save area (%i* and %l*)
    local data (and padding)
    parameter area:
        ...
Figure 23: Stack layout on sparc64 calling convention

