Go Assembly and ABI

The cover image is from Renee French and follows the Creative Commons 4.0 Attributions license

Registers#

Plan9	amd64	Name	Purpose
AX	rax	Accumulator	Stores arithmetic operands and return values
BX	rbx	Base Register	Stores memory base addresses (structures or arrays) or pointers
CX	rcx	Count Register	Count operations such as loop counters
DX	rdx	Data Register	Stores data such as multipliers/divisors
DI	rdi	Destination Index	Offset of the destination operand
SI	rsi	Source Index	Offset of the source operand
BP	rbp	Base Pointer	Saves stack base address
SP	rsp	Stack Pointer	Saves stack top pointer
PC	rip	Program Counter	Program counter
R8-R14	r8-r14		General registers

Pseudo Registers#

Name	Purpose
FP(Frame pointer)	Base address for parameters and local variables
PC(Program counter)	Program counter
SB(Static base pointer)	Base address for global variables
SP(Stack pointer)	Stack pointer (highest address of the current stack frame)

All user-defined symbols are written as offsets from the pseudo registers FP (parameters and local variables) and SB (global variables).

The pseudo register SB is used to reference global variables such as:

foo(SB) indicates the memory address of the global variable foo
foo<>(SB) indicates that the global variable foo is only visible in the current file
foo+4(SB) indicates the memory base address of foo plus an offset of four bytes

The pseudo register FP is a virtual frame pointer used to reference function parameters. The compiler accesses the parameters of the current function as offsets from FP. Each reference must also carry the parameter name. The name has no effect on the generated code, but it documents the access, and the assembler rejects FP references that lack one. For example:

first_arg+0(FP) refers to the first parameter of the current function (a plain 0(FP) is rejected)
second_arg+8(FP) refers to the second parameter of the current function, assuming the first parameter occupies 8 bytes (a plain 8(FP) is rejected)

Note: FP is a pseudo register regardless of the existence of a hardware FP register.
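
As a rough illustration of where the name+offset(FP) operands come from, the sketch below models the stack-passed parameters of a hypothetical function func add(a int64, b int32, c int64) as a struct and prints the field offsets. It assumes the stack-based convention used by hand-written assembly, where parameters are laid out in declaration order with struct-like alignment. The function, the struct, and the helper fpRefs are invented for this example.

```go
package main

import (
	"fmt"
	"unsafe"
)

// addArgs mirrors the parameters of a hypothetical assembly function
// add(a int64, b int32, c int64).
// Assumption: stack-passed parameters are laid out in declaration order
// with the same alignment rules as struct fields.
type addArgs struct {
	a int64
	b int32
	c int64
}

// fpRefs returns the name+offset(FP) operand for each parameter.
func fpRefs() []string {
	var args addArgs
	return []string{
		fmt.Sprintf("a+%d(FP)", unsafe.Offsetof(args.a)),
		fmt.Sprintf("b+%d(FP)", unsafe.Offsetof(args.b)),
		fmt.Sprintf("c+%d(FP)", unsafe.Offsetof(args.c)),
	}
}

func main() {
	// On amd64 this prints [a+0(FP) b+8(FP) c+16(FP)]; c is pushed to
	// offset 16 because int64 is 8-byte aligned there.
	fmt.Println(fpRefs())
}
```

In practice go vet performs this check against the Go prototype, so the offsets rarely need to be computed by hand.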

The pseudo register SP is a virtual stack pointer used to access local variables and the arguments being prepared for function calls within the current stack frame. It points to the highest address of the current stack frame, so offsets must lie in the range [−framesize, 0), for example x-8(SP) and y-4(SP). On architectures with a hardware register named SP, the presence of a symbol name prefix distinguishes the two:

x-8(SP) refers to the pseudo register SP (symbol name prefix present)
-8(SP) refers to the hardware register SP (no symbol name prefix)

Symbol Definition#

In Go's object files and binaries, the complete symbol name consists of the package path followed by a dot and a symbol name, such as math/rand.Int. Because the assembler's parser treats the slash and dot as punctuation, assembly source spells this name as math∕rand·Int, using the division slash U+2215 and the middle dot U+00B7, and the assembler rewrites them to a plain slash and period. When defining symbols by hand in assembly, the full package path is usually unnecessary: the linker inserts the package path of the current object file in front of every name that starts with a period, so it is sufficient to write ·Int.
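
The character rewriting described above can be sketched in a few lines of Go. The helper asmToObjectName is a hypothetical name for this example, and it only models the character substitution, not the package path that the linker inserts for names starting with a period.

```go
package main

import (
	"fmt"
	"strings"
)

// asmToObjectName rewrites the assembly spelling of a symbol into the
// spelling used in object files: the middle dot U+00B7 becomes '.' and
// the division slash U+2215 becomes '/'.
func asmToObjectName(sym string) string {
	return strings.NewReplacer("\u00b7", ".", "\u2215", "/").Replace(sym)
}

func main() {
	fmt.Println(asmToObjectName("math\u2215rand\u00b7Int")) // math/rand.Int
	fmt.Println(asmToObjectName("\u00b7Int"))               // .Int
}
```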

The assembler uses directives to bind code or data to symbols, as described in the following sections.

Function Symbol Definition#

Functions (code segments) are defined using the TEXT directive, such as:

TEXT runtime·profileloop(SB),NOSPLIT,$24-8

runtime (pkgname): Package name; can be omitted, in which case the linker fills in the current package path
profileloop(SB) (funcname): Function name; a function is itself a global symbol, so it is referenced relative to SB
NOSPLIT: Flag argument of the directive, described in a later section
$24-8: Frame size of 24 bytes followed by an argument size (parameters + return values) of 8 bytes. The minus sign is a separator, not a subtraction. The argument size must be provided unless NOSPLIT is specified
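
To make the reading of the size annotation concrete, here is a small sketch that splits a string such as $24-8 into its two numbers. The parseFrame helper is hypothetical and only mirrors the notation; it is not how the assembler itself parses the directive.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// parseFrame splits a TEXT size annotation such as "$24-8" into the local
// frame size and the argument size (parameters + return values), in bytes.
// The '-' is treated purely as a separator.
func parseFrame(s string) (frame, args int, err error) {
	parts := strings.SplitN(strings.TrimPrefix(s, "$"), "-", 2)
	if len(parts) != 2 {
		return 0, 0, errors.New("missing argument size")
	}
	if frame, err = strconv.Atoi(parts[0]); err != nil {
		return 0, 0, err
	}
	if args, err = strconv.Atoi(parts[1]); err != nil {
		return 0, 0, err
	}
	return frame, args, nil
}

func main() {
	frame, args, err := parseFrame("$24-8")
	fmt.Println(frame, args, err) // 24 8 <nil>
}
```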

Global Data Symbol Definition#

Global data is defined using a set of DATA directives followed by a GLOBL directive. The format of the DATA directive is:

DATA symbol+offset(SB)/width, value

This initializes width bytes of memory at the given offset within symbol to value. The DATA directives for a given symbol must be written with increasing offsets. The GLOBL directive declares a symbol as global and takes the symbol name, optional flags, and the size of the data. It must follow any corresponding DATA directives, and any memory not initialized by a DATA directive is zero. For example:

DATA divtab<>+0x00(SB)/4, $0xf4f8fcff
DATA divtab<>+0x04(SB)/4, $0xe6eaedf0
...
DATA divtab<>+0x3c(SB)/4, $0x81828384
GLOBL divtab<>(SB), RODATA, $64

GLOBL runtime·tlsoffset(SB), NOPTR, $4

The above code declares and initializes divtab<>, a 64-byte read-only table of 4-byte values that is visible only in the current file, and declares runtime·tlsoffset, a 4-byte global variable that is implicitly zeroed. The NOPTR flag indicates that runtime·tlsoffset contains no pointers.
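
The effect of DATA and GLOBL on a symbol's initial contents can be modeled in Go. The sketch below is an illustration only: dataDirective and buildSymbol are hypothetical names, the target is assumed to be little-endian, and only the three DATA lines shown above are applied, so the elided entries stay zero here.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// dataDirective models one "DATA sym+offset(SB)/width, $value" line.
type dataDirective struct {
	offset int
	width  int // 1, 2, 4 or 8 bytes
	value  uint64
}

// buildSymbol returns the initial contents of a symbol of the given size.
// Bytes not covered by a DATA directive stay zero, as with GLOBL.
// Assumption: little-endian target.
func buildSymbol(size int, dirs []dataDirective) []byte {
	mem := make([]byte, size)
	for _, d := range dirs {
		dst := mem[d.offset : d.offset+d.width]
		switch d.width {
		case 1:
			dst[0] = byte(d.value)
		case 2:
			binary.LittleEndian.PutUint16(dst, uint16(d.value))
		case 4:
			binary.LittleEndian.PutUint32(dst, uint32(d.value))
		case 8:
			binary.LittleEndian.PutUint64(dst, d.value)
		}
	}
	return mem
}

func main() {
	divtab := buildSymbol(64, []dataDirective{
		{0x00, 4, 0xf4f8fcff},
		{0x04, 4, 0xe6eaedf0},
		{0x3c, 4, 0x81828384},
	})
	fmt.Printf("% x\n", divtab[:8]) // ff fc f8 f4 f0 ed ea e6
}
```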

Parameters for Symbol Definition#

Each directive can take one or two arguments. If there are two, the first must be a bit mask of flags, which can be written numerically or with symbolic names. The symbolic names are defined in the header textflag.h and become available via #include "textflag.h". The flags are as follows:

DUPOK: Allows multiple identical symbols in the binary; the linker will choose one of them
NOSPLIT: Used for TEXT directives, marking that stack overflow checks do not need to be inserted
RODATA: Used for DATA and GLOBL directives, placing data in the read-only segment
NOPTR: Used for DATA and GLOBL directives, marking that data does not contain pointers and does not require GC scanning
WRAPPER: Used for TEXT directives, marking that the function is just a wrapper and should not count as disabling recover, see source code src/debug/gosym/pclntab.go
NEEDCTXT: Used for TEXT directives, marking that the function is a closure and uses its incoming context register
TLSBSS: Used for DATA and GLOBL directives, marking the allocation of a TLS storage unit whose offset is stored in the variable
NOFRAME: Used for TEXT directives, marking that no instructions for allocating stack frame space should be inserted in the function, suitable for zero-stack-frame functions
REFLECTMETHOD: Marks that the function can call reflect.Type.Method/reflect.Type.MethodByName
TOPFRAME: Used for TEXT directives, marking this function as the top of the call stack, stack unwinding should stop here
ABIWRAPPER: Used for TEXT directives, marking this function as an ABI wrapper
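
Since the flags form a bit mask, several of them can be combined in one directive argument. The sketch below shows the idea in Go. The numeric values are an assumption based on textflag.h in recent Go releases, and hasFlag is a hypothetical helper; real assembly should always use the symbolic names from the header.

```go
package main

import "fmt"

// Flag values assumed to match textflag.h in recent Go releases.
const (
	DUPOK   = 2
	NOSPLIT = 4
	RODATA  = 8
	NOPTR   = 16
	NOFRAME = 512
)

// hasFlag reports whether mask includes flag.
func hasFlag(mask, flag int) bool {
	return mask&flag != 0
}

func main() {
	// Equivalent to writing NOSPLIT|NOFRAME in a TEXT directive.
	mask := NOSPLIT | NOFRAME
	fmt.Println(mask, hasFlag(mask, NOSPLIT), hasFlag(mask, RODATA)) // 516 true false
}
```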

Using Go Types and Constants in Assembly#

If a package contains .s files, the compiler will output a special header file go_asm.h during the build process. This header file contains many constant definitions, such as struct field offsets, struct type sizes, and constants defined in the current package. In assembly, Go types can be used by including this header file. In the go_asm.h file, various types are defined in the following forms:

Constants: const_name
Struct field offsets: type_field
Struct sizes: type__size

const bufSize = 1024

type reader struct {
    buf [bufSize]byte
    r   int
}

Using the above code as an example, in assembly code, you can:

Use const_bufSize to access the constant bufSize
Use reader__size to get the size of the struct reader
Use reader_buf and reader_r to get the offsets of the buf and r fields. If R1 contains a pointer to a reader, you can access the two fields via reader_buf(R1) and reader_r(R1).
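
The values behind these names can be reproduced from Go with the unsafe package. The sketch below prints lines that resemble what go_asm.h would define for the reader type. The exact header format is an assumption, and asmDefines is a hypothetical helper.

```go
package main

import (
	"fmt"
	"unsafe"
)

const bufSize = 1024

type reader struct {
	buf [bufSize]byte
	r   int
}

// asmDefines returns lines resembling the definitions emitted for the
// constant bufSize and the struct reader (format assumed for illustration).
func asmDefines() []string {
	var rd reader
	return []string{
		fmt.Sprintf("#define const_bufSize %d", bufSize),
		fmt.Sprintf("#define reader__size %d", unsafe.Sizeof(rd)),
		fmt.Sprintf("#define reader_buf %d", unsafe.Offsetof(rd.buf)),
		fmt.Sprintf("#define reader_r %d", unsafe.Offsetof(rd.r)),
	}
}

func main() {
	for _, line := range asmDefines() {
		fmt.Println(line)
	}
}
```

On a 64-bit target reader__size comes out as 1032, since the int field adds 8 bytes after the 1024-byte buffer.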

Runtime#

To ensure the correctness of GC operations, the runtime must be aware of all pointers contained in stack frames and global variables. The compiler emits this information automatically when compiling Go code, but in assembly code it must be stated explicitly. A data symbol marked NOPTR is treated as containing no pointers to runtime-allocated data. A data symbol marked RODATA is allocated in read-only memory and is therefore implicitly NOPTR. A data symbol smaller than a pointer is also implicitly NOPTR, since it cannot hold one. Symbols containing pointers cannot be defined in assembly code, but they can be defined in Go code and referenced by name from assembly. Generally, the best practice is to define all non-read-only symbols in Go rather than in assembly code.

Each function needs to annotate the locations of live pointers in its parameters, return values, and stack frame. If an assembly function has no pointer return values and either has no local stack frame or makes no function calls, it is sufficient to define the Go function prototype (signature) in the same package. For more complex situations, the funcdata.h header file needs to be included to reference pseudo-assembly directives for explicit annotation. Functions without parameters and return values (annotated as $n-0 in the TEXT directive) can omit pointer information. Otherwise, pointer information must be provided through a function prototype (signature) in Go code, even for assembly functions that are not directly called by Go functions.

At the beginning of a function, it can be assumed that parameters have been initialized, but return values are uninitialized. If the return values will hold live pointers during a CALL instruction, the function should start by zeroing the return values and then execute the GO_RESULTS_INITIALIZED pseudo-instruction, which records that the return values have been initialized and should be scanned during stack movement (growth) and GC. In most cases, it is easier to arrange that assembly functions do not return pointers or do not contain CALL instructions; no assembly functions in the standard library use GO_RESULTS_INITIALIZED.

If a function has no local stack frame (i.e., declared as $0-n in the TEXT directive) or does not contain CALL instructions, pointer information can be omitted. Otherwise, the local stack frame must not contain pointers, and the assembly code must confirm this by executing the pseudo-instruction NO_LOCAL_POINTERS. Since stack growth and shrinking are implemented by copying the stack to a new location, the stack pointer may change during any function call; therefore, pointers to stack data must not be kept in local variables.

Assembly functions should always provide Go prototypes, as this can provide pointer information for parameters and return values and allow go vet to check the correctness of offset usage.

Memory Layout#

The size and alignment of built-in basic types in Go, as well as the calculation of field offsets in composite types (structures), can be found in the ABI documentation Memory layout. For other types:

The memory layout of map/chan/func types is equivalent to *T
The memory layout of array types [N]T consists of contiguous memory made up of N instances of type T
The string type consists of two parts in memory, in this order: a *[len]byte pointer to the string's backing store and an int representing the string's byte length
The slice type []T consists of three parts in memory, in this order: a *[cap]T pointer to the slice's backing store, an int representing the slice's length, and an int representing the slice's capacity

The memory of struct types is composed of contiguous memory for each of its fields. For example, a struct of type struct { f1 t1; ...; fM tM } is laid out as the sequence t1, ..., tM, tP, where tP is a single padding byte that is present only when the last field tM has size zero and at least one other field ti has a non-zero size; otherwise tP is empty. Experiments show that taking the address of a zero-sized field always returns the address of the first non-zero-sized field that follows it. A padding byte is therefore added after a trailing zero-sized field so that taking its address never yields a pointer past the end of the struct.

type S struct { // 0xc00034c000
    A struct{}  // 0xc00034c000
    B int       // 0xc00034c000
    C struct{}  // 0xc00034c008
    D struct{}  // 0xc00034c008
    E int       // 0xc00034c008
    F struct{}  // 0xc00034c010
}

The empty interface type interface{} (runtime.eface) consists of the following parts:

A pointer to the runtime dynamic data type description
A pointer to the runtime dynamic data value of type unsafe.Pointer

Non-empty interface types consist of the following parts:

A pointer to runtime.itab containing:
- runtime.interfacetype containing method pointers related to this interface
- A pointer to the runtime dynamic data type description
A pointer to the runtime dynamic data value of type unsafe.Pointer

Interface values can be direct or indirect, depending on the dynamic type:

Direct interfaces store the value itself in the data field
Indirect interfaces store a pointer to the value in the data field
An interface can be direct only if the value consists of a single pointer word

The above describes the memory layout structure of all types in Go, but when writing assembly functions manually, one should not rely on these rules but rather reference the constants defined in the go_asm.h header file.

Parameter and Return Value Passing in Function Calls#

During function calls, parameters/return values are passed via the stack and hardware registers. Each parameter/return value is either stored entirely in registers (multiple registers can be used to store a single parameter/return value) or stored entirely on the stack. Generally, since accessing registers is faster than accessing memory, parameters/return values are preferentially stored in registers. However, a parameter/return value is passed on the stack when it does not fit in the remaining registers or when it contains an array with more than one element.

Each architecture defines a set of integer registers and a set of floating-point registers. From a high-level perspective, all parameter and return value types are decomposed into basic types and assigned to registers in order. Parameters and return values can share registers, but they do not share stack space. The caller reserves spill space on the call stack for parameters passed in registers, but does not populate it. The specific algorithm for assigning parameters/return values to registers or the stack is quite complex; refer to Function call argument and result passing.

Before calling a function, the caller reserves memory in its own stack frame for the stack-assigned receiver, stack parameters, stack return values, and spill space for register parameters. It then stores the parameter values in the assigned registers or stack space and executes the call. At the time of the call, the return value stack space, spill space, and return value registers are uninitialized, and the callee must store the return values in the registers or stack space assigned by the algorithm before returning. Since there are no callee-save registers, a call may overwrite any register that has no fixed meaning, including parameter registers.

On a 64-bit architecture with hypothetical integer registers R0-R9, the signature of a function f and the layout of its call stack space are as follows:

func f(a1 uint8, a2 [2]uintptr, a3 uint8) (
    r1 struct { x uintptr; y [2]uintptr },
    r2 string,
)

// Stack space layout
// a2      [2]uintptr
// r1.x    uintptr
// r1.y    [2]uintptr
// a1Spill uint8
// a3Spill uint8
// _       [6]uint8  // alignment padding

Since a2 and r1 contain arrays with more than one element, they are assigned to the stack, while the other parameters and return values are assigned to registers. r2 is decomposed into two parts that are assigned to registers independently. On entry, a1 is in R0, a3 is in R1, and a2 is the only part of the stack space that has been initialized. On return, r2.base is in R0, r2.len is in R1, and r1.x and r1.y are stored in the stack space.

Closures#

A function value such as var f func() is a pointer to a closure object, which begins with the entry address of the closure function, followed by zero or more bytes holding the closed-over environment. The calling rules for closures are essentially the same as for static functions, with one addition: each architecture has a dedicated closure context register, and the caller stores the pointer to the closure object in it before calling the closure. The called function then reaches its captured variables through this register.

Common Instructions#

// TODO
